generalise sobol-gridsearch to work with arbitrary data structures #27

@behrica

This opens the path to grid search over pipeline definitions.

Specifically, it should be able to transform this:

[[:ds/select-columns [:Text :Score]]
 [:ds/update-column :Score #(map dec %)]
 [:ds-mod/set-inference-target :Score]
 [:nlp/count-vectorize :Text :bow :nlp/default-text->bow {:stopwords (ml-gs/categorical [nil :default :google :comprehensive])}]
 [:nb/bow->SparseArray :bow :bow-sparse {:vocab-size (ml-gs/linear 100 10000)}]
 [:ml/train {:model-type :discrete-naive-bayes
             :discrete-naive-bayes-model :multinomial
             :sparse-column :bow-sparse
             :nb-model-hyper-parameter-x (ml-gs/linear 0.0 1.0)}]]

into a list of "copies" of this data structure, in which the gridsearch definitions are replaced by concrete values.

See discussion here: https://clojurians.zulipchat.com/#narrow/stream/236259-tech.2Eml.2Edataset.2Edev/topic/couple.20tech.2Eml.20to.20tablecloth.20pipeline.20concept.20.3F
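
To make the intended transformation concrete, here is a minimal sketch of one way it could work. It is not the library's implementation. The helpers `categorical`, `linear` (with an explicit step count), `spec?`, `cartesian` and `expand-grid` are hypothetical stand-ins defined only for this example, and the sketch enumerates the full cartesian product of candidate values instead of sampling them with a Sobol sequence. The idea is to walk the data structure once to replace every gridsearch definition with a numbered slot, then fill the slots for each combination of values.

```clojure
(ns gridsearch-sketch
  (:require [clojure.walk :as walk]))

;; Hypothetical stand-ins for ml-gs/categorical and ml-gs/linear: each returns
;; a map tagged with ::candidates that lists the concrete values to try.
(defn categorical [values]
  {::candidates (vec values)})

(defn linear
  "n-steps evenly spaced doubles from start to end inclusive (n-steps >= 2)."
  [start end n-steps]
  (let [step (/ (- end start) (double (dec n-steps)))]
    {::candidates (mapv #(+ start (* % step)) (range n-steps))}))

(defn spec? [x]
  (and (map? x) (contains? x ::candidates)))

(defn cartesian
  "All combinations (as vectors) that take one element from each collection."
  [colls]
  (reduce (fn [acc coll] (for [combo acc, v coll] (conj combo v)))
          [[]]
          colls))

(defn expand-grid
  "Returns a seq of copies of form in which every gridsearch definition is
  replaced by one concrete value. Works on any nesting of vectors, lists and maps."
  [form]
  (let [specs      (atom [])
        ;; Pass 1: swap each definition for a numbered slot, so that two
        ;; identical definitions in different places still vary independently.
        template   (walk/postwalk
                    (fn [x]
                      (if (spec? x)
                        (let [slot (count @specs)]
                          (swap! specs conj x)
                          {::slot slot})
                        x))
                    form)
        candidates (mapv ::candidates @specs)]
    ;; Pass 2: fill the slots once per combination of candidate values.
    (for [combo (cartesian candidates)]
      (walk/postwalk
       (fn [x]
         (if (and (map? x) (contains? x ::slot))
           (nth combo (::slot x))
           x))
       template))))

;; A shortened version of the pipeline from this issue.
(def pipeline
  [[:ds/select-columns [:Text :Score]]
   [:nlp/count-vectorize :Text :bow {:stopwords (categorical [nil :default])}]
   [:ml/train {:model-type :discrete-naive-bayes
               :nb-model-hyper-parameter-x (linear 0.0 1.0 3)}]])

;; 2 stopword options x 3 hyper-parameter values = 6 concrete pipelines.
(def expanded (expand-grid pipeline))
```

A Sobol-based variant could keep the same two-pass structure and only change how the combinations of slot values are generated, which is why the sketch keeps the traversal separate from the sampling.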
