Reusable parameter-scan wrapper

In most of my projects, I come up with models and want to visualize how some property $x$ varies as a function of a subset of parameters $p_1, p_2, \dots$

So I often end up with "parameter scan" figures: typically a plot of $x$ over a grid of parameter values. These are very helpful for explaining a model, a process, or a dataset.

The problem is that I put an inordinate amount of work into producing the data behind these figures, and most of it is spent on the pipeline itself. I often have to couple anywhere from 2 to 10 mismatched programs, deal with thousands of intermediate files, track the parameters in the file names along the way, and so on. It quickly leads to overwhelming bash scripts.

Is there a principled way to manage a pipeline of mismatched programs with parameter scans in mind? Say, a Python module where the user declares wrappers for each program and how they fit together, then simply launches the pipeline on every point of the parameter space and recovers the data in an easily portable array format?
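
To illustrate, here is a minimal sketch of the kind of driver I have in mind, in plain Python. `simulate` and `analyze` are hypothetical placeholders for two mismatched external programs; the driver scans the parameter grid and collects the results into a NumPy array:

```python
# Minimal sketch of a parameter-scan driver. "simulate" and "analyze"
# are hypothetical placeholders for two mismatched external programs.
import itertools
import subprocess
import tempfile

import numpy as np

def run_pipeline(p1, p2, workdir):
    """One run: couple two programs through an intermediate file."""
    intermediate = f"{workdir}/sim_p1={p1}_p2={p2}.dat"
    subprocess.run(["simulate", "--p1", str(p1), "--p2", str(p2),
                    "--out", intermediate], check=True)
    result = subprocess.run(["analyze", intermediate],
                            capture_output=True, text=True, check=True)
    return float(result.stdout)  # the property x for this parameter point

p1_values = np.linspace(0.0, 1.0, 11)
p2_values = np.logspace(-2, 0, 21)

# Scan the Cartesian grid and collect x into an array whose axes
# match the parameter axes, ready for plotting as a heatmap.
x = np.empty((len(p1_values), len(p2_values)))
with tempfile.TemporaryDirectory() as workdir:
    for (i, p1), (j, p2) in itertools.product(enumerate(p1_values),
                                              enumerate(p2_values)):
        x[i, j] = run_pipeline(p1, p2, workdir)
```

The point of this shape is that parameters live in Python data structures rather than in file names, and the intermediate files stay confined to a temporary directory instead of accumulating by the thousands.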




It sounds like experiment tracking. MLflow is one Python package for experiment tracking.
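
For instance, a minimal sketch of tracking a parameter scan with MLflow might look like this, where `compute_x` is a hypothetical stand-in for the actual pipeline:

```python
# Sketch of tracking a parameter scan with MLflow (pip install mlflow).
# compute_x is a hypothetical stand-in for the actual pipeline.
import itertools

import mlflow

def compute_x(p1, p2):
    return p1 ** 2 + p2  # placeholder computation

mlflow.set_experiment("parameter-scan")
for p1, p2 in itertools.product([0.1, 0.5, 1.0], [1, 2, 4]):
    with mlflow.start_run():
        mlflow.log_params({"p1": p1, "p2": p2})    # parameters of this point
        mlflow.log_metric("x", compute_x(p1, p2))  # the measured property

# mlflow.search_runs() then returns a pandas DataFrame with one row per
# run, holding both the parameters and the metric, ready for plotting.
```

Each point of the scan becomes a tracked run, so the parameters and the resulting value of $x$ can be queried back later as a single table instead of being reconstructed from file names.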
