How do I verify and test a machine learning model against reality over time?

As software engineers, we are familiar with the concept of testing (unit, integration, e2e). Tests give us a level of confidence in the code and in the changes we make to it. For ML, it looks like the "code" is the data that was used to train the model. Unfortunately, data is not as deterministic as source code.

If I treat data as a kind of code for ML:

  • What techniques and tools can be used for verifying / testing the data? I was expecting to find a tool like TFX data validation, but more generic (for instance, usable with PyTorch). However, I haven't found any generic tool that automates or simplifies this task.

  • If there is no generic and robust tool, would it be worth starting an OSS project for it?
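
To make concrete what I mean by "testing the data", here is a minimal, framework-agnostic sketch in plain Python. It infers a simple schema (type and value range per column) from reference data and then checks a new batch against it. The names `ColumnSchema`, `infer_schema` and `validate` are hypothetical and only for illustration; the sketch assumes numeric columns and rows given as dicts, and it is not how TFX or any other existing tool works internally.

```python
from dataclasses import dataclass


@dataclass
class ColumnSchema:
    dtype: type      # Python type observed in the reference data
    minimum: float   # smallest reference value
    maximum: float   # largest reference value


def infer_schema(rows):
    """Infer a per-column type and value range from reference rows (list of dicts)."""
    schema = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        schema[name] = ColumnSchema(type(values[0]), min(values), max(values))
    return schema


def validate(rows, schema):
    """Return a list of human-readable anomalies found in a new batch."""
    anomalies = []
    for i, row in enumerate(rows):
        for name, col in schema.items():
            if name not in row or row[name] is None:
                anomalies.append(f"row {i}: '{name}' is missing")
            elif not isinstance(row[name], col.dtype):
                anomalies.append(f"row {i}: '{name}' has type {type(row[name]).__name__}")
            elif not col.minimum <= row[name] <= col.maximum:
                anomalies.append(
                    f"row {i}: '{name}'={row[name]} is outside [{col.minimum}, {col.maximum}]"
                )
    return anomalies


# Made-up example data: reference (training) rows and a later (serving) batch.
training = [{"age": 25, "income": 40000.0}, {"age": 60, "income": 90000.0}]
serving = [{"age": 30, "income": 50000.0}, {"age": 140, "income": 55000.0}, {"age": 41}]

schema = infer_schema(training)
anomalies = validate(serving, schema)
for anomaly in anomalies:
    print(anomaly)
```

Something like this could run as a test in CI or on every new batch of data, the same way unit tests run on every code change. A real tool would also need to handle categorical columns, distribution drift, and tolerances for values outside the reference range.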

Thanks
