MLOps has been gaining traction, and a lot of Fortune 500 companies are creating teams specifically for MLOps. Can anyone help me understand? Why is MLOps gaining so much traction? How is it different from DevOps? What tools are used for MLOps? How can I create a strategy for MLOps? How do I get started?
Suppose I have a binary classifier $f$ which acts on an input $x$. Given a threshold $t$, the predicted binary output is defined as: $$ \widehat{y} = \begin{cases} 1, & f(x) \geq t \\ 0, & f(x) < t \end{cases} $$ I then compute the $TPR$ (true positive rate) and $FPR$ (false positive rate) metrics on the hold-out test set (call it $S_1$): $TPR_{S_1} = \Pr(\widehat{y} = 1 | y = 1, S_1)$ $FPR_{S_1} = \Pr(\widehat{y} = 1 | y …
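To make the setup concrete, here is a minimal sketch (assuming NumPy arrays of scores and binary labels, with hypothetical names `scores` and `labels`) of how $\widehat{y}$, $TPR$, and $FPR$ would be computed on a hold-out set at a fixed threshold $t$:

```python
import numpy as np

def tpr_fpr(scores, labels, t):
    """Compute TPR and FPR on a hold-out set at threshold t.

    scores: array of classifier outputs f(x)
    labels: array of ground-truth labels y in {0, 1}
    """
    y_hat = (scores >= t).astype(int)          # predicted label: 1 if f(x) >= t else 0
    tp = np.sum((y_hat == 1) & (labels == 1))  # true positives
    fn = np.sum((y_hat == 0) & (labels == 1))  # false negatives
    fp = np.sum((y_hat == 1) & (labels == 0))  # false positives
    tn = np.sum((y_hat == 0) & (labels == 0))  # true negatives
    tpr = tp / (tp + fn)                       # Pr(y_hat = 1 | y = 1)
    fpr = fp / (fp + tn)                       # Pr(y_hat = 1 | y = 0)
    return tpr, fpr
```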
Suppose I deployed a model whose training data I had to label manually, as the use case is such that there is no way to get ground truth labels without humans. Once the model is deployed, if I want to evaluate how the model is doing on live data, how can I evaluate it without sampling some of that live data, which doesn't come with ground truth labels, and manually giving it the ground truth labels? And …
I just finished some courses on machine learning in production that teach the concepts and techniques used in training, serving, and maintaining ML models. Now I want to do my own project to apply the things I learned. I need a dynamic, structured dataset that changes over time, to motivate the need for large-scale deployment and to induce model decay. I am also thinking about simulating data and concept drift with some noise, so if anyone has something that helps …
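If I end up simulating the drift myself, a rough sketch (plain NumPy, with made-up feature names and drift parameters) could introduce data drift by shifting the feature distribution over time and concept drift by changing the label rule at some point:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_batch(day, n=1000):
    """Generate one day's batch with gradual data drift and an abrupt concept drift."""
    # Data drift: the feature mean slowly shifts as time passes.
    x = rng.normal(loc=0.01 * day, scale=1.0, size=(n, 2))
    noise = rng.normal(scale=0.1, size=n)
    # Concept drift: the decision rule changes after day 100.
    if day < 100:
        y = (x[:, 0] + x[:, 1] + noise > 0).astype(int)
    else:
        y = (x[:, 0] - x[:, 1] + noise > 0).astype(int)
    return x, y

# Example: stream 200 days of batches that a deployed model would have to score.
batches = [generate_batch(day) for day in range(200)]
```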
I am currently working on some Python machine learning projects that are soon to be deployed to production. As such, our team is interested in doing this the most "correct" way, following MLOps principles. Specifically, I am currently researching the data preprocessing step and how to implement it in a way that is robust against training-serving skew. I've considered TensorFlow Transform, which, after a single run of some defined preprocessing steps, generates a graph artifact that can be reused …
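For reference, this is roughly what a TensorFlow Transform preprocessing function looks like (a minimal sketch with made-up feature names); the point is that the transform graph produced from this single function is attached to the serving model, so training and serving share the same preprocessing:

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    """Preprocessing defined once; the resulting transform graph is
    exported and reused at serving time to avoid training-serving skew."""
    return {
        # Scale a numeric feature using statistics computed over the full dataset.
        'age_scaled': tft.scale_to_z_score(inputs['age']),
        # Map a string feature to integer ids using a learned vocabulary.
        'country_id': tft.compute_and_apply_vocabulary(inputs['country']),
    }
```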
I have an MLOps pipeline set up to automatically update a machine learning model in production. The general pipeline steps are:

1. Run unit tests
2. Preprocess training data
3. Train model
4. Evaluate model
5. Deploy model to beta endpoint
6. Run integration tests
7. Deploy model to production endpoint (manually approved step)

My question is: Is it best practice to automatically perform some kind of load testing of the endpoint after step 5? By load testing I mean making sure that the endpoint will …
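For what it's worth, the kind of load test I have in mind after the beta deployment could be as simple as the following sketch (plain Python, with a hypothetical endpoint URL and payload): fire concurrent requests and fail the pipeline step if a latency percentile is too high.

```python
import time
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BETA_ENDPOINT = "http://beta-endpoint.example.com/predict"        # hypothetical URL
PAYLOAD = json.dumps({"features": [0.1, 0.2, 0.3]}).encode()       # hypothetical payload

def one_request(_):
    """Send a single prediction request and return its latency in seconds."""
    start = time.perf_counter()
    req = urllib.request.Request(
        BETA_ENDPOINT, data=PAYLOAD, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req).read()
    return time.perf_counter() - start

# Fire 200 requests with 20 concurrent workers and check the p95 latency.
with ThreadPoolExecutor(max_workers=20) as pool:
    latencies = sorted(pool.map(one_request, range(200)))

p95 = latencies[int(0.95 * len(latencies))]
assert p95 < 0.5, f"p95 latency too high: {p95:.3f}s"  # fail the pipeline step
```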
One key component of MLOps is continuous training, which means the end-to-end training is put in a pipeline that can be triggered and versioned, and whose metadata can be tracked, thus enabling retraining of the model without much manual effort. Which tool/package do you use for creating such a training pipeline? I am looking for a simple tool with the following criteria:

- Uses only Python (i.e. Docker is not mandatory, unlike Kubeflow)
- Devs have a lot of flexibility and …
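As a baseline for comparison, a "pure Python" pipeline can be as simple as chaining step functions and recording run metadata yourself; this is only a minimal sketch with placeholder step implementations, not tied to any framework:

```python
import json
import time
import uuid

def load_data():
    return [[0.0, 1.0], [1.0, 0.0]], [0, 1]     # placeholder dataset

def preprocess(x, y):
    return x, y                                  # placeholder preprocessing

def train(x, y):
    return {"weights": [0.5, 0.5]}               # placeholder "model"

def evaluate(model, x, y):
    return {"accuracy": 1.0}                     # placeholder metric

def run_pipeline():
    """Run the end-to-end training steps and record run metadata to disk."""
    run_id = str(uuid.uuid4())
    start = time.time()
    x, y = preprocess(*load_data())
    model = train(x, y)
    metrics = evaluate(model, x, y)
    metadata = {"run_id": run_id, "duration_s": time.time() - start, "metrics": metrics}
    with open(f"run_{run_id}.json", "w") as f:   # one metadata record per triggered run
        json.dump(metadata, f)
    return model, metadata

if __name__ == "__main__":
    run_pipeline()
```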
There are inconsistent wrong labels and consistent errors in the training data. For the former I tried MC-dropout and Data Shapley. For the latter, I wonder if manual data curation is a prerequisite?
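For context, the MC-dropout part is usually implemented along these lines (a PyTorch sketch assuming a classifier with dropout layers; `model` and `inputs` are hypothetical): keep dropout active at inference, average several stochastic forward passes, and flag the highest-uncertainty examples for review.

```python
import torch

def mc_dropout_uncertainty(model, inputs, n_passes=20):
    """Estimate predictive uncertainty by keeping dropout active at inference."""
    model.train()  # keeps dropout stochastic (assumes no batch-norm side effects)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_passes)]
        )
    mean_probs = probs.mean(dim=0)
    # Predictive entropy: high values flag examples whose labels are worth reviewing.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```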
As software engineers we are familiar with the concept of testing (unit, integration, e2e). Tests give us a level of confidence about the code and about changes to our code. It looks like for ML the "code" is the data that was used for the model, and unfortunately data is not as deterministic as source code. If I consider that data is a kind of code for ML: what techniques and tools can be used for verifying / testing the data? My expectation is …
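To illustrate the kind of data test I mean, here is a minimal sketch (plain pandas, with hypothetical column names and thresholds) of assertion-style checks that could run in CI before training, much like unit tests run before a deploy:

```python
import pandas as pd

def validate_training_data(df: pd.DataFrame) -> None:
    """Assertion-style data tests, analogous to unit tests for code."""
    # Schema check: required columns are present.
    required = {"age", "country", "label"}
    assert required.issubset(df.columns), f"missing columns: {required - set(df.columns)}"
    # Completeness check: no nulls in the label column.
    assert df["label"].notna().all(), "null labels found"
    # Range check: values fall within plausible bounds.
    assert df["age"].between(0, 120).all(), "age out of range"
    # Distribution check: label balance has not collapsed.
    positive_rate = (df["label"] == 1).mean()
    assert 0.01 < positive_rate < 0.99, f"suspicious label balance: {positive_rate:.3f}"
```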
Cross-posted on Reddit ML. Should a Feature Store be part of an enterprise data catalog? To me, a feature store seems to be a highly niche data catalog that is missing a lot of the benefits of an enterprise data catalog / data discovery tool. My need is to have generated features discoverable when searching for data. For example, if I have datasets A and B used to generate a feature set AB', I would want to know about that …
According to MLOps principles, it is recommended to have a feature store. The question is in the context of doing image classification using deep learning models like convolutional neural networks, which do automatic feature engineering (via convolution layers) as part of the training process. Questions:

- Does it make sense to have a feature store for pure image classification/segmentation models?
- What features should be stored in the feature store? The output of convolution layers? But then they cannot be reused during the training …
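On the "output of convolution layers" option, what I have seen stored is usually embeddings from a pretrained backbone rather than raw intermediate activations; a rough torchvision sketch (with a hypothetical image batch) might look like this:

```python
import torch
import torchvision.models as models

# Use a pretrained backbone as a fixed feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # drop the classification head, keep the embedding
backbone.eval()

def extract_embeddings(images: torch.Tensor) -> torch.Tensor:
    """Return 512-d embeddings for a batch of images shaped (N, 3, 224, 224)."""
    with torch.no_grad():
        return backbone(images)

# These embedding vectors (keyed by image id) are what could be written to a
# feature store and reused by downstream models, unlike raw conv activations.
```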
Can someone provide a summary of real-world deployment experience with MLflow? We have a few ML models (e.g., LightGBM, TensorFlow v2, etc.) and want to avoid frameworks like SageMaker (due to customer requirements). So we are looking into various ways of hosting ML models for inference. Latency is one key performance metric that is very important to us. MLflow looks like a good choice for us. It would be greatly appreciated if users of MLflow can share …
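For a quick latency check of the MLflow route, one simple sketch is to load a logged model in-process via the pyfunc interface and time predictions (the model URI and the input DataFrame below are hypothetical):

```python
import time
import mlflow.pyfunc
import pandas as pd

# Hypothetical URI of a model previously logged to the MLflow model registry.
model = mlflow.pyfunc.load_model("models:/my_lightgbm_model/Production")

batch = pd.DataFrame({"feature_a": [0.1] * 32, "feature_b": [1.2] * 32})

# Time repeated predictions to get a feel for in-process inference latency.
latencies = []
for _ in range(100):
    start = time.perf_counter()
    model.predict(batch)
    latencies.append(time.perf_counter() - start)

print(f"median latency: {sorted(latencies)[len(latencies) // 2] * 1000:.2f} ms")
```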