Store preprocessing function along with model in mlflow.keras

The following is a simplified code snippet relevant to storing Keras LSTM models in MLflow:

    with mlflow.start_run() as run:
        mlflow.keras.log_model(model, "lstm")
        mlflow.log_params(model_parameters)
        mlflow.log_metrics(model_metrics)

However, suppose that each model has a corresponding data preprocessing function that needs to be applied to new data before prediction:

    processed_data = custom_processing_function(new_data)
    predictions = model.predict(processed_data)

Because each model may have a different preprocessing function, I want to keep track of each (preprocessing function, model) pair. Ideally, I am looking for …
Category: Data Science

Multiple values for a single parameter in the mlflow run command

How do I pass multiple values to each parameter in the mlflow run command? The objective is to pass a dictionary to GridSearchCV as a param_grid to perform cross-validation. In my main code, I retrieve the command-line parameters using argparse. By adding nargs='+' to add_argument(), I can write space-separated values for each hyperparameter and then apply vars() to create the dictionary. See the code below:

    import argparse

    # Build the parameters for the command line
    param_names = list(RandomForestClassifier().get_params().keys())
    …
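A minimal, self-contained sketch of the `nargs='+'` + `vars()` technique the question describes (the two hyperparameter names here are illustrative, not taken from the original code, which derives them from `get_params()`):

```python
import argparse

parser = argparse.ArgumentParser()
# nargs="+" collects one or more space-separated values into a list;
# type=int converts each value individually.
parser.add_argument("--n_estimators", nargs="+", type=int, default=[100])
parser.add_argument("--max_depth", nargs="+", type=int, default=[5])

# Simulating: mlflow run ... -P n_estimators="50 100" -P max_depth="3 5"
args = parser.parse_args(["--n_estimators", "50", "100", "--max_depth", "3", "5"])

# vars() turns the Namespace into the dict GridSearchCV expects as param_grid.
param_grid = vars(args)
# e.g. GridSearchCV(RandomForestClassifier(), param_grid=param_grid)
```

Each key in `param_grid` maps to a list of candidate values, which is exactly the shape `GridSearchCV`'s `param_grid` argument requires.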
Category: Data Science

MLflow real world experience

Can someone provide a summary of real-world deployment experience with MLflow? We have a few ML models (e.g., LightGBM, TensorFlow v2, etc.) and want to avoid frameworks like SageMaker (due to a customer requirement), so we are looking into various ways of hosting ML models for inference. Latency is one key performance metric that is very important to us. MLflow looks like a good choice for us. It would be greatly appreciated if users of MLflow could share …
Topic: mlops mlflow
Category: Data Science

Can't load custom Keras metrics using mlflow.pyfunc

I have a DNN in Keras, which includes a custom metric function and which I want to pipeline with some sklearn preprocessing. I also want to persist the model using MLflow for easy deployment. The requirement to pipeline with sklearn means that I can't use the mlflow.keras versions of .log_model() and .load_model() and have to instead use the mlflow.pyfunc versions, which is fine. Saving the model seems to work, but when I try to use mlflow.pyfunc.load_model() to reimport the …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web development, Android, Data Science, new techniques, and Linux security.