Can't load custom Keras metrics using mlflow.pyfunc

I have a DNN in Keras, which includes a custom metric function, and which I want to pipeline with some scikit-learn preprocessing. I also want to persist the model with MLflow for easy deployment. The requirement to pipeline with sklearn means I can't use the mlflow.keras versions of .log_model() and .load_model(), and have to use the mlflow.pyfunc versions instead, which is fine.

Saving the model seems to work fine, but when I try to use mlflow.pyfunc.load_model() to reimport the saved model I get this error message (full stack trace at link):

ValueError: Unknown metric function:custom_mse

To make sure the custom function makes its way through to MLflow, I persist it in a helper_functions.py file and pass that file to the code_path parameter of .log_model(); I then import the function in .load_context() before calling keras.models.load_model() to reimport the saved Keras model.

helper_functions.py:

import keras.backend as K

def custom_mse(y_true, y_pred):
    return K.mean((y_pred - y_true) ** 2)
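For reference, custom_mse is just mean squared error; a plain-Python equivalent (no Keras, purely illustrative) behaves like this:

```python
# Illustrative only: a plain-Python mean squared error matching what
# custom_mse computes with K.mean over the squared differences.
def mse(y_true, y_pred):
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # squared diffs 0, 0, 4 -> mean 1.333...
```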

The PythonModel I'm trying to persist is this:

class ProductRecommender(PythonModel):

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def load_context(self, context):

        from helper_functions import custom_mse

        self.keras_model = keras.models.load_model(
            context.artifacts["keras_model"],
            custom_objects={'custom_mse': custom_mse}  # dict, not set: maps saved name -> function
        )
        self.sklearn_preprocessor = joblib.load(context.artifacts["sklearn_preprocessor"])

        self.sklearn_model = KerasModelRegressor(self.keras_model, epochs=5, validation_split=0.2)

        self.pipeline = Pipeline(steps=[
            ('preprocessor', self.sklearn_preprocessor),
            ('estimator', self.sklearn_model)
        ])

    def fit(self, X, y):
        self.pipeline.fit(X, y)
        self.pipeline.named_steps.estimator.model.save('artifacts/keras_model.h5')
        joblib.dump(self.pipeline.named_steps.preprocessor, 'artifacts/sklearn_preprocessor.joblib')

    def predict(self, context, X):
        return self.pipeline.predict(X)

Note that I import the custom_mse function from the helper_functions module and pass it as part of custom_objects to keras.models.load_model().
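One pitfall worth ruling out with custom_objects: it must be a dict mapping the saved metric name to the callable. A comma instead of a colon silently produces a set, which Keras can't use for name lookup. A quick plain-Python check of the difference:

```python
def custom_mse(y_true, y_pred):  # stand-in for the real metric
    return 0.0

wrong = {'custom_mse', custom_mse}   # set literal: comma between name and function
right = {'custom_mse': custom_mse}   # dict literal: colon maps name -> function

print(type(wrong).__name__)  # set
print(type(right).__name__)  # dict
```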

Here's the mlflow.pyfunc.log_model() call:

with mlflow.start_run() as run:

    run_id = run.info.run_id

    conda_env = {
        'name': 'mlflow-env',
        'channels': [
            'defaults',
            'anaconda',
            'conda-forge'
        ],
        'dependencies': [
            'python=3.7.0',
            'cloudpickle',
            'keras==2.2.5',
            'joblib==0.13.2',
            'scikit-learn==0.20.3'
        ]
    }

    artifacts = {
        'keras_model':'artifacts/keras_model.h5',
        'sklearn_preprocessor':'artifacts/sklearn_preprocessor.joblib'
    }

    mlflow.pyfunc.log_model(
        artifact_path='Model',
        code_path=['artifacts/example_sklearn_wrapper.py', 'artifacts/helper_functions.py'],
        python_model=pr,
        conda_env=conda_env,
        artifacts=artifacts
    )
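For completeness, this is how I'd load it back (a sketch; run_id comes from the logging run above, and 'Model' has to match the artifact_path passed to log_model):

```python
run_id = "0123456789abcdef"  # placeholder; in practice use run.info.run_id
model_uri = f"runs:/{run_id}/Model"  # 'Model' must match artifact_path above

# loaded = mlflow.pyfunc.load_model(model_uri)  # requires mlflow installed
# predictions = loaded.predict(X_new)           # X_new: hypothetical input frame

print(model_uri)
```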

What's happening here? Why isn't keras seeing my custom_mse function?

Topic: mlflow, keras, python

Category: Data Science


Long story short: use cloudpickle instead of joblib or pickle to dump things to disk, and this all works much more cleanly.
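The reason this helps: pickle (and joblib, which builds on it) serializes a function by reference, recording only its module and qualified name, so unpickling fails unless that module is importable at load time. cloudpickle serializes the function body by value instead, so it travels with the artifact. A quick stdlib-only illustration of the by-reference behaviour:

```python
import pickle

def custom_mse(y_true, y_pred):
    return 0.0

payload = pickle.dumps(custom_mse)
# The payload contains only a reference (module + qualified name), not the
# function's code, so the defining module must exist wherever it's loaded.
print(b'custom_mse' in payload)  # True
```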
