A/B test results contradict offline machine learning model performance
This seems to be a common problem when bringing machine learning models to production.
Let's say we have an optimized machine learning model that achieves a decent performance metric on an unseen test dataset. We are quite satisfied with that and decide to bring the model online. We then run an A/B test to compare our website performance (e.g., revenue, customer engagement, etc.) with and without the new model. Somehow, the new model is not a clear winner in the A/B test, or is even a clear loser. How do we deal with such a situation?
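To make the contrast concrete, here is a minimal sketch (not from my actual system, all data synthetic and purely illustrative) of the two evaluation regimes: an offline metric such as AUC on a held-out test set, versus an online A/B comparison of a business metric like revenue per user. The point is that a decent offline number does not automatically translate into a statistically significant online lift.

```python
# Sketch: offline metric vs. online A/B comparison (synthetic, illustrative data).
import numpy as np
from sklearn.metrics import roc_auc_score
from scipy import stats

rng = np.random.default_rng(0)

# --- Offline evaluation: model scores vs. labels on an unseen test set ---
y_true = rng.integers(0, 2, size=10_000)                   # hypothetical labels
y_score = y_true * 0.3 + rng.normal(0.5, 0.3, 10_000)      # hypothetical model scores
print("Offline AUC:", roc_auc_score(y_true, y_score))      # looks "decent" offline

# --- Online evaluation: A/B test on revenue per user (control vs. new model) ---
revenue_control = rng.gamma(shape=2.0, scale=5.0, size=50_000)     # hypothetical revenue
revenue_treatment = rng.gamma(shape=2.0, scale=5.05, size=50_000)  # tiny, noisy uplift
t_stat, p_value = stats.ttest_ind(revenue_treatment, revenue_control, equal_var=False)
lift = 100 * (revenue_treatment.mean() / revenue_control.mean() - 1)
print("Online lift: %.2f%%, p-value: %.3f" % (lift, p_value))
# The online metric is noisier, measures something different from the offline
# label, and may need far more traffic to detect a small true effect.
```

So even with a respectable offline AUC, the A/B test above can easily come back inconclusive, which is exactly the situation I am asking about.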
The model I mentioned here is a machine learning model, for example a ranking or recommendation algorithm, but in practice it could be any algorithm. Thanks for any help!
Topic ab-test machine-learning
Category Data Science