There are all kinds of solutions right now. Broadly, you can divide them into two groups:
- Monitoring features as part of a bigger AI platform
- A dedicated monitoring solution
A few factors to examine before choosing between the two options:
- What is the scale of your ML model usage?
- What's the impact of your models? Are they part of your core business, or only an enrichment \ niche part of it?
- What is your DS team size?
- How many platforms do you use to deploy models to production? Do you have only one standard way to deploy?
The general theme is: the bigger your ML operation is, and the more you need monitoring to be agnostic to the deployment platform, the stronger the case for a dedicated solution.
If your ML operations are still very limited and your serving platform already has a few monitoring features in place, that might be good enough for you for now.
When examining a specific solution, consider the following points:
Integration - How complicated is it?
Measurement - Does it offer stability measurement for your data (input \ inference \ label)?
Performance analysis - Does it let you close the loop and see performance analytics? (BTW... in most cases, even if you can get performance metrics, you probably won't be able to base your monitoring on them, because in reality such performance information is usually available only with a delay after the predictions were made.)
Resolution - Can the system detect and measure such metrics at a higher resolution (sub-segments of your entire dataset)? In many cases, drift or technical issues will occur only for a specific subset of your data.
Alerts - Does the solution also include a statistical alert mechanism? In the end, it's hard to track all the KPIs mentioned above, and every dataset behaves differently, so thresholds are hard to define.
Dashboard - Does the solution contain a clear UI dashboard?
API - Can you consume such production insights directly from an API? It can be very beneficial to build automation on top of it.
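To make the Measurement, Resolution and Alerts points a bit more concrete, here is a minimal sketch of what "stability measurement per sub-segment" can look like. It is only an illustration, not how any particular product works. It uses the Population Stability Index (PSI) as the drift metric, which is my choice for the example. The function names `psi` and `psi_by_segment`, the toy `country` / `age` data, and the fixed 0.2 cut-off (a common rule of thumb) are all assumptions.

```python
import math
from collections import defaultdict


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Small floor so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), 1e-4) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def psi_by_segment(reference_rows, current_rows, feature, segment_key):
    """Compute the PSI of `feature` separately for each value of `segment_key`."""
    def group(rows):
        grouped = defaultdict(list)
        for row in rows:
            grouped[row[segment_key]].append(row[feature])
        return grouped

    ref, cur = group(reference_rows), group(current_rows)
    return {seg: psi(ref[seg], cur[seg]) for seg in ref if seg in cur}


# Toy data (hypothetical): the "US" segment is unchanged, the "EU" segment shifts older.
reference = [{"country": c, "age": 20 + i % 40} for c in ("US", "EU") for i in range(400)]
current = (
    [{"country": "US", "age": 20 + i % 40} for i in range(400)]
    + [{"country": "EU", "age": 45 + i % 15} for i in range(400)]
)

scores = psi_by_segment(reference, current, "age", "country")
# Assumed rule-of-thumb threshold; a real alerting mechanism would tune this per dataset.
drifted = [seg for seg, score in scores.items() if score > 0.2]
print(scores, drifted)
```

In this toy case only the EU segment is flagged. That is the kind of issue that can get diluted if you only look at the dataset as a whole, and it also shows why a hard-coded threshold is a weak substitute for a proper statistical alert mechanism.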
BTW... Here is a blog post I wrote about the different elements that should be covered when monitoring ML, along with a review of current solutions.