Detect Missing Records in Dataset
I have a dataset that contains several measures from various widgets on a daily basis. While the set of widgets remains relatively stable over time, there are sometimes legitimate reasons for one widget to disappear from the data and another to appear. Occasionally, though, a widget simply drops out unexpectedly, which leaves the dataset incomplete and invalidates it for that day.
What I am looking for is a method of comparing the current set of widgets with another set of widgets to detect whether any are missing. I am not trying to impute the values, just to identify that they are missing. I could use a time-series approach, but that feels like overkill for so many widgets, and there are multiple attributes on which data might be missing. I was hoping for something more set-based that accounts for the regular changes in widgets while still detecting the unusual dropouts. I am sure I just need to adjust the way I am thinking about the problem.
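To make the set-based idea concrete, here is a minimal sketch in Python of the kind of check I have in mind. Everything in it is an assumption for illustration: the function name `find_unexpected_dropouts`, the `min_presence` threshold, and the widget IDs are all made up. It treats a widget as "expected" if it appeared on at least a given fraction of the previous days, then flags the expected widgets that are absent today.

```python
from collections import Counter


def find_unexpected_dropouts(history, today, min_presence=0.9):
    """Flag widgets that are usually present but absent today.

    history: list of sets of widget IDs, one set per prior day (hypothetical input).
    today: set of widget IDs seen today.
    min_presence: fraction of prior days a widget must appear on to count
        as "expected" (an assumed, tunable threshold).
    """
    if not history:
        return set()
    # Count on how many prior days each widget was seen.
    counts = Counter(widget for day in history for widget in day)
    n_days = len(history)
    # Widgets seen on at least min_presence of the prior days.
    expected = {w for w, c in counts.items() if c / n_days >= min_presence}
    # Expected widgets that did not show up today.
    return expected - today


# Made-up example data: widget D appears partway through the window.
history = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
]
today = {"A", "C", "D", "E"}

missing = find_unexpected_dropouts(history, today, min_presence=0.75)
new_widgets = today - set().union(*history)
print(missing, new_widgets)
```

Since the elements only need to be hashable, the same check would presumably work on (widget, attribute) tuples to cover the multiple attributes. What this sketch cannot do is tell a legitimate retirement from an unusual dropout: both show up as a usually-present widget that is absent today, which is the part I am unsure how to handle.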
Any ideas would be much appreciated.
Topic missing-data time-series machine-learning
Category Data Science