What's the best way to validate a rare event detection model during training?

When training a deep model for rare event detection (e.g. detecting an alarm sound in a home device's audio stream), is it better to use a balanced validation set (50% alarm, 50% normal) for decisions like early stopping, or a validation set representative of reality? If an unbalanced, realistic validation set is used, it may have to be huge to contain even a few positive examples, so I'm wondering how this is typically dealt with.
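One common compromise, sketched below under assumptions not stated in the question: keep every positive clip you have in the validation set, subsample the negatives to a manageable size, and early-stop on a ranking metric such as PR-AUC. The features here are random noise standing in for real audio embeddings, purely to show the wiring.

```python
# Minimal sketch (assumed setup, not from the question): validation set keeps all
# positives plus a subsample of negatives; early stopping monitors PR-AUC.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 128)).astype("float32")   # stand-in audio embeddings
y_train = (rng.random(5000) < 0.01).astype("float32")      # ~1% positives
X_val = rng.normal(size=(2000, 128)).astype("float32")     # all positives + negative subsample
y_val = (rng.random(2000) < 0.05).astype("float32")

model = keras.Sequential([
    keras.layers.Input(shape=(128,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    # PR-AUC depends on the validation set's class ratio, but that ratio is fixed
    # across epochs, so it still ranks checkpoints consistently for early stopping.
    metrics=[keras.metrics.AUC(curve="PR", name="pr_auc")],
)
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_pr_auc", mode="max", patience=5, restore_best_weights=True
)
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=20, batch_size=64, callbacks=[early_stop], verbose=0)
```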

In the given example of alarm sound detection, false negatives are obviously costly, but I imagine false positives matter just as much, because the event is so rare in reality that even a very low false positive rate could still mean low precision. Also, anomaly detection doesn't seem very applicable here because of the open-set nature of the problem: the "normal state" of the audio stream isn't clearly defined (there could be many unforeseen noises and sounds besides alarms).
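To make that base-rate point concrete, a tiny sketch with illustrative numbers (not from the post): even with 95% recall and a 0.1% false positive rate, an alarm prevalence of 1 in 10,000 audio frames pushes precision below 10%.

```python
# Illustrative numbers only: the base-rate effect on precision.
def precision_at_prevalence(tpr, fpr, prevalence):
    """Bayes' rule: P(alarm | model fires) at a given real-world prevalence."""
    return (tpr * prevalence) / (tpr * prevalence + fpr * (1 - prevalence))

print(precision_at_prevalence(tpr=0.95, fpr=0.001, prevalence=1e-4))
# -> ~0.087, i.e. roughly 9 out of 10 detections would be false alarms
```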

If anyone has insight in this area I'd greatly appreciate it!

Topic: audio-recognition, anomaly-detection, class-imbalance, deep-learning

Category: Data Science


That is an empirical question that can be answered with hold-out datasets: create the different validation scenarios, and see under which one the resulting model performs better.
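A hypothetical sketch of that comparison (all data and numbers are invented): log a per-epoch validation metric under each scheme, pick the best checkpoint under each, and let a realistic held-out test set be the final judge.

```python
# Hypothetical comparison: which validation scheme picks the better checkpoint?
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(1)

# Invented stand-ins: per-epoch model scores on a realistic test set (positives
# get a growing boost to mimic a model improving over epochs).
n_test = 20000
y_test = (rng.random(n_test) < 0.001).astype(int)            # ~0.1% alarms
scores_by_epoch = [rng.random(n_test) + 0.3 * e * y_test
                   for e in range(5)]

# Invented per-epoch validation metrics logged under the two schemes.
ap_balanced_val = [0.61, 0.70, 0.74, 0.73, 0.71]    # 50/50 validation set
ap_realistic_val = [0.08, 0.12, 0.11, 0.15, 0.13]   # prevalence-matched set

best_balanced = int(np.argmax(ap_balanced_val))
best_realistic = int(np.argmax(ap_realistic_val))

# Judge both picks on the same realistic held-out test set.
for name, epoch in [("balanced-val pick", best_balanced),
                    ("realistic-val pick", best_realistic)]:
    ap = average_precision_score(y_test, scores_by_epoch[epoch])
    print(f"{name}: epoch {epoch}, test AP = {ap:.4f}")
```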
