How to aggregate data inserted by users to avoid outliers?
I'm developing a new application based on machine learning. In this application, users can insert new data to improve the prediction system. As you may guess, users could insert data that doesn't make sense, producing outliers that may harm prediction accuracy. I'm pretty new to this field, so I would like to ask: do you know of any strategy to mitigate this, for example a voting or aggregation system? If so, do you have any hints, or could you point me to some theoretical topics on this?
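To make the idea of an "aggregating system" concrete, here is a minimal sketch of one possible approach, assuming several users submit a numeric value for the same item. It summarizes the submissions with the median and flags values whose modified z-score (based on the median absolute deviation, MAD) exceeds a cutoff. The function name `robust_aggregate`, the sample values, and the cutoff of 3.5 are illustrative assumptions, not part of the actual application.

```python
from statistics import median


def robust_aggregate(values, threshold=3.5):
    """Return (aggregate, kept, rejected) for a list of numeric submissions.

    Hypothetical helper: values whose modified z-score exceeds `threshold`
    are treated as outliers; the aggregate is the median of the rest.
    """
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median([abs(v - center) for v in values])

    def is_outlier(v):
        if mad == 0:
            # At least half of the submissions agree exactly; flag anything else.
            return v != center
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        # for normally distributed data.
        return 0.6745 * abs(v - center) / mad > threshold

    kept = [v for v in values if not is_outlier(v)]
    rejected = [v for v in values if is_outlier(v)]
    return median(kept), kept, rejected


# Made-up example: five users submit a value for the same item.
submissions = [10.1, 9.8, 10.3, 10.0, 250.0]
aggregate, kept, rejected = robust_aggregate(submissions)
print(aggregate, kept, rejected)
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme submission can shift the latter two enough to hide itself. Whether this fits depends on the data: it assumes one numeric value per item and several independent submissions for each.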
Topic aggregation data outlier
Category Data Science