Is it a best practice to exclude retweets from the data set?
I am going to build a machine learning model to identify fake tweets. The data set contains a large number of retweets, which I think might be an issue. Given that the focus is on the original tweet, would it be better to remove all the retweets?
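For reference, this is roughly the filtering I have in mind. It is only a sketch: the `text` and `label` column names are hypothetical, the sample rows are made up for illustration, and it assumes retweets can be recognized by a leading `RT @`, which may not hold for every data set.

```python
import pandas as pd

# Made-up sample rows for illustration; column names are assumptions.
tweets = pd.DataFrame({
    "text": [
        "Breaking: something happened",
        "RT @someone: Breaking: something happened",
        "Another original tweet",
        "RT @other: Another original tweet",
    ],
    "label": [1, 1, 0, 0],
})

# Assumption: a retweet is any tweet whose text starts with "RT @".
is_retweet = tweets["text"].str.startswith("RT @")

# Keep only the original tweets and renumber the rows.
originals = tweets.loc[~is_retweet].reset_index(drop=True)

print(f"Dropped {is_retweet.sum()} retweets, kept {len(originals)} originals")
```

If the data set has a dedicated retweet flag or an ID pointing to the original tweet, filtering on that field would probably be more reliable than matching on the text.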
Thank you.
Topic supervised-learning pandas python machine-learning
Category Data Science