How to split train/test in recommender systems
I am working with the MovieLens 10M dataset, predicting user ratings. If I want to evaluate my algorithm fairly, how should I split the data into training and test sets?
By default, I believe the data is split into train and test sets such that the test set contains movies that never appear in the training set. My model requires each movie to have been seen at least once during training, so how should I split the data instead? Should I hold out N ratings from each user, train on the rest, and evaluate on the N × (number of users) held-out ratings?
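To make the scheme I have in mind concrete, here is a minimal sketch of a per-user leave-N-out split. It assumes the ratings are already loaded as (user, movie, rating) tuples. The function name `leave_n_out_split`, its parameters, and the toy ratings are made up for illustration and do not come from the MovieLens tooling. Users with N or fewer ratings are kept entirely in training, and any held-out rating whose movie would otherwise be missing from training is moved back, so the test set can end up slightly smaller than N × (number of users).

```python
import random
from collections import defaultdict


def leave_n_out_split(ratings, n=2, seed=0):
    """Hold out up to n ratings per user (hypothetical helper, not a library API).

    ratings: list of (user_id, movie_id, rating) tuples.
    Returns (train, test) lists of the same tuples.
    """
    rng = random.Random(seed)

    # Group the ratings by user so each user is split independently.
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    train, test = [], []
    for rows in by_user.values():
        rows = rows[:]          # copy so the caller's data is not reordered
        rng.shuffle(rows)
        if len(rows) > n:
            test.extend(rows[:n])
            train.extend(rows[n:])
        else:
            # Too few ratings to hold any out: keep this user in training only.
            train.extend(rows)

    # Ensure every movie in the test set also appears in the training set.
    train_movies = {movie for _, movie, _ in train}
    kept_test = []
    for row in test:
        if row[1] in train_movies:
            kept_test.append(row)
        else:
            train.append(row)
            train_movies.add(row[1])

    return train, kept_test


# Toy data, invented for illustration only.
ratings = [
    ("u1", "m1", 5.0), ("u1", "m2", 3.0), ("u1", "m3", 4.0),
    ("u2", "m1", 2.0), ("u2", "m2", 4.5), ("u2", "m4", 1.0),
    ("u3", "m3", 3.5),
]
train, test = leave_n_out_split(ratings, n=1, seed=42)
print(len(train), "training ratings,", len(test), "test ratings")
```

A random hold-out like this ignores timestamps. Holding out each user's most recent N ratings instead would be a small change to the shuffle step, if the ratings carried a time field.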
Topic dataset recommender-system machine-learning
Category Data Science