Temporal train/test split for recommender systems

When evaluating a collaborative filtering recommender system, it is practical to split the data temporally. However, doing so can leave some users present in only one of the train and test sets. Consider the following example:

user  year
0     2020
0     2020
0     2021
1     2021
1     2021
1     2021
2     2020
2     2021
2     2021

If we decide to split by year such that ratings after 2020 will be in the test set, then:

Train           
user  year
0     2020
0     2020
2     2020

Test
user  year
0     2021
1     2021
1     2021
1     2021
2     2021
2     2021

This means that user 1 will not be in the train set at all. With matrix factorization or other latent factor models, no latent vector is learned for user 1, so when we multiply the latent factor matrices U and V to reconstruct the predicted rating matrix, it has no row for user 1 and we cannot predict that user's ratings. The same applies to items, although that is not shown here.
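A minimal sketch of the split above in plain Python, using the toy data from the example. The helper name `temporal_split` is hypothetical. It shows which users appear only in the test set and how many test rows a model trained on the train set could not score.

```python
# Toy (user, year) interactions copied from the example above.
ratings = [
    (0, 2020), (0, 2020), (0, 2021),
    (1, 2021), (1, 2021), (1, 2021),
    (2, 2020), (2, 2021), (2, 2021),
]

def temporal_split(rows, cutoff_year):
    """Rows with year <= cutoff_year go to train, later rows go to test."""
    train = [r for r in rows if r[1] <= cutoff_year]
    test = [r for r in rows if r[1] > cutoff_year]
    return train, test

train, test = temporal_split(ratings, cutoff_year=2020)

train_users = {user for user, _ in train}
# Users that appear only in the test set (cold-start users).
cold_users = {user for user, _ in test} - train_users
# Test rows that a model trained only on `train` could not score.
unscorable = [r for r in test if r[0] not in train_users]

print(cold_users)                         # {1}
print(len(unscorable), "of", len(test))   # 3 of 6
```

In this toy case, dropping user 1 from the test set discards half of the test rows, which is the data loss the question below asks about.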

How does one deal with this? Does one simply remove from the test set any users that are not in the train set? Wouldn't that waste a lot of data?



What I have done previously, and what has worked well in my domain (retail), is to first get a sense of the distribution of interactions per user. More specifically, say you want a test set consisting of 10% of the observations. Then find the threshold n such that, if you take every user with at least n interactions and hold out their latest interaction(s), you end up with roughly 10% of the data points.
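One way to search for that threshold is sketched below, assuming a flat list of user IDs with one entry per interaction. The names `holdout_fraction` and `pick_threshold` and the toy data are hypothetical, and the sketch simply picks the smallest n whose test fraction is closest to the target.

```python
from collections import Counter

def holdout_fraction(counts, n, n_holdout=1):
    """Share of all interactions that land in the test set if every user
    with at least n interactions contributes n_holdout of them."""
    eligible = sum(1 for c in counts.values() if c >= n)
    return eligible * n_holdout / sum(counts.values())

def pick_threshold(user_ids, target, n_holdout=1):
    """Return the smallest n whose test fraction is closest to target."""
    counts = Counter(user_ids)
    # n must exceed n_holdout so each test user keeps some training data.
    candidates = range(n_holdout + 1, max(counts.values()) + 1)
    if not candidates:
        raise ValueError("no user has more than n_holdout interactions")
    return min(
        candidates,
        key=lambda n: abs(holdout_fraction(counts, n, n_holdout) - target),
    )

# Hypothetical data: one user ID per interaction, 20 interactions in total.
user_ids = ["a"] * 10 + ["b"] * 5 + ["c"] * 3 + ["d"] * 2
n = pick_threshold(user_ids, target=0.10)
print(n)  # 4: users "a" and "b" qualify, giving 2 of 20 rows for the test set
```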

In your case, perhaps some fraction of the users have at least 5 interactions (ratings). Take the latest interaction(s) of those users and use them as your test set (raise or lower the threshold of 5 to change the size of the test set). This way, no user in your test set is missing from your training set, although users with very few interactions will not be represented in the test set. This train/test skew could also pose a problem, but I have found that not being able to provide recommendations for these users poses an even larger one.
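The split described above could look roughly like this in plain Python, assuming each interaction is a `(user, timestamp)` tuple. The function name `latest_holdout_split`, its parameters, and the toy data are illustrative assumptions, not part of any library.

```python
from collections import defaultdict

def latest_holdout_split(rows, min_interactions, n_holdout=1):
    """Hold out the latest n_holdout interactions of every user that has
    at least min_interactions; everything else goes to train."""
    if not 1 <= n_holdout < min_interactions:
        raise ValueError("need 1 <= n_holdout < min_interactions")
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)
    train, test = [], []
    for user_rows in by_user.values():
        user_rows.sort(key=lambda r: r[1])  # oldest interaction first
        if len(user_rows) >= min_interactions:
            train.extend(user_rows[:-n_holdout])
            test.extend(user_rows[-n_holdout:])
        else:
            # Too few interactions: keep the user in train only.
            train.extend(user_rows)
    return train, test

# Hypothetical (user, timestamp) interactions.
rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("c", 5)]
train, test = latest_holdout_split(rows, min_interactions=3)

print(test)  # [('a', 3)]
# Every test user also appears in train, so no cold-start users remain.
print({u for u, _ in test} <= {u for u, _ in train})  # True
```

Because `n_holdout` is forced to be smaller than `min_interactions`, every user in the test set keeps at least one interaction in the training set.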
