Train-test split for a recommender system
In all implementations of recommender systems I've seen so far, the train-test split is performed in this manner:
+------+------+--------+
| user | item | rating |
+------+------+--------+
| u1   | i1   |    2.3 |
| u2   | i2   |    5.3 |
| u1   | i4   |    1.0 |
| u3   | i5   |    1.6 |
| ...  | ...  |    ... |
+------+------+--------+
This is transformed into a rating matrix of the form:
+------+-------+-------+-------+-------+-------+-----+
| user | item1 | item2 | item3 | item4 | item5 | ... |
+------+-------+-------+-------+-------+-------+-----+
| u1   | 2.3   | 1.7   |   0.5 |   1.0 | NaN   | ... |
| u2   | NaN   | 5.3   |   1.0 |   0.2 | 4.3   | ... |
| u3   | NaN   | NaN   |   2.1 |   1.3 | 1.6   | ... |
| ...  | ...   | ...   |   ... |   ... | ...   | ... |
+------+-------+-------+-------+-------+-------+-----+
where NaN corresponds to the situation where a user has not rated that particular item.
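
For concreteness, here is a minimal sketch of that transformation, assuming the triples sit in a pandas DataFrame (the column names are illustrative):

    import pandas as pd

    ratings = pd.DataFrame({
        "user":   ["u1", "u2", "u1", "u3"],
        "item":   ["i1", "i2", "i4", "i5"],
        "rating": [2.3, 5.3, 1.0, 1.6],
    })

    # long (user, item, rating) triples -> wide user x item matrix;
    # any (user, item) pair without a rating becomes NaN
    R = ratings.pivot(index="user", columns="item", values="rating")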
Now, from each row (user) of the matrix, a certain percentage of the numeric (non-NaN) values is removed and set aside in a new matrix, which serves as the test set. The model is then trained on the initial matrix with the test samples removed, and the goal of the recommender is to fill in the missing values with the smallest possible error.
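
A minimal sketch of that per-user holdout, assuming the rating matrix is a float NumPy array with NaN marking unrated items (the function name is my own, not a standard API):

    import numpy as np

    def holdout_split(ratings, test_frac=0.2, seed=42):
        """Per-user holdout: for each row, move test_frac of the
        observed (non-NaN) ratings into a separate test matrix."""
        rng = np.random.default_rng(seed)
        train = ratings.copy()
        test = np.full_like(ratings, np.nan)
        for u in range(ratings.shape[0]):
            observed = np.flatnonzero(~np.isnan(ratings[u]))
            n_test = int(np.floor(test_frac * observed.size))
            if n_test == 0:
                continue  # too few ratings to hold anything out
            held_out = rng.choice(observed, size=n_test, replace=False)
            test[u, held_out] = ratings[u, held_out]
            train[u, held_out] = np.nan
        return train, test

    # toy example: 3 users x 5 items, NaN = unrated
    R = np.array([
        [2.3,    1.7,    0.5, 1.0, np.nan],
        [np.nan, 5.3,    1.0, 0.2, 4.3],
        [np.nan, np.nan, 2.1, 1.3, 1.6],
    ])
    train, test = holdout_split(R, test_frac=0.25)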
My question is: can the train-test split somehow be done user-wise? For example, could I keep a set of users separate, train the recommender on the remaining users, and then try to predict the ratings for the held-out users? I know this goes somewhat against the idea that "if a recommender does not know you, it cannot recommend something you like", but I am wondering whether some k-NN approach could work here.
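
To make the idea concrete, below is a minimal sketch of such a user-wise split, assuming a pandas rating matrix like the one above; a held-out user still reveals a few ratings at prediction time, and is matched against training users on those shared items. The function name and the similarity measure (mean absolute difference) are my own illustrative choices, not an established implementation:

    import numpy as np
    import pandas as pd

    R = pd.DataFrame(
        [[2.3,    1.7,    0.5, 1.0, np.nan],
         [np.nan, 5.3,    1.0, 0.2, 4.3],
         [np.nan, np.nan, 2.1, 1.3, 1.6]],
        index=["u1", "u2", "u3"],
        columns=["item1", "item2", "item3", "item4", "item5"],
    )

    # user-wise split: hold out whole users, not individual ratings
    rng = np.random.default_rng(0)
    test_users = rng.choice(R.index, size=1, replace=False)
    R_train = R.drop(index=test_users)
    R_test = R.loc[test_users]

    def predict_for_new_user(known, R_train, k=2):
        """Hypothetical k-NN predictor: 'known' holds the new user's
        observed ratings. Rank training users by mean absolute
        difference on the shared items and average the k closest."""
        common = R_train.reindex(columns=known.index)
        sims = -common.sub(known, axis=1).abs().mean(axis=1)
        neighbours = sims.nlargest(k).index
        return R_train.loc[neighbours].mean(axis=0)

    # pretend the held-out user reveals two ratings at test time
    known = pd.Series({"item3": 2.1, "item4": 1.3})
    print(predict_for_new_user(known, R_train, k=2))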
Topic software-recommendation dataset recommender-system
Category Data Science