movielens

Dealing with missing data in SVD

David

May 18, 2022, 08:00

I am new to machine learning and am trying to apply SVD to the MovieLens dataset for movie recommendation. I have a movie-user matrix where each row is a user ID, each column is a movie ID, and each value is a rating. I would like to normalize the matrix by subtracting each user's mean rating from that user's ratings, then pass the normalized matrix to scipy.sparse.linalg.svds as follows: from scipy.sparse.linalg import svds U, sigma, …
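A minimal sketch of the normalization step described in this question, not the asker's actual code. The toy ratings, the use of NaN to mark missing ratings, the choice k = 2, and the decision to treat a missing rating as equal to the user's mean (zero after centering) are all assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical toy data: rows are users, columns are movies, NaN = not rated.
ratings = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [np.nan, 1.0, 5.0, 4.0],
    [1.0, np.nan, 4.0, 5.0],
])

observed = ~np.isnan(ratings)

# Per-user mean computed over observed ratings only.
user_mean = np.nanmean(ratings, axis=1, keepdims=True)

# Center the observed ratings; missing entries become 0, which is the
# assumption "a missing rating equals the user's mean".
centered = np.where(observed, ratings - user_mean, 0.0)

k = 2  # number of latent factors (assumed); svds needs k < min(centered.shape)
U, sigma, Vt = svds(csr_matrix(centered), k=k)

# Rank-k reconstruction, with the user means added back.
predicted = U @ np.diag(sigma) @ Vt + user_mean
```

Computing the mean over all entries, with missing ratings stored as 0, would pull every user's mean toward zero, which is why the sketch averages only over observed ratings.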

Topic: movielens missing-data machine-learning

Category: Data Science

How to estimate missing values when calculating NDCG

Michał Misiewicz

April 30, 2022, 19:00

I would like to compare recommendation methods using the NDCG metric on the MovieLens dataset. In a ranking problem, the goal is to rank items by their relevance to the user. Ranking models can be learned from a ratings matrix in which each user rates a small subset of all items; ratings for the other items are unknown. Collaborative filtering methods can be used to build a model that generalizes from the training data and predicts ratings for unrated items. Let's consider the following example on a dataset consisting of 5 …
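A small sketch of NDCG@k under one common convention, which is an assumption here rather than something the question states: only a user's held-out rated items are ranked, so unrated items never need an estimated relevance. The function names and the linear gain (rating divided by a log2 discount) are illustrative choices.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k relevances (linear gain)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(true_ratings, predicted_scores, k):
    """NDCG@k for one user.

    Both arguments are dicts mapping item id -> value and cover only the
    items this user actually rated in the held-out set.
    """
    # Order the rated items by the model's predicted score, best first.
    ranked = sorted(true_ratings, key=lambda item: predicted_scores[item],
                    reverse=True)
    gains = [true_ratings[item] for item in ranked]
    # The ideal ordering sorts by the true ratings.
    ideal = sorted(true_ratings.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0

# Hypothetical held-out ratings and two sets of model scores.
held_out = {"a": 5, "b": 3, "c": 1}
good_model = {"a": 4.8, "b": 3.1, "c": 1.2}   # same order as the truth
bad_model = {"a": 1.0, "b": 2.0, "c": 3.0}    # reversed order

ndcg_good = ndcg_at_k(held_out, good_model, k=3)
ndcg_bad = ndcg_at_k(held_out, bad_model, k=3)
```

Treating every unrated item as relevance 0 is another convention; it penalizes a model for ranking items the user simply never saw, so results under the two conventions are not comparable.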

Topic: movielens ndcg evaluation recommender-system

Category: Data Science

Dot product of two matrices in NLP: how can I solve this error?

Khurram Sarfraz Abbasi

May 19, 2021, 18:55

from sklearn.metrics.pairwise import linear_kernel
sim_matrix = linear_kernel(tfidf_matrix, tfidf_matrix)

When I try to compute the dot product, I get this error:

MemoryError                               Traceback (most recent call last)
<ipython-input-19-2c4d43d4a89e> in <module>
      1 from sklearn.metrics.pairwise import linear_kernel
----> 2 sim_matrix = linear_kernel(tfidf_matrix, tfidf_matrix)

~\anaconda3\lib\site-packages\sklearn\metrics\pairwise.py in linear_kernel(X, Y, dense_output)
   1002     """
   1003     X, Y = check_pairwise_arrays(X, Y)
-> 1004     return safe_sparse_dot(X, Y.T, dense_output=dense_output)
   1005
   1006

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
…
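The traceback shows that linear_kernel takes a dense_output parameter, and by default the result is converted to a dense array. One possible mitigation, sketched here on a hypothetical four-document corpus, is to keep the result sparse. Whether this is enough for the asker's data is an assumption: it only saves memory when most document pairs share no terms.

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical corpus standing in for the asker's documents.
docs = [
    "a space adventure with robots",
    "robots fall in love in space",
    "a quiet drama about a family",
    "family comedy set at a wedding",
]

# TfidfVectorizer returns a sparse matrix with L2-normalized rows by default,
# so the linear kernel of its rows equals their cosine similarity.
tfidf_matrix = TfidfVectorizer().fit_transform(docs)

# dense_output=False keeps the product sparse when both inputs are sparse.
sim_matrix = linear_kernel(tfidf_matrix, tfidf_matrix, dense_output=False)
```

Pairs of documents with no terms in common are simply not stored, which is where the memory saving comes from.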

Topic: movielens scikit-learn recommender-system

Category: Data Science

memory error in matrix cosine_similarity

jake Monk

August 24, 2020, 05:03

I have a dataset of shape (20905040, 7) from which I want to recommend 10 different products to the user. It could grow larger than that, but I already get a memory error when computing:

cosine_sim = cosine_similarity(normalized_df,normalized_df)

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
 in
      1 get_ipython().run_line_magic('time', '')
----> 2 cosine_sim = cosine_similarity(normalized_df,normalized_df)

~/venv/lib/python3.6/site-packages/sklearn/metrics/pairwise.py in cosine_similarity(X, Y, dense_output)
   1034
   1035     K = safe_sparse_dot(X_normalized, Y_normalized.T,
-> 1036                         dense_output=dense_output)
   1037
   1038     return K

~/venv/lib/python3.6/site-packages/sklearn/utils/extmath.py in safe_sparse_dot(a, b, dense_output)
    140         return ret
    141     else:
--> 142         return …
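With 20905040 rows, the full pairwise similarity matrix would have roughly 4.4 × 10^14 entries, so it cannot be materialized. If only the 10 most similar rows per row are needed, one option is to compute similarities in batches and keep just the top k. The sketch below is an illustration under that assumption; the function name top_k_cosine, the batch size, and the random data are hypothetical.

```python
import numpy as np

def top_k_cosine(X, k=10, batch_size=1024):
    """Return, for each row of X, the indices of its k most similar other rows.

    Only a (batch_size, n_rows) block of similarities exists at any time.
    Requires k < number of rows.
    """
    X = np.asarray(X, dtype=np.float32)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for all-zero rows
    Xn = X / norms                   # unit-length rows: dot product = cosine
    n = Xn.shape[0]
    top_idx = np.empty((n, k), dtype=np.int64)

    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        sims = Xn[start:stop] @ Xn.T                      # (batch, n) block
        # Exclude each row's similarity with itself.
        sims[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        # Unordered top-k per row, then sort those k by similarity.
        part = np.argpartition(-sims, k - 1, axis=1)[:, :k]
        order = np.argsort(-np.take_along_axis(sims, part, axis=1), axis=1)
        top_idx[start:stop] = np.take_along_axis(part, order, axis=1)
    return top_idx

# Small hypothetical example with the same number of columns as the question.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 7))
neighbors = top_k_cosine(data, k=3, batch_size=16)
```

Each batch needs about batch_size × n_rows × 4 bytes, so at the size in the question the batch would have to be far smaller than the default used here, and the loop would still perform on the order of n² multiplications. An approximate nearest-neighbor index may be the more practical route at that scale.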

Topic: movielens cosine-distance scikit-learn recommender-system

Category: Data Science

What is the current state of the art solution for MovieLens 100k / 20M?

Martin Thoma

July 1, 2019, 05:35

I found "Basic recommendation system for MovieLens dataset using Keras", which has a solution that works OK (MAE 0.84). What is the current state of the art for this dataset?

Topic: movielens recommender-system

Category: Data Science
