MemoryError when computing cosine_similarity on a large matrix
I have a dataset of shape (20905040, 7) that I am using to recommend 10 different products to each user.
It could grow larger than that, but I already get a memory error when computing the cosine similarity:
cosine_sim = cosine_similarity(normalized_df,normalized_df)
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
in
      1 get_ipython().run_line_magic('time', '')
----> 2 cosine_sim = cosine_similarity(normalized_df,normalized_df)

~/venv/lib/python3.6/site-packages/sklearn/metrics/pairwise.py in cosine_similarity(X, Y, dense_output)
   1034
   1035     K = safe_sparse_dot(X_normalized, Y_normalized.T,
-> 1036                         dense_output=dense_output)
   1037
   1038     return K

~/venv/lib/python3.6/site-packages/sklearn/utils/extmath.py in safe_sparse_dot(a, b, dense_output)
    140         return ret
    141     else:
--> 142         return np.dot(a, b)
    143
    144

MemoryError:
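For scale, here is a rough estimate of the dense result that np.dot is asked to allocate. It assumes the output is float64 (8 bytes per value), which is an assumption about normalized_df and not something the traceback states. The helper name dense_similarity_bytes is made up for illustration.

```python
# Back-of-the-envelope size of the full n x n similarity matrix.
n_rows = 20_905_040   # number of rows, taken from the dataset shape (20905040, 7)
bytes_per_value = 8   # assumption: float64 output


def dense_similarity_bytes(n, itemsize=8):
    """Bytes needed to hold a dense n x n matrix with the given item size."""
    return n * n * itemsize


total_bytes = dense_similarity_bytes(n_rows, bytes_per_value)
print(f"{total_bytes / 1e15:.2f} PB")
```

Under that assumption the full matrix would need roughly 3.5 petabytes, far beyond the RAM of any single machine, so the failure comes from the size of the output and not from the 7-column input.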
Questions:
1. When I have this many rows, how do I apply cosine similarity?
2. Is this error about RAM, or some other kind of memory?
3. Is there a way to use a GPU to compute cosine similarity?
4. Any other good ideas?
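Regarding question 1, one possible direction is to never build the full matrix and instead process a few rows at a time, keeping only the 10 best matches per row. The following is a minimal sketch using plain NumPy, not scikit-learn's own method; the function name top_k_cosine_chunked and the parameters k and chunk_size are hypothetical, and it assumes the data fits in memory as a float32 array.

```python
import numpy as np


def top_k_cosine_chunked(X, k=10, chunk_size=32):
    """Return (indices, similarities) of the k most similar other rows for each row of X.

    Hypothetical helper: similarities are computed one block of rows at a time,
    so only a (chunk_size, n) block exists in memory at any moment.
    Requires k to be smaller than the number of rows.
    """
    X = np.asarray(X, dtype=np.float32)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0              # avoid division by zero for all-zero rows
    Xn = X / norms                       # unit-length rows: dot product == cosine similarity
    n = Xn.shape[0]

    top_idx = np.empty((n, k), dtype=np.int64)
    top_sim = np.empty((n, k), dtype=np.float32)

    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        sims = Xn[start:stop] @ Xn.T     # block of shape (stop - start, n)
        # Exclude each row's similarity with itself.
        sims[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        # Select the k largest per row (unordered), then sort just those k.
        part = np.argpartition(-sims, k - 1, axis=1)[:, :k]
        part_sims = np.take_along_axis(sims, part, axis=1)
        order = np.argsort(-part_sims, axis=1)
        top_idx[start:stop] = np.take_along_axis(part, order, axis=1)
        top_sim[start:stop] = np.take_along_axis(part_sims, order, axis=1)

    return top_idx, top_sim


# Tiny usage example with made-up data.
X_demo = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
idx_demo, sim_demo = top_k_cosine_chunked(X_demo, k=1, chunk_size=3)
```

With about 21 million rows, each float32 block of 32 rows is already around 2.7 GB before temporaries, so chunk_size has to stay small. The total work is still quadratic in the number of rows, so this only addresses memory, not running time; an approximate nearest-neighbor index may be a better fit at this scale.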
Topic movielens cosine-distance scikit-learn recommender-system
Category Data Science