K-means clustering for recommendation

I have a file with 50,000 rows from a library platform. Each row represents one user and records, in order, the categories of the books that user has selected. The books come from various categories (e.g. novels, history, etc.), and there are 10 categories in total. A row could be, for example, 334664, which means this user has selected books from categories 3, 4 and 6. How can I use this data to build a recommendation system using the k-means clustering algorithm? I would appreciate it if somebody could walk me through the whole process step by step.

Topic beginner algorithms k-means clustering data-mining

Category Data Science


Since your data is sequential, you could try sequence models (RNN, LSTM, GRU, etc.). With these you can predict what the user will select next, given the books selected so far, and use that prediction as the recommendation. With this approach the input sequence can have any length (e.g. 3345, 33456, 334567).
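As a rough illustration of how a row could be framed for such a model, the sketch below splits one sequence into (prefix, next selection) training pairs. The function name `prefix_target_pairs` is hypothetical, and the step of feeding these pairs to an actual RNN/LSTM/GRU is left out.

```python
def prefix_target_pairs(row):
    """Split one selection sequence into (prefix, next item) training pairs."""
    # Every prefix of length 1..len(row)-1 is an input; the item right after it is the target.
    return [(row[:i], row[i]) for i in range(1, len(row))]


pairs = prefix_target_pairs("334664")
for prefix, target in pairs:
    print(f"input={prefix!r} -> next={target!r}")
```

Each pair has a different input length, which is why a sequence model is a natural fit here.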

But to answer your question about KMeans: I assume all the rows have the same length.

  1. Cluster all the data with KMeans.
  2. Choose the number of clusters K with the elbow method.
  3. For each cluster, collect the UNIQUE books.
  4. For each user:
     a. Predict the user's cluster.
     b. From that cluster's unique books, find the books the user has not already read (cluster books minus user books).
     c. Suggest these books to the user.
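The steps above could be sketched roughly as follows with scikit-learn. This is only a minimal sketch on made-up toy rows, and it makes several assumptions that are not in the question: categories are labelled with the digits 0-9, each row is converted to normalized per-category counts (so rows of different lengths are comparable), and the "items" being recommended are category digits, because that is all the rows contain. The names `to_counts` and `recommend` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

N_CATEGORIES = 10  # the question states there are 10 categories (assumed to be digits 0-9)

def to_counts(row):
    """Turn a selection string such as "334664" into normalized per-category counts."""
    counts = np.zeros(N_CATEGORIES)
    for ch in row:
        counts[int(ch)] += 1
    return counts / counts.sum()  # assumes the row is not empty

# Toy data standing in for the 50,000-row file.
rows = ["334664", "3344", "4663", "1221", "2112", "1127", "8899", "9988", "8809"]
X = np.array([to_counts(r) for r in rows])

# Elbow method: compute the inertia for several K and look for the bend by inspection.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 6)
}
print(inertias)

# Cluster with the chosen K (3 here, picked by hand for the toy data).
K = 3
model = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)

# Unique items seen in each cluster.
cluster_items = {c: set() for c in range(K)}
for row, label in zip(rows, model.labels_):
    cluster_items[int(label)].update(row)

def recommend(row):
    """Items seen in the user's cluster that the user has not selected yet."""
    cluster = int(model.predict(to_counts(row).reshape(1, -1))[0])
    return sorted(cluster_items[cluster] - set(row))

print(recommend("34"))
```

With real book IDs instead of category digits, the same set difference in `recommend` would yield unread books rather than unseen categories.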

This is just one way to get started. Depending on the business case, you can find many alternatives.

One alternative I can suggest is to create similarity embeddings for each book and recommend the books most similar (by a distance measure) to the ones the user has read.
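A very simple version of this idea can be sketched in plain Python. Here the "embedding" of an item is assumed to be just its vector of selection counts across users, and similarity is cosine similarity. Learned embeddings would normally replace this, and the toy rows and the name `most_similar` are hypothetical.

```python
import math

# Toy rows; each character is one selected item (a category digit in the question's data).
rows = ["334664", "3344", "4663", "1221", "2112", "1127"]

# Embedding of an item: how many times each user selected it (one dimension per user).
items = sorted({ch for row in rows for ch in row})
embedding = {item: [row.count(item) for row in rows] for item in items}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(item, exclude=()):
    """Return the item closest to `item`, skipping anything listed in `exclude`."""
    candidates = [o for o in items if o != item and o not in exclude]
    return max(candidates, key=lambda o: cosine(embedding[item], embedding[o]))

print(most_similar("3"))                 # closest item to "3"
print(most_similar("3", exclude={"4"}))  # closest item the user has not already selected
```

Passing the user's already-selected items as `exclude` keeps the suggestion to something new for that user.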

Hope this helps.
