Spark ALS-WR giving the same recommended items for all users

We are trying to build a recommendation system for a supermarket with diverse item types (ranging from fast-moving groceries to slow-moving electronic items). Some items are purchased frequently and in high volume, while others are purchased only once.

We have 4 months of purchase history from 25K+ customers across 30K+ SKUs in 100+ departments. We ran ALS-WR in Spark to generate recommendations. To our surprise, the top 15 recommendations for each customer are quite generic, with little variation between customers.

We have tried several means to diversify the recommendations, as below:

- computed "rating" = normalized # of purchase
- computed "rating" = log of # of purchase
- computed "rating" = 1 (if purchase # > 1)
- used the following combinations of parameters: lambda = 0.01 to 300, alpha = 5 to 50, rank = 10, 20, 30 and # of iterations = 10, 20
- preference considered is explicit
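For concreteness, the three "rating" variants listed above could be sketched as follows. This is only an illustration in plain Python, not the code used in the question: the function name build_ratings, normalizing by each user's largest count, using log(1 + count) rather than a plain log, and assigning 0 to single purchases in the binary case are all assumptions.

```python
import math
from collections import defaultdict


def build_ratings(purchases, scheme="log"):
    """Turn (user, item, count) tuples into (user, item, rating) tuples.

    Hypothetical helper: the schemes mirror the question's list, but the
    exact formulas are assumptions.
    """
    purchases = list(purchases)
    if scheme == "normalized":
        # Assumption: divide by the user's largest count, so ratings lie in (0, 1].
        max_per_user = defaultdict(int)
        for user, _, count in purchases:
            max_per_user[user] = max(max_per_user[user], count)
        return [(u, i, c / max_per_user[u]) for u, i, c in purchases]
    if scheme == "log":
        # Assumption: log(1 + count), so a single purchase is not mapped to 0.
        return [(u, i, math.log1p(c)) for u, i, c in purchases]
    if scheme == "binary":
        # 1 if the item was bought more than once; assumption: 0 otherwise.
        return [(u, i, 1.0 if c > 1 else 0.0) for u, i, c in purchases]
    raise ValueError(f"unknown scheme: {scheme}")
```

Whichever scheme is used, items bought by almost every customer still receive non-trivial ratings for almost every customer, which may be one reason the top of each list looks the same.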

Do you think that ALS can be used for such heterogeneous data? If yes, what modifications would make the recommendations diverse and personalized?

Topic apache-spark recommender-system machine-learning

Category Data Science


No - alternating least squares (ALS) is not designed for heterogeneous data.

One option for building a recommendation system for heterogeneous data is to bin the items into more common and less common items, then build a separate model for each bin. That way, the results are not dominated by common, generic items. A diverse and personalized results page can then be created by taking a weighted sample from each bin.
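A minimal sketch of the binning and blending steps, in plain Python. It assumes the per-bin models (for example, one ALS model per bin) have already been trained elsewhere and have produced a ranked item list per bin for a given user. The function names, the two-bin split, the popularity threshold, and the bin weights are all hypothetical choices made for illustration.

```python
import random


def bin_items_by_popularity(purchase_counts, threshold):
    """Split items into 'common' and 'rare' bins by total purchase count."""
    bins = {"common": set(), "rare": set()}
    for item, count in purchase_counts.items():
        bins["common" if count >= threshold else "rare"].add(item)
    return bins


def blend_recommendations(per_bin_recs, weights, k, seed=None):
    """Fill up to k slots from ranked per-bin lists.

    Each slot picks a bin at random in proportion to its weight and takes
    that bin's next-best item, so within-bin ranking is preserved.
    """
    rng = random.Random(seed)
    queues = {b: list(recs) for b, recs in per_bin_recs.items()}
    result, seen = [], set()
    while len(result) < k:
        # Only bins that still have items and a positive weight can be drawn.
        live = [b for b in queues if queues[b] and weights.get(b, 0) > 0]
        if not live:
            break
        chosen = rng.choices(live, weights=[weights[b] for b in live], k=1)[0]
        item = queues[chosen].pop(0)
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

For example, with weights such as {"common": 0.6, "rare": 0.4} and k=15, roughly 40% of each customer's list would come from the less common bin, provided that bin's model returns enough items. The weights themselves would need tuning against whatever diversity or conversion measure matters for the store.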
