Implementing Frequently bought together using a DB

We have a classic online shop database (products, customers, sales) and we want to implement a "Frequently bought together" feature. Our software is written in ASP.NET, and we do not know PHP well enough to reverse engineer how Magento does this.

All we need is a simple "Frequently bought together" list, without the discounts that Magento offers.

I understand that this is a machine learning problem and that one of the more common approaches is the Jaccard coefficient. Is that the recommended way?

My main question, however, is what the view I create in the DB should look like. What table structure do the Jaccard coefficient and other relevant tests need in order to work best?

Topic jaccard-coefficient databases machine-learning

Category Data Science


This is called association rule mining. A very basic version of the algorithm is easy to implement and works fine for small or mid-size datasets. What it does is find the sets of items that occur together most often across all the transactions. In your case, it finds the items that customers of your online store frequently purchase together. I implemented one in .NET for a course, and it took me only two days to understand and code it.

The format of the data really depends on the implementation. For instance, my algorithm reads a text file in which each line is a tuple of items:

1,2,3
4,3,5
6,1,4,5,1
7,6,5

Each line is a transaction, and the numbers are the items purchased by a user at one checkout. The algorithm also needs a number called the threshold (often called the minimum support). It tells the algorithm to keep only the combinations of items that appear together in at least X of the transactions, where X can be an absolute count or a percentage. The result is a set of rows, each holding a combination of items that were bought together and met the threshold.
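To make the idea concrete, here is a minimal Python sketch (not the .NET implementation mentioned above) that reads the sample transactions and keeps every pair of items bought together at least a given number of times. The function names `parse_transactions` and `frequent_pairs` and the parameter `min_count` are made up for illustration, and the sketch only looks at pairs, which is a simplification of full association rule mining.

```python
from collections import Counter
from itertools import combinations


def parse_transactions(text):
    """Turn lines like '1,2,3' into sets of item ids.

    Using a set means an item repeated within one line counts once.
    """
    transactions = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            transactions.append({int(tok) for tok in line.split(",")})
    return transactions


def frequent_pairs(transactions, min_count):
    """Count each unordered pair of items sharing a transaction and
    keep the pairs seen in at least min_count transactions."""
    counts = Counter()
    for items in transactions:
        # sorted() gives every pair a canonical (low, high) order
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# The four sample transactions from the text above
SAMPLE = """1,2,3
4,3,5
6,1,4,5,1
7,6,5
"""

transactions = parse_transactions(SAMPLE)
pairs = frequent_pairs(transactions, min_count=2)
print(pairs)
```

With a threshold of 2, only the pairs (4, 5) and (5, 6) survive on this sample. In a database, the same input would presumably be a table or view with one row per (order id, product id), and the pair counting would become a self-join on the order id; those column names are assumptions, not something taken from your schema.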
