Association Rule Mining across two market baskets

I am quite familiar with association rule mining, but I need to use it to find associations ACROSS two market baskets instead of WITHIN a single market basket.

Imagine customers go to Store A and buy a certain number of products. The same customers then go to Store B and buy another set of products. I want to find associations between the two stores, not within either store.

So I want to make "A --> B" statements like

"Customers that Bought x and y from Store A also purchased z from Store B"

I could lump all the purchases together as one market basket and run an association mining algorithm on it, but in that case the algorithm would not control for where the items were purchased.

Clearly, one alternative is to find all rules using the combined market basket and then exclude the ones where the left-hand side is not made up purely of Store A items (or the right-hand side purely of Store B items) in the A -> B relationship.

Any other ideas would be great.

Topic market-basket-analysis association-rules text-mining machine-learning

Category Data Science


Option 1

One option is to treat each basket as a single composite product and analyze the data by customer. Consider the following example:

Customer 001 goes to store A and buys {Banana, Milk}. Then he goes to store B and buys {Eggs, Oranges}. Call {Banana, Milk} product 01 and {Eggs, Oranges} product 001.

Customer 002 goes to store A and buys {Banana, Milk}. Then he goes to store B and buys {Apple, Watermelon}. From the previous customer's data, you see that {Banana, Milk} already exists, so you again call it product 01, whereas {Apple, Watermelon} is a new product that you call product 002.

You perform this for all of the itemsets for all customers.

In the above example, Customer 001's total receipt contains product 01 and product 001. Customer 002's total receipt contains product 01 and product 002.

Note: product with one leading zero (01) comes from store A, product with two leading zeros (001, 002) comes from store B.

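Now you can run a regular Market Basket Analysis with association rules in Python and pick out the rules you are looking for, using the leading zeros to tell which store each product came from. Note that this approach treats each basket as an indivisible unit, so the resulting rules relate whole baskets rather than individual items.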
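The labelling step in Option 1 can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the `purchases` dictionary is toy data taken from the example above, the function names are made up for this sketch, and the labels use explicit `A-` and `B-` prefixes instead of counting leading zeros, since a prefix stays unambiguous once there are more than nine distinct baskets.

```python
from collections import Counter

# Toy data from the example: customer -> (basket at Store A, basket at Store B).
purchases = {
    "001": ({"Banana", "Milk"}, {"Eggs", "Oranges"}),
    "002": ({"Banana", "Milk"}, {"Apple", "Watermelon"}),
}

def label_baskets(purchases):
    """Replace each distinct basket with a composite product ID, per store."""
    ids_a, ids_b = {}, {}
    receipts = {}
    for customer, (basket_a, basket_b) in sorted(purchases.items()):
        key_a, key_b = frozenset(basket_a), frozenset(basket_b)
        # setdefault reuses the existing ID when the same basket was seen before.
        id_a = ids_a.setdefault(key_a, f"A-{len(ids_a) + 1:02d}")
        id_b = ids_b.setdefault(key_b, f"B-{len(ids_b) + 1:03d}")
        receipts[customer] = (id_a, id_b)
    return receipts

def basket_rules(receipts):
    """Support and confidence for every 'Store A basket -> Store B basket' pair."""
    n = len(receipts)
    pair_counts = Counter(receipts.values())
    a_counts = Counter(a for a, _ in receipts.values())
    return [
        {
            "antecedent": a,
            "consequent": b,
            "support": count / n,
            "confidence": count / a_counts[a],
        }
        for (a, b), count in pair_counts.items()
    ]

receipts = label_baskets(purchases)
rules = basket_rules(receipts)
for rule in rules:
    print(rule)
```

Because every receipt holds exactly one Store A product and one Store B product, counting pairs directly is enough here; a general-purpose miner would give the same rules once filtered by prefix.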

Option 2

Your other option is to encode the products beforehand. Using the above example, Banana and Milk can be coded as 1A and 2A (from store A), and Eggs, Oranges, Apple and Watermelon can be coded as 1B, 2B, 3B and 4B (from store B).

Then you combine everything into one dataset and give each customer a single receipt number that covers the items from both stores, so that they all sit in one transaction. For Customer 001, it will be {1A, 2A, 1B, 2B}; for Customer 002, it will be {1A, 2A, 3B, 4B}.

Now you can run a regular Market Basket Analysis with association rules in Python and keep only the rules whose left-hand side contains A-coded items and whose right-hand side contains B-coded items, using the trailing A or B to tell in which store a customer made the purchase.
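A minimal sketch of Option 2 in plain Python follows. It is not Apriori: it simply enumerates small itemsets by brute force, which is fine for toy data but would need a proper miner on real data. The `transactions` dictionary reuses the codes from the example, and the function names, thresholds and `max_size` limit are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# One combined transaction per customer, using the store-suffixed codes above.
transactions = {
    "001": {"1A", "2A", "1B", "2B"},
    "002": {"1A", "2A", "3B", "4B"},
}

def subsets(items, max_size):
    """Yield every non-empty subset of items with at most max_size elements."""
    ordered = sorted(items)
    for size in range(1, max_size + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)

def cross_store_rules(transactions, min_support=0.5, min_confidence=0.5, max_size=2):
    """Rules whose antecedent is A-only and whose consequent is B-only."""
    n = len(transactions)
    a_counts, pair_counts = Counter(), Counter()
    for basket in transactions.values():
        # Split the combined basket by the store suffix of each code.
        a_sets = list(subsets({i for i in basket if i.endswith("A")}, max_size))
        b_sets = list(subsets({i for i in basket if i.endswith("B")}, max_size))
        a_counts.update(a_sets)
        pair_counts.update((a, b) for a in a_sets for b in b_sets)
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                # share of customers with A-set and B-set
        confidence = count / a_counts[a]   # share of A-set buyers who also bought B-set
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

rules = cross_store_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), support, confidence)
```

Building the antecedent only from A-coded items and the consequent only from B-coded items means mixed rules are never generated in the first place, which has the same effect as mining the combined baskets and filtering afterwards.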
