Market Basket Analysis - Data Modelling
Imagine that I have the following dataset:
Customer_ID Product_Desc
1 Jeans
1 T-Shirt
1 Food
2 Jeans
2 Food
2 Nightdress
2 T-Shirt
2 Hat
3 Jeans
3 Food
4 Food
4 Water
5 Water
5 Food
5 Beer
I need to model consumer behaviour and predict which products are associated with each other. To do that, I think a good strategy is to build the relationships first and then count the occurrences (I don't know if anyone has a better idea).
The first step is to derive these relationships, one per customer:
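To make the counting idea concrete, here is a minimal sketch in plain Python (deliberately not Pig or Spark, just the logic). It assumes the dataset is available as (Customer_ID, Product_Desc) tuples; the names rows, baskets and pair_counts are only illustrative. It collects each customer's products and then counts how many customers have each pair of products together.

```python
from collections import Counter
from itertools import combinations

# The dataset from the question as (Customer_ID, Product_Desc) rows.
rows = [
    (1, "Jeans"), (1, "T-Shirt"), (1, "Food"),
    (2, "Jeans"), (2, "Food"), (2, "Nightdress"), (2, "T-Shirt"), (2, "Hat"),
    (3, "Jeans"), (3, "Food"),
    (4, "Food"), (4, "Water"),
    (5, "Water"), (5, "Food"), (5, "Beer"),
]

# Collect the distinct products of each customer (one basket per customer).
baskets = {}
for customer_id, product in rows:
    baskets.setdefault(customer_id, set()).add(product)

# Count, for every unordered pair of products, how many baskets contain both.
# Sorting the products first gives each pair one canonical order.
pair_counts = Counter()
for products in baskets.values():
    for pair in combinations(sorted(products), 2):
        pair_counts[pair] += 1

for pair, count in pair_counts.items():
    print(pair, count)
```

With this data the pair ("Food", "Jeans") gets a count of 3, because customers 1, 2 and 3 bought both. On a large dataset the number of pairs grows quickly, so this is only meant to show the idea, not to scale.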
Jeans-T-Shirt-Food
Jeans-Food-Nightdress-T-Shirt-Hat
Jeans-Food
Food-Water
Water-Food-Beer
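As an illustration of this first step only, the relationships above can be produced by grouping the rows by customer and joining the products in their original order. This is a plain Python sketch under the assumption that the rows fit in memory; rows, baskets and relationships are made-up names. In Spark the same grouping would presumably be a groupBy on Customer_ID combined with collect_list, but I have not tried that.

```python
# The dataset from the question as (Customer_ID, Product_Desc) rows.
rows = [
    (1, "Jeans"), (1, "T-Shirt"), (1, "Food"),
    (2, "Jeans"), (2, "Food"), (2, "Nightdress"), (2, "T-Shirt"), (2, "Hat"),
    (3, "Jeans"), (3, "Food"),
    (4, "Food"), (4, "Water"),
    (5, "Water"), (5, "Food"), (5, "Beer"),
]

# Group products by customer, keeping the order in which they appear.
# (Python dicts preserve insertion order since version 3.7.)
baskets = {}
for customer_id, product in rows:
    baskets.setdefault(customer_id, []).append(product)

# One dash-separated line per customer, as in the list above.
relationships = ["-".join(products) for products in baskets.values()]

for line in relationships:
    print(line)
```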
How can I do this? With Apache Pig or with Spark?
Many thanks!!!
Topic market-basket-analysis scala apache-spark
Category Data Science