Aggregating transactional data for customer segmentation

Question

user1636588

November 12, 2021, 15:39

I have item-level transactional data in which each row represents one item bought by a customer in a transaction. If the same customer buys two different items in one transaction, there are two rows with the same values in the customer_id and transaction_id columns.

E.g.:

customer_id	transaction_id	item_bought	quantity
a	00001	cheese	2
b	00002	ham	1
b	00002	pepsi	2

In this case, customer b bought two items in the same transaction, so there are two rows with the same value in both the customer_id and transaction_id columns.

I want to be able to cluster customers based on the sorts of items that they buy and other factors such as the time of day that they purchase items.

To do this, do I have to aggregate the data so that each customer is represented by a single row, or is it possible to set up my model so that I don't have to aggregate the data? My concern is that I would also like to look at behaviour at the transaction_id level (e.g. this customer always buys a coffee in the morning and a pizza at night), and if I aggregate the data to the customer_id level I'll lose that detail.
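
For concreteness, here is a rough sketch of the two levels of aggregation I am comparing, using the three example rows above. It assumes pandas; the variable names (transactions, customer_items, transaction_items) are just placeholders for illustration, and a pivot table is only one possible way to do this.

```python
import pandas as pd

# Toy data copied from the example rows above.
transactions = pd.DataFrame({
    "customer_id": ["a", "b", "b"],
    "transaction_id": ["00001", "00002", "00002"],
    "item_bought": ["cheese", "ham", "pepsi"],
    "quantity": [2, 1, 2],
})

# Customer level: one row per customer, one column per item,
# with the total quantity bought as the value (0 if never bought).
customer_items = transactions.pivot_table(
    index="customer_id",
    columns="item_bought",
    values="quantity",
    aggfunc="sum",
    fill_value=0,
)

# Transaction level: one row per (customer, transaction) pair,
# which keeps the per-transaction detail I am worried about losing.
transaction_items = transactions.pivot_table(
    index=["customer_id", "transaction_id"],
    columns="item_bought",
    values="quantity",
    aggfunc="sum",
    fill_value=0,
)

print(customer_items)
print(transaction_items)
```

The first table is what I would feed to a clustering model if I aggregate to customer_id level; the second keeps one row per transaction, so a customer can appear in several rows.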

Topic machine-learning-model aggregation data-cleaning clustering machine-learning

Category Data Science
