In case that the number of items is quite small, turning the problem into a classification problem will be the most convenient solution. Use each item as a feature, the class as a concept. Now you can apply the methods that @OvisAmmon recommended on or other classification methods.
However, usually in market basket analysis there are many items and each user buy few of them. That leads to a very sparse matrix and most classification methods have problems dealing with them.
A common method to analyze market basket data is using
Association rules which are implemented by first locating frequent item sets.
You can use association rules on order to learn on market basket in the following way:
- Add item for every class
- Add the owner class to its basket (e.g, transform {milk, beer} basket of a student in to the basket {mlik, beer, student}
- Use a frequent item set algorithm to find them
- Remove all frequent item sets that don't have a class member in them. Such sets can't help deduce the class.
- From the frequent items set find rules that imply the classes (with good enough support and confidence
- Use the these rules that match a basket in order to predict class
You should note some points:
Some association rules with different implication might apply to the same basket. You might have the rules milk -> X and beer -Y and the basket {milk, beer}. If the set {milk, beer} is frequent enough you might have a rule specifically to it. Otherwise you'll need a met rule in order to base by the rules. Such rules might be taking the rule with the highest confidence or the most likely class.
Association rules are based only on the existence of item, not on the lack of them. If students never buy diapers, you will not find such association rule. You can add "no diapers" as an item to your data set but that will turn it from a sparse one to a full one and the storage might be unfeasible.
The complexity of association rules is exponential. The is simply since the output it self is exponential , as can be noted when you have a Clique in the data.
Usually people don't try to find all association rules due to that problem, which is the practical thing to do.
However, it means that there might be association rules in your data that you won't find.