How can deep learning be applied to association rule mining?

Association rule mining is considered to be an old technique of AI. Rules are mined on statistical support. How can deep learning be applied to this? What are approaches for structured data (in a graph format like XML)? XML documents are structured by tags. My goal is to extract a rule that says that tag x is often combined with tag y and z. Then, I later want to apply these rules and if a tag y and z is …
Category: Data Science

Which data mining or machine learning algorithm would be appropriate for learning ordered frequent patterns?

I have a dataset of the form (var1, var2, out), where the ordered pair <var1, var2> gives out. Most frequent pattern mining algorithms, such as Apriori and FP-growth, do not preserve the order of var1 and var2. Which pattern mining algorithms (possibly an NN trick) are available to find association rules between the ordered pair <var1, var2> and the output variable out? Thanks.
Category: Data Science

How to create group IDs for people in longitudinal data

I have a large data set which contains individuals and the address where they live. I want to create a group ID based on shared addresses (the working idea: people who share the same address can be considered part of the same family/household). From that household ID, my PI wants to investigate household/family migration over time due to cost-of-living increases/decreases. However, the difficulty is that the dataset/analysis is longitudinal. So we have this data set spanning multiple consecutive …
Category: Data Science

Association Rule Mining across two market baskets

I am quite familiar with Association Rule mining but I need to use it to associate ACROSS two market baskets instead of finding support WITHIN a market basket. Imagine customers come to a Store A and buy a certain number of products. The same customers go to Store B and buy another set of products. I want to associate between the two Stores and not within the Store. So I want to make "A --> B" statements like "Customers that …
Category: Data Science

What is the Zhang's rule?

I'd been doing some reading on Association Rule Mining and bumped into a Kaggle dataset where a competitor had applied Zhang's rule. I would like to know what it is. I tried to look for it online, and most of the hits revolve around some Chinese emperor by that name who ruled China. The other results aren't really relevant. If there is anything that you can share about it, like its significance, that'd be great. There's also no tag …
Category: Data Science
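For reference, a minimal sketch, assuming the question refers to Zhang's metric, an interestingness measure for a rule A → B that ranges from -1 (A and B never occur together) through 0 (independence) to +1 (perfect association). The function name and the support values used below are illustrative and are not taken from the Kaggle notebook mentioned in the question.

```python
def zhangs_metric(supp_a: float, supp_b: float, supp_ab: float) -> float:
    """Zhang's metric for the rule A -> B, computed from supports.

    supp_a, supp_b: fraction of transactions containing A, respectively B.
    supp_ab: fraction of transactions containing both A and B.
    """
    numerator = supp_ab - supp_a * supp_b
    denominator = max(supp_ab * (1 - supp_a), supp_a * (supp_b - supp_ab))
    if denominator == 0:
        # Degenerate case (e.g. A is in every transaction).
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return numerator / denominator


# Illustrative support values (hypothetical, not from any real dataset).
independent = zhangs_metric(0.5, 0.4, 0.2)   # supp_ab equals supp_a * supp_b
always_both = zhangs_metric(0.3, 0.3, 0.3)   # A and B always occur together
never_both = zhangs_metric(0.3, 0.4, 0.0)    # A and B never occur together
```

Unlike lift, the value is bounded, which makes rules with very different supports easier to compare.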

Association rules for classification

I'm working on a classification project. I have many rows containing many binary attributes, some of which often appear together, exactly like what we encounter in the Market Basket problem (in which you can, for example, identify that if you buy 'Milk' at a supermarket, you also have a better-than-random chance of buying 'Eggs'). My idea is then to take my target as an attribute, extract the best item-set containing my target (so having Target=1, exactly like …
Category: Data Science

Find optimal feature combinations and ordering for a multi-class classification problem

We have a multi-class classification problem where the training data looks as follows: name A B C brand Snickers Ltd company huge sales Snickers Acme Intl office stationary commercial Acme Davidoff cigars big Davidoff Max Car Company car repair small garage MaxAuto As can be seen we have one free text feature column(name) and several categorical feature columns that may be empty. Brand has to be predicted. The categorical features have a large (1000+) number of possible values. The above …
Category: Data Science

Confusion about an FP-tree growth numerical example

If fcam=3, would the conditional FP-tree still be c=4, or what would it be? My guess is that it should be f=3, c=4, a=3, m=3. Am I right? Or what else would it be? Please guide. Also, in the figure below (the same figure but more elaborated), how is the conditional FP-tree of p equal to c=3? My reasoning: the conditional pattern base of p is fcam=2 and cb=1. fcam lies in one branch whereas cb lies in another branch.
Category: Data Science

Create clusters based on specific keywords

I am working on raw text data. I am using clustering to put together common words in the documents. My requirement is to create clusters based on a specific list of words, i.e., I want to get a group of words that are typically found with the user-given list of words. Visually, the clusters should look like the image below. Typically, clustering techniques focus on creating segregated clusters, while I need segregated clusters with some overlap. The image shows the …
Category: Data Science

Association rules: Find the recipe for a list of ingredients

Assume I have a big database of recipes. For each recipe I have a list of ingredients. Now I want to find all association rules of the form (ingredient₁, ingredient₂, …) → recipe. Is the Apriori algorithm suitable for my problem? As far as I understand, the Apriori algorithm is intended to find rules like X → Y where X and Y are subsets of the same superset. But in my case, X and Y are subsets …
Category: Data Science

Association mining: which of the statements appears to be true?

I have a problem. I don't know how I could explain which of the three statements appears to be true, or whether my calculation below is invalid. The following table summarizes the results of a medical survey where two groups of people were observed: one group consists of people who regularly drink tea, but no coffee. The people in the other group drink coffee but no tea. It was observed which of the people had good teeth and which ones …
Category: Data Science

Association Mining - is buying Independent?

I have a problem. I cannot solve this exercise. What is the best way to solve it? What are the approaches for this kind of exercise? The following table summarizes transactions in a supermarket where customers bought tomatoes and/or mozzarella cheese or neither. Is buying mozzarella independent of buying tomatoes in the data given above? If they are not independent, explain whether they are positively or negatively correlated, i.e. does buying one of them increase or decrease the …
Category: Data Science
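One common way to approach this kind of independence question is to compute the lift from the 2x2 table of counts: a lift of 1 indicates independence, above 1 positive correlation, below 1 negative correlation. The sketch below is only an illustration; the table from the exercise is not reproduced on this page, so the counts are hypothetical placeholders and the function name is an assumption.

```python
def lift_from_counts(n_both: int, n_a_only: int, n_b_only: int, n_neither: int) -> float:
    """Lift of items A and B from a 2x2 contingency table of transaction counts.

    lift = P(A and B) / (P(A) * P(B)), written with counts to avoid rounding.
    """
    total = n_both + n_a_only + n_b_only + n_neither
    n_a = n_both + n_a_only   # transactions containing A
    n_b = n_both + n_b_only   # transactions containing B
    if n_a == 0 or n_b == 0:
        raise ValueError("lift is undefined when an item never occurs")
    return (n_both * total) / (n_a * n_b)


# Hypothetical counts (NOT the numbers from the exercise).
independent_case = lift_from_counts(20, 30, 20, 30)  # P(A)=0.5, P(B)=0.4, P(AB)=0.2
positive_case = lift_from_counts(30, 20, 10, 40)     # the items co-occur more than expected
```

With the real table, substitute the four counts: both items, mozzarella only, tomatoes only, neither.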

Association rule mining: is this association rule strong?

I have a problem. I cannot solve this exercise. What is the best way to solve it? What are the approaches for this kind of exercise? The following table summarizes transactions in a supermarket where customers bought tomatoes and/or mozzarella cheese or neither. We study the association "mozzarella => tomatoes" (with the idea that many people like to eat mozzarella with tomatoes (plus fresh basil plus olive oil - yum!)) and assume a minimum support threshold of 25% …
Category: Data Science
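A rule is usually called strong when both its support and its confidence reach the chosen minimum thresholds. The sketch below illustrates that check under stated assumptions: the 25% minimum support comes from the excerpt, while the minimum confidence is cut off in the excerpt, so it is left as a required parameter, and the counts in the usage lines are hypothetical.

```python
def is_strong_rule(n_both: int, n_antecedent: int, n_total: int,
                   min_support: float, min_confidence: float) -> bool:
    """Check whether 'antecedent => consequent' meets both thresholds.

    n_both: transactions containing antecedent and consequent.
    n_antecedent: transactions containing the antecedent.
    n_total: all transactions.
    """
    if n_total == 0 or n_antecedent == 0:
        return False
    support = n_both / n_total            # P(antecedent and consequent)
    confidence = n_both / n_antecedent    # P(consequent | antecedent)
    return support >= min_support and confidence >= min_confidence


# Hypothetical counts; 0.5 is a placeholder confidence threshold.
strong = is_strong_rule(30, 40, 100, min_support=0.25, min_confidence=0.5)
too_rare = is_strong_rule(10, 12, 100, min_support=0.25, min_confidence=0.5)
```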

Association rule mining for continuous variables

I'm trying to study the relationships between several numerical variables, e.g. electricity generation between different stations at 30-minute intervals over several months. My data has the format (table omitted). I want to find the relationships between station1 and station2, station1 and station3, etc., and I'm trying to look for a more sophisticated method than correlation matrices. If necessary, I can also add columns showing time of day, day of week and other time variables extracted from the datetime column. Association rule mining …
Category: Data Science

What Python package or algorithm can be used to find a pattern like "buy A, then not buy B", i.e. the opposite of an association rule?

We know association rule mining can find patterns like "bread --> milk", meaning that if you buy bread, you will probably also buy milk. But is there a Python package or algorithm which can find patterns like "buy coke, then not buy pepsi"? In other words, if you buy A, then you will probably not buy B.
Category: Data Science
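One simple way to quantify such a pattern is the confidence of the negative rule A → not B, which equals 1 minus the confidence of A → B. The following is a minimal sketch in plain Python; the function name and the toy transactions are made up for illustration only.

```python
from typing import Iterable, Optional, Set


def negative_confidence(transactions: Iterable[Set[str]], a: str, b: str) -> Optional[float]:
    """Confidence of the negative rule 'a -> not b'.

    Returns the fraction of transactions containing a that do NOT contain b,
    or None if a never occurs.
    """
    with_a = [t for t in transactions if a in t]
    if not with_a:
        return None
    without_b = sum(1 for t in with_a if b not in t)
    return without_b / len(with_a)


# Toy data, purely illustrative.
toy_transactions = [
    {"coke", "chips"},
    {"coke"},
    {"pepsi", "chips"},
    {"coke", "pepsi"},
    {"pepsi"},
]
conf_coke_not_pepsi = negative_confidence(toy_transactions, "coke", "pepsi")
```

Another option, under the same assumptions, is to add an explicit "not_B" item to every transaction that lacks B and then run an ordinary frequent itemset miner, although this can inflate the number of candidate itemsets considerably.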

How to generate more market basket association rules for products with smaller basket sizes?

I'm working with data where many customers only buy 1-3 products at a time, meaning that there aren't enough products being purchased together for the market basket algorithm to determine associations. Any idea how I can get around this? I'm thinking of grouping transactions together by week or month to get larger basket sizes, but I'm skeptical of that approach since customers can place many orders in a week that have nothing to do with each other.
Category: Data Science

Association rules - Find 100% confidence rules

Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also numbered 1 to 100. Item i is in basket b if and only if i divides b with no remainder. Thus, item 1 is in all the baskets, item 2 is in all fifty of the even-numbered baskets, etc. For example, basket 12 consists of items {1, 2, 3, 4, 6, 12}. (a) Describe all the association rules that have 100% confidence. Give an example. I'm …
Category: Data Science
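The setup in this exercise is small enough to check by brute force. The sketch below builds the 100 baskets exactly as described and lists the single-item rules {i} → {j} with 100% confidence; it is an exploratory aid rather than a full answer, since rules with larger antecedents are not enumerated. The names `baskets`, `confidence` and `full_confidence_pairs` are illustrative.

```python
N = 100

# Basket b contains item i if and only if i divides b.
baskets = {b: {i for i in range(1, N + 1) if b % i == 0} for b in range(1, N + 1)}


def confidence(antecedent: set, consequent: set):
    """Fraction of baskets containing the antecedent that also contain the consequent."""
    covering = [items for items in baskets.values() if antecedent <= items]
    if not covering:
        return None  # the antecedent never occurs, so confidence is undefined
    hits = sum(1 for items in covering if consequent <= items)
    return hits / len(covering)


# All single-item rules {i} -> {j} with 100% confidence.
full_confidence_pairs = [
    (i, j)
    for i in range(1, N + 1)
    for j in range(1, N + 1)
    if i != j and confidence({i}, {j}) == 1.0
]
```

Inspecting the result shows that {i} → {j} has 100% confidence exactly when j divides i, for example {12} → {6}, while {6} → {12} does not.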

Apriori algorithm with tags

In the Apriori algorithm, we can create association rules based on the frequencies in the corresponding data set. My question is: what if we have tag data in addition to the transaction data? For example, take a transaction data set like:

TransactionID  ItemGroups
1              A,C,D
2              B,C,E
3              A,B,C,E
4              B,E

It is easy to set support, lift and confidence to reach an association rule with the toy data above. Now consider we have additional item-tag data like this: …
Category: Data Science

Given the product of the first purchase, what will the customer buy on her next purchase?

Say for example I have the following table of historical purchases:

CustomerID  OrderID     ProductType  MonthsBetweenPurch  OrderNumber
----------------------------------------------------------------------
130293699   1013448571  Womenswear   0                   1
130293699   1013448571  Tops         0                   1
130293699   1013574573  Tops         0                   1
130293699   1013448577  Sweaters     1.5                 2
130194668   1013735788  Tops         0                   1
130194668   1013445564  Accessories  1                   2
130194668   1013448575  Sweaters     0                   1
130675885   1013426777  Tops         2                   2
130675885   1013441869  Underwear    0                   1
130675885   1013448444  Sweaters     0                   1
130675885   1013448444  Accessories  0                   1
130675885   1013448444  Accessories  0                   1

where the MonthsBetweenPurch …
Category: Data Science

Association Rules

I'm using association rules for a project and noticed that there is a dearth of papers in the last 10-15 years on the topic although it seemed really popular 15-25 years ago. Is there any reason why?
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.