Association mining: which of the statements appears to be true?

I have a problem: I don't know how to explain which of the three statements below appears to be true, or whether my calculation is invalid.

The following table summarizes the results of a medical survey where two groups of people were observed: one group consists of people who regularly drink tea, but no coffee. The people in the other group drink coffee but no tea. It was observed which of the people had good teeth and which ones had bad teeth.

We study the association of the attributes Tea and Good Teeth and assume a minimum support threshold of 40% and a minimum confidence threshold of 70%.

Calculation

1. Drink tea => good teeth
Support(drink tea => good teeth) = 0.4
Confidence(drink tea => good teeth) = 0.8
Lift(drink tea => good teeth) = 1.33

2. Good teeth => drink tea
Support(good teeth => drink tea) = 0.4
Confidence(good teeth => drink tea) = 0.67
Lift(good teeth => drink tea) = 1.33
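
The three measures can be cross-checked from the marginal probabilities alone. The sketch below assumes P(tea) = 0.5 and P(good teeth) = 0.6, which are the values implied by the support of 0.4 and the confidences quoted above (the survey table itself is not reproduced here). The function name `rule_metrics` is made up for this illustration. Note that lift is symmetric in the two sides of a rule, so both directions must give the same lift, while confidence depends on the direction.

```python
def rule_metrics(p_a, p_b, p_ab):
    """Support, confidence and lift of the rule A => B.

    p_a  : P(A), probability of the antecedent
    p_b  : P(B), probability of the consequent
    p_ab : P(A and B), joint probability
    """
    support = p_ab
    confidence = p_ab / p_a          # P(B | A)
    lift = p_ab / (p_a * p_b)        # symmetric in A and B
    return support, confidence, lift


# ASSUMED marginals, implied by the figures in the question.
P_TEA = 0.5
P_GOOD = 0.6
P_TEA_AND_GOOD = 0.4

MIN_SUPPORT = 0.4
MIN_CONFIDENCE = 0.7

tea_to_good = rule_metrics(P_TEA, P_GOOD, P_TEA_AND_GOOD)
good_to_tea = rule_metrics(P_GOOD, P_TEA, P_TEA_AND_GOOD)

for name, (sup, conf, lift) in [("tea => good teeth", tea_to_good),
                                ("good teeth => tea", good_to_tea)]:
    passes = sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE
    print(f"{name}: support={sup:.2f} confidence={conf:.2f} "
          f"lift={lift:.2f} passes thresholds={passes}")
```

Under these assumptions only the first rule meets both thresholds; the second has enough support but its confidence falls below 70%.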

Now assume that you discover two more studies with the following information:

Consider the following statements:

  1. Tea improves dental health.
  2. Coffee improves dental health.
  3. No conclusion can be made whether tea or coffee influence dental health.

Please explain which of the three statements appears to be true. In particular, explain what this says about your results from the part above. Does it mean they are invalid?

Thinking

I think it is statement 1, because 80% of the people who drink tea have good teeth (confidence 0.8).
And my results are not invalid.

But I am missing a good explanation with a small calculation.


"None of the above (3)"

The framing of the expected/possible results is not correct. Two important things to keep in mind:

  1. Whenever you are doing such retrospective studies or association mining, you are looking for an association, correlation or relationship (not causality in the first place).
  2. You should be comparing the relative effect of tea versus coffee on good dental health, not making deterministic statements about the use of one of them.

Once you have taken these two points into consideration, whatever results from your experiment or calculation criteria is your answer.

For example, if your support and confidence criteria are met when performing association mining, you can make a statement like, "people who drink only tea are likely to have better dental health when compared with those who drink only coffee."
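
A minimal sketch of such a relative comparison follows. It assumes P(tea) = 0.5, P(good teeth) = 0.6 and P(tea and good teeth) = 0.4, which are the values implied by the figures in the question, and that every person belongs to exactly one of the two groups, as the survey description states. The variable names are illustrative only.

```python
# ASSUMED probabilities, implied by the question's support and confidence.
p_tea = 0.5
p_good = 0.6
p_tea_and_good = 0.4

# Everyone is in exactly one group, so the coffee figures follow by subtraction.
p_coffee = 1 - p_tea
p_coffee_and_good = p_good - p_tea_and_good

# Share of each group with good teeth (the confidence of "drink => good teeth").
good_given_tea = p_tea_and_good / p_tea
good_given_coffee = p_coffee_and_good / p_coffee

# How many times more common good teeth are among tea drinkers.
relative_rate = good_given_tea / good_given_coffee

print(f"P(good teeth | tea)    = {good_given_tea:.2f}")
print(f"P(good teeth | coffee) = {good_given_coffee:.2f}")
print(f"ratio tea / coffee     = {relative_rate:.2f}")
```

The ratio describes an association between the groups only; it says nothing about why the groups differ.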

PS: This question belongs on the stats Stack Exchange rather than here.


Statement #3 - "No conclusion can be made whether tea or coffee influence dental health." is the most useful interpretation of the studies.

"Correlation is not causation." Those studies show the relationship, but it is not clear what drives the effect. Maybe having a certain teeth quality drives beverage choice. Or other variables that have not been measured (e.g., location or age) are influencing the observed relationships.
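
To illustrate how an unmeasured variable could produce such a pattern, here is a small sketch with invented counts. The numbers are purely hypothetical and are not taken from any of the studies; they are chosen so that tea drinkers look better overall while coffee drinkers look better within each age group.

```python
# HYPOTHETICAL counts, invented for illustration only.
# Each entry is (people with good teeth, people in the cell).
counts = {
    "young": {"tea": (78, 90), "coffee": (9, 10)},
    "old":   {"tea": (2, 10),  "coffee": (31, 90)},
}

# Rate of good teeth per drink within each age group.
per_group = {
    group: {drink: good / total for drink, (good, total) in cells.items()}
    for group, cells in counts.items()
}

# Rate of good teeth per drink when age is ignored (pooled over groups).
pooled = {}
for drink in ("tea", "coffee"):
    good = sum(counts[group][drink][0] for group in counts)
    total = sum(counts[group][drink][1] for group in counts)
    pooled[drink] = good / total

for group, rates in per_group.items():
    print(f"{group}: tea={rates['tea']:.2f} coffee={rates['coffee']:.2f}")
print(f"pooled: tea={pooled['tea']:.2f} coffee={pooled['coffee']:.2f}")
```

In this made-up data the pooled comparison favours tea, yet within both age groups coffee drinkers have the higher rate, because most tea drinkers are young and most coffee drinkers are old. A pooled association rule cannot distinguish this situation from a real effect of the drink.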

One option to directly evaluate statements #1 and #2 is an experiment where people are randomly assigned to drink a beverage and dental health is measured.
