Linear discriminant analysis in R: how to choose the most suitable model?

Question

Helen

June 24, 2020, 16:38

The data set vaso in the robustbase library summarizes the vasoconstriction (or not) of subjects’ fingers along with their breathing volumes and rates.

 head(vaso)
 Volume  Rate Y
1   3.70 0.825 1
2   3.50 1.090 1
3   1.25 2.500 1
4   0.75 1.500 1
5   0.80 3.200 1
6   0.70 3.500 1

I want to perform a linear discriminant analysis in R to see how well these variables distinguish between the two groups. I am considering two cases:

library(MASS)        # provides lda()
library(robustbase)  # provides the vaso data set

ld <- lda(Y ~ ., data = vaso)

ld1 <- lda(Y ~ log(Volume) + log(Rate), data = vaso)

Please help me understand which model is better. What characteristics should I look at?

Topic lda-classifier discriminant-analysis r

Category Data Science

Erwan · Accepted Answer · June 24, 2020, 16:38

I'm not familiar with LDA, but as far as I know you're not really changing the "model" (i.e. the way the impact is measured) between the two versions; what you're changing is the features. In the second version, instead of looking at whether the value of a feature affects Y, you look at whether the log of that value affects Y. The first version is the most natural way to look at the features. The second is also common, but it is usually used when we already know that the distribution of the feature (or the relationship between the feature and the response variable) makes a log transform relevant.
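One possible way to compare the two feature sets in practice is to check how well each one classifies subjects it was not fitted on. The sketch below is only an illustration, not a definitive recipe: it assumes the MASS and robustbase packages are installed, and it uses the leave-one-out cross-validation built into lda() (the CV = TRUE argument) as the comparison criterion, which is my own choice here. The names cv_raw, cv_log, acc_raw and acc_log are made up for the example.

```r
library(MASS)        # lda()
library(robustbase)  # vaso data set
data(vaso)

# With CV = TRUE, lda() returns leave-one-out predictions (class, posterior)
# instead of a fitted model object.
cv_raw <- lda(Y ~ ., data = vaso, CV = TRUE)
cv_log <- lda(Y ~ log(Volume) + log(Rate), data = vaso, CV = TRUE)

# Observed labels as character, so they can be compared with the predicted factor
observed <- as.character(vaso$Y)

# Share of subjects whose held-out prediction matches the observed label
acc_raw <- mean(as.character(cv_raw$class) == observed)
acc_log <- mean(as.character(cv_log$class) == observed)

# Confusion tables: rows are observed groups, columns are predicted groups
print(table(observed = observed, predicted = cv_raw$class))
print(table(observed = observed, predicted = cv_log$class))

print(c(raw = acc_raw, log = acc_log))
```

The feature set with the higher cross-validated accuracy separates the two groups better on this data. With so few subjects, a small difference between the two numbers should not be over-interpreted.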
