Linear discriminant analysis in R: how to choose the most suitable model?

The data set vaso in the robustbase library summarizes the vasoconstriction (or not) of subjects’ fingers along with their breathing volumes and rates.

 head(vaso)
  Volume  Rate Y
1   3.70 0.825 1
2   3.50 1.090 1
3   1.25 2.500 1
4   0.75 1.500 1
5   0.80 3.200 1
6   0.70 3.500 1

I want to perform a linear discriminant analysis in R to see how well Volume and Rate distinguish between the two groups. I am considering two models:

library(MASS)  # provides lda()
ld <- lda(Y ~ ., data = vaso)
ld1 <- lda(Y ~ log(Volume) + log(Rate), data = vaso)

Which of the two models is better, and which characteristics should I look at to decide?



I'm not familiar with LDA, but as far as I know you're not really changing the "model" (i.e. the way the impact on Y is measured) between the two versions. What you're changing is the features: in the second version, instead of asking whether the value of a feature affects Y, you ask whether the log of that value affects Y. The first version is the most natural way to look at the features. The second is also common, but it is usually applied when we already know that the distribution of the feature (for instance, a strongly skewed one) or its relation to the response variable makes the transformation relevant.
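One practical way to compare the two feature sets is to check how well each one predicts observations it was not fitted on. The sketch below is only an illustration, not a definitive procedure: it assumes the MASS and robustbase packages are installed, and the names cv_raw, cv_log, acc_raw and acc_log are made up for this example. It relies on the CV = TRUE argument of lda(), which returns leave-one-out predictions instead of a fitted model.

```r
library(MASS)        # provides lda()
library(robustbase)  # provides the vaso data set

data(vaso)
vaso$Y <- factor(vaso$Y)  # treat the response as a class label

# CV = TRUE: lda() returns leave-one-out predictions ($class, $posterior)
# rather than a fitted model object.
cv_raw <- lda(Y ~ Volume + Rate, data = vaso, CV = TRUE)
cv_log <- lda(Y ~ log(Volume) + log(Rate), data = vaso, CV = TRUE)

# Confusion tables: true class vs. held-out prediction
print(table(truth = vaso$Y, predicted = cv_raw$class))
print(table(truth = vaso$Y, predicted = cv_log$class))

# Leave-one-out accuracy for each feature set
acc_raw <- mean(cv_raw$class == vaso$Y)
acc_log <- mean(cv_log$class == vaso$Y)
print(c(raw = acc_raw, log = acc_log))
```

The feature set with the higher leave-one-out accuracy (and the more balanced confusion table) would be the more suitable one by this criterion. With a data set this small, a difference of one or two observations should not be over-interpreted.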
