Relationships between groups of features against independent variables
I have several groups of features that I'd like to test against independent variables. The idea is to find which groups tend to be associated with a specific value of an independent variable.
Let's take the following example, where s are samples, f are features, and i are independent variables associated with each s.
      s1    s2    s3    s4    ....
f1    0.3   0.9   0.7   0.8
f2    ...
f3    ...
f4    ...
f5    ...
i1    low   low   med   high
i2    0.9   1.6   2.3   10.5
Features f1, f2, and f3 belong to group1, and f4 and f5 belong to group2. If I wanted to find whether a given feature tends to be associated with a given independent variable, I could regress each feature against i2, or against an encoded i1, and test whether there is an association between the feature and the independent variable.
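To make the per-feature check concrete, here is a minimal sketch in Python with NumPy. The data are simulated and purely hypothetical (f1 is built to depend on i2, f4 is built to be unrelated), samples are stored as array entries rather than table columns, and a permutation test stands in for the usual t-test on the slope so that nothing beyond NumPy is needed. The helper name `slope_perm_test` is my own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 40 samples, the numeric variable i2 and two features.
n = 40
i2 = rng.normal(size=n)
f1 = 0.8 * i2 + rng.normal(scale=0.5, size=n)  # built to depend on i2
f4 = rng.normal(size=n)                        # built to be unrelated to i2


def slope_perm_test(feature, target, n_perm=2000):
    """Slope of the fit feature ~ target, with a two-sided permutation p-value."""
    def slope(y, x):
        # np.polyfit returns [slope, intercept] for a degree-1 fit.
        return np.polyfit(x, y, 1)[0]

    observed = slope(feature, target)
    # Shuffling the feature breaks any link to the target (the null hypothesis).
    permuted = np.array(
        [slope(rng.permutation(feature), target) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(np.abs(permuted) >= abs(observed))) / (n_perm + 1)
    return observed, p_value


slope_f1, p_f1 = slope_perm_test(f1, i2)
slope_f4, p_f4 = slope_perm_test(f4, i2)
print(f"f1 vs i2: slope={slope_f1:.2f}, p={p_f1:.4f}")
print(f"f4 vs i2: slope={slope_f4:.2f}, p={p_f4:.4f}")
```

The same function could be applied to an ordinal encoding of i1 (for example low=0, med=1, high=2), although that encoding assumes the levels are equally spaced, which may not hold.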
But now I'm wondering, is it possible to test whether a group of features tends to be associated with an independent variable? I'm not sure how to approach this problem.
One idea is to test each independent variable against all features in each group using multiple linear regression. Each model would contain only the features of a single group, so in this case we would have $2 \times 2 = 4$ models in total (group1 and group2, each fitted against the 2 independent variables).
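A rough sketch of that idea in Python with NumPy follows. Everything here is an assumption for illustration: the data are simulated (only f1 and f2 are built to depend on i2), i1 is an ordinal 0/1/2 code derived from i2 tertiles, and the overall F statistic of each model is assessed with a permutation test rather than the classical F-distribution p-value. The helper names `f_statistic` and `group_test` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60

# Made-up data, samples as rows (the transpose of the table in the question).
i2 = rng.normal(size=n)
features = {
    "f1": 0.7 * i2 + rng.normal(scale=0.6, size=n),
    "f2": -0.5 * i2 + rng.normal(scale=0.6, size=n),
    "f3": rng.normal(size=n),
    "f4": rng.normal(size=n),
    "f5": rng.normal(size=n),
}
# Ordinal code for i1 (0=low, 1=med, 2=high), cut at the tertiles of i2.
i1 = np.digitize(i2, np.quantile(i2, [1 / 3, 2 / 3])).astype(float)

groups = {"group1": ["f1", "f2", "f3"], "group2": ["f4", "f5"]}
targets = {"i1": i1, "i2": i2}


def f_statistic(X, y):
    """Overall F statistic of the OLS fit y ~ intercept + columns of X."""
    n_obs, k = X.shape
    A = np.column_stack([np.ones(n_obs), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)    # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return ((tss - rss) / k) / (rss / (n_obs - k - 1))


def group_test(X, y, n_perm=1000):
    """Observed F and a permutation p-value (shuffling y breaks the link to X)."""
    observed = f_statistic(X, y)
    permuted = np.array(
        [f_statistic(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    return observed, (1 + np.sum(permuted >= observed)) / (n_perm + 1)


# One model per (group, independent variable) pair: 2 x 2 = 4 models.
results = {}
for group_name, names in groups.items():
    X = np.column_stack([features[name] for name in names])
    for target_name, y in targets.items():
        results[(group_name, target_name)] = group_test(X, y)

for key, (f_val, p_val) in results.items():
    print(key, f"F={f_val:.2f}", f"p={p_val:.4f}")
```

Since this fits four models, the p-values would probably need a multiple-testing correction (Bonferroni, for instance) before any group is declared associated with a variable.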
I have a feeling that this could also be formulated as a classification problem, but I'm not really sure how.
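One possible framing, sketched below under heavy assumptions: treat the levels of i1 as class labels, try to predict them from the features of a single group, and compare the cross-validated accuracy with what is obtained after shuffling the labels. The classifier here is a simple nearest-centroid rule with leave-one-out validation, chosen only because it needs nothing beyond NumPy; the data are simulated (group1 is built to shift with the class, group2 is pure noise) and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60

# Hypothetical i1 codes: 0=low, 1=med, 2=high.
labels = rng.integers(0, 3, size=n)
# group1 features shift with the class; group2 features are unrelated noise.
group1 = rng.normal(size=(n, 3)) + 1.5 * labels[:, None]
group2 = rng.normal(size=(n, 2))


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    classes = np.unique(y)
    hits = 0
    for i in range(len(y)):
        train = np.ones(len(y), dtype=bool)
        train[i] = False  # hold out sample i
        centroids = np.array(
            [X[train & (y == c)].mean(axis=0) for c in classes]
        )
        nearest = np.argmin(np.linalg.norm(centroids - X[i], axis=1))
        hits += int(classes[nearest] == y[i])
    return hits / len(y)


def accuracy_perm_test(X, y, n_perm=200):
    """Observed accuracy and a permutation p-value from shuffled labels."""
    observed = loo_accuracy(X, y)
    permuted = np.array(
        [loo_accuracy(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    return observed, (1 + np.sum(permuted >= observed)) / (n_perm + 1)


acc1, p1 = accuracy_perm_test(group1, labels)
acc2, p2 = accuracy_perm_test(group2, labels)
print(f"group1: accuracy={acc1:.2f}, p={p1:.3f}")
print(f"group2: accuracy={acc2:.2f}, p={p2:.3f}")
```

A group whose accuracy is clearly above the shuffled-label baseline would then be a candidate for being associated with i1. A continuous variable such as i2 would have to be binned first to be used this way, which discards information.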