Why are correlation matrices used versus a matrix of R^2 values?
I'm relatively new to DS, so forgive me if this is a dumb question or in the wrong forum.
When evaluating features, it seems that almost everywhere a correlation matrix is used (`df.corr()` in pandas, `cor(df, method = "pearson")` in R).
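
For concreteness, here's roughly what I mean by a correlation matrix (the DataFrame and column names are just a toy example I made up):

```python
import pandas as pd

# Toy DataFrame purely for illustration -- columns are hypothetical features
df = pd.DataFrame({
    "sqft":     [850, 1200, 1500, 2100, 2500],
    "bedrooms": [2, 3, 3, 4, 5],
    "price":    [200_000, 280_000, 310_000, 450_000, 520_000],
})

# Pairwise Pearson correlations between each feature and every other feature
corr = df.corr(method="pearson")
print(corr)
```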
The way I understand it is that a correlation matrix describes the strength and direction of the linear relationship (strong negative through strong positive) between each feature/predictor and all the others.
HOWEVER
If $R^2$ indicates the proportion of variability explained by the linear relationship between each pair of features/predictors, wouldn't that provide more information for model selection or feature engineering?
THEREFORE
Would it make sense to always square the correlation matrix (element-wise) to get the $R^2$ values, rather than reviewing the correlation matrix itself?
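
By "squaring" I mean something like the sketch below, continuing from the toy DataFrame above (my understanding is that each pairwise $r^2$ equals the $R^2$ of a simple one-predictor linear regression; please correct me if that's wrong):

```python
# Element-wise square of the Pearson correlation matrix.
# For a pair of variables, r**2 is the R^2 of a simple linear regression
# of one on the other, i.e. the proportion of variance explained by
# that pairwise linear fit.
r_squared = df.corr(method="pearson") ** 2
print(r_squared)
```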
How relevant is it to model selection or feature engineering to know whether a correlation among features is positive or negative?