What is the Purpose of Feature Selection?
I have a small medical dataset (200 samples) that contains only 6 cases of the condition I am trying to predict using machine learning. So far, the dataset has not proven useful for predicting the target variable: my models achieve 0% recall and precision, probably because the dataset is so small and contains so few positive cases.
Nevertheless, in order to learn something from the dataset, I applied Feature Selection techniques to deduce which features are useful in predicting the target variable, and to see whether this supports or contradicts previous literature on the matter.
When I reran my models on the reduced feature set, they still produced 0% recall and precision, so the prediction performance has not improved. However, the features returned by applying Feature Selection have given me more insight into the data.
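To make the setup concrete, here is a rough sketch of the kind of pipeline I am describing (assuming scikit-learn; SelectKBest with an F-test and a logistic regression are only stand-ins for the actual Feature Selection method and classifier I used, and the generated data only mimics the 200-sample, 6-positive shape of my dataset):

```python
# Rough sketch of the workflow, assuming scikit-learn.
# SelectKBest(f_classif) and LogisticRegression are hypothetical stand-ins
# for my actual Feature Selection method and model; make_classification
# stands in for the real medical data.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Toy data mimicking the shape of the problem: 200 samples, very few positives.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           weights=[0.97, 0.03], random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Feature Selection: keep the k features with the highest ANOVA F-score.
selector = SelectKBest(score_func=f_classif, k=5)
X_train_sel = selector.fit_transform(X_train, y_train)
X_test_sel = selector.transform(X_test)
print("selected feature indices:", selector.get_support(indices=True))

# Refit the model on the reduced feature set and re-check precision/recall.
clf = LogisticRegression(max_iter=1000).fit(X_train_sel, y_train)
y_pred = clf.predict(X_test_sel)
print("precision:", precision_score(y_test, y_pred, zero_division=0))
print("recall:   ", recall_score(y_test, y_pred, zero_division=0))
```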
So my question is, is the purpose of Feature Selection:
- to improve prediction performance
- or can it be to identify the features relevant to the prediction and to learn more about the dataset?
So in other words, is Feature Selection just a tool for improved performance, or can it be an end in itself?
Also, if using the subset of features returned by Feature Selection does not improve the accuracy or recall of the model, how can I demonstrate that these features are indeed relevant to my prediction?
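For example, one thing I have considered is reporting univariate test statistics and p-values for the features, independent of model performance. A rough, self-contained sketch of that idea (again assuming scikit-learn and synthetic stand-in data) is below, but I am not sure whether this counts as demonstrating relevance:

```python
# Sketch of assessing relevance without relying on model performance:
# univariate ANOVA F-scores and p-values per feature.
# Assumes scikit-learn; X, y are synthetic stand-ins for my real data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif

X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           weights=[0.97, 0.03], random_state=0)

f_scores, p_values = f_classif(X, y)
for i in np.argsort(p_values)[:5]:  # five features with the smallest p-values
    print(f"feature {i}: F={f_scores[i]:.2f}, p={p_values[i]:.4f}")
```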
If you can link some resources about this issue, that would be very helpful.
Thank you.
Topic feature-selection python dimensionality-reduction machine-learning
Category Data Science