Feature reduction by removing certain columns in dataframe
I am working on an emotion recognition model using the IEMOCAP dataset. For feature extraction, I compute the mel-spectrogram of each sample, convert it into a NumPy array, and then convert the array into a dataframe of spectrogram features.
The generated dataframe has a shape of 2380 rows x 11761 columns, and looks like this:
            0         1         2         3         4         5         6         7  ...  11754  11755  11756  11757  11758  11759  11760  11761
262  0.036491  0.037793  0.041035  0.044644  0.047210  0.048467  0.049556  0.052137  ...    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
323  0.004577  0.004684  0.004951  0.005228  0.005357  0.005255  0.004969  0.004632  ...    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
680  0.003169  0.003221  0.003349  0.003490  0.003600  0.003682  0.003766  0.003860  ...    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
568  0.001942  0.001935  0.001934  0.001969  0.002071  0.002247  0.002456  0.002622  ...    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
769  0.002546  0.002483  0.002299  0.002050  0.001813  0.001661  0.001652  0.001793  ...    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
When I checked thoroughly, I found that many of the trailing columns contain only 0.0, except for a few rows that carry some information.
My question is: can I remove the columns that have fewer than a certain number of nonzero elements? Is dimensionality reduction possible this way? Please guide me through this.
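To make the idea concrete, here is a minimal sketch of the column filtering described above, assuming the features are in a pandas dataframe. The function name drop_sparse_columns, the parameter min_nonzero, and the toy data are hypothetical and only for illustration; the right threshold for the real 2380-row dataframe would have to be chosen by experiment.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, min_nonzero: int) -> pd.DataFrame:
    """Keep only the columns with at least `min_nonzero` nonzero entries.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Count nonzero values per column (axis=0 sums down the rows).
    nonzero_counts = (df != 0).sum(axis=0)
    # Boolean mask over columns: True means the column is kept.
    keep = nonzero_counts >= min_nonzero
    return df.loc[:, keep]


# Toy stand-in for the spectrogram dataframe (made-up values).
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.random((6, 5)) + 0.1)  # all entries strictly positive
toy.iloc[:, 3] = 0.0   # column 3: all zeros
toy.iloc[1:, 4] = 0.0  # column 4: a single nonzero row

reduced = drop_sparse_columns(toy, min_nonzero=2)
print(toy.shape, "->", reduced.shape)
```

This assumes the empty entries are exactly 0.0; if they are only approximately zero, the comparison could be replaced with something like np.isclose. Whether dropping such columns hurts accuracy depends on how much the few nonzero rows matter, which this sketch does not test.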
Topic feature-reduction feature-engineering
Category Data Science