Performing EDA on a dataset with missing features
I'm new to data science.
I want to perform EDA on a dataset with missing values. These are the per-feature missing-value counts for my train and test sets:
train:
Test_0        0
Test_1       31
Test_2        0
Test_3      141
Test_4        0
Test_5        0
Test_6        0
Test_7        0
Test_8     1045
Test_9        0
Test_10       0
Test_11       0
Test_12       0
Test_13       0
Test_14       0
Test_15    2967
Class         0
dtype: int64
test:
Test_0       0
Test_1       7
Test_2       0
Test_3      46
Test_4       0
Test_5       0
Test_6       0
Test_7       0
Test_8     279
Test_9       0
Test_10      0
Test_11      0
Test_12      0
Test_13      0
Test_14      0
Test_15    738
dtype: int64
I have 3616 rows in total in my train set and 905 in my test set. How can I decide which features to drop and which to fill in artificially, and how should I fill them? I have read a bit about mean filling, etc.
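To put the counts above in proportion, here is a minimal sketch in plain Python that turns them into missing fractions and shows what mean filling does to a single column. The 0.5 cut-off and the helper names (`missing_fractions`, `suggest`, `mean_fill`) are assumptions made up for illustration, not an established rule or a library API.

```python
# Nonzero missing-value counts copied from the train/test output above.
train_missing = {"Test_1": 31, "Test_3": 141, "Test_8": 1045, "Test_15": 2967}
test_missing = {"Test_1": 7, "Test_3": 46, "Test_8": 279, "Test_15": 738}
N_TRAIN, N_TEST = 3616, 905

# Hypothetical cut-off chosen only for this sketch; pick your own.
DROP_THRESHOLD = 0.5


def missing_fractions(counts, n_rows):
    """Convert per-column missing counts into fractions of the row count."""
    return {col: n / n_rows for col, n in counts.items()}


def suggest(fractions, threshold=DROP_THRESHOLD):
    """Label each column by comparing its missing fraction to the threshold."""
    return {
        col: "consider dropping" if frac > threshold else "consider imputing"
        for col, frac in fractions.items()
    }


def mean_fill(values):
    """Replace None entries with the mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


train_frac = missing_fractions(train_missing, N_TRAIN)
test_frac = missing_fractions(test_missing, N_TEST)

for col, frac in train_frac.items():
    print(f"{col}: {frac:.1%} missing in train, {suggest(train_frac)[col]}")

# Toy column (made-up numbers) to show the effect of mean filling.
print(mean_fill([1.0, None, 3.0]))
```

With these counts, Test_15 is missing in roughly 82% of the train rows, Test_8 in about 29%, and Test_1 and Test_3 in under 5% each, so under the assumed cut-off only Test_15 would be flagged for dropping.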
If anyone can also point me to a guide that explains these issues, I would appreciate it.
Thanks!
Topic exploratory-factor-analysis visualization data-cleaning
Category Data Science