Understanding a dataset (prior to applying ML models) with no metadata given

Question

Aditya Kadrekar

November 11, 2018, 08:43

How do you understand a dataset when no metadata is provided (that is, no description of its attributes)? The attribute names are hard to interpret because only abbreviations are given.

I have been told that 'pm2.5' is the target variable. How do I work out which independent variables affect it?

Topic data-analysis metadata dataset machine-learning

Category Data Science

arunppsg · Accepted Answer · November 11, 2018, 08:43

The aim is to predict pm2.5 (the target variable).

Step 1: Data cleaning. Remove features that are clearly not useful and fill in the missing values.

Step 2: To learn about the features, visualize the data. For example, draw a line plot of pm2.5 against TEMP and see how pm2.5 varies as the temperature changes.

Step 3: Next, examine how the features relate to each other and to pm2.5. Features that show no relationship with the target are not needed for prediction and can be removed.

Step 4: Apply a suitable machine learning technique (pm2.5 is a continuous value, so a regression model is the natural choice) and predict.
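As a rough illustration of Steps 1 and 3, the sketch below ranks columns by the strength of their Pearson correlation with pm2.5. It is only one possible way to screen features, not the method used in the answer. The toy rows are made up, the PRES and hour columns are hypothetical (only TEMP and pm2.5 appear in the question), and the helper names `pearson` and `rank_features` are invented for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target):
    """Rank columns by |correlation| with the target, strongest first.

    rows is a list of dicts; None marks a missing value.
    """
    # Cleaning: rows without a target value cannot be used here.
    rows = [r for r in rows if r.get(target) is not None]
    features = [name for name in rows[0] if name != target]
    scores = {}
    for name in features:
        # Skip rows where this particular feature is missing.
        pairs = [(r[name], r[target]) for r in rows if r.get(name) is not None]
        if len(pairs) < 2:
            continue  # not enough data to compute a correlation
        xs, ys = zip(*pairs)
        scores[name] = pearson(xs, ys)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up toy data; PRES and hour are hypothetical column names.
rows = [
    {"TEMP": -5.0, "PRES": 1030.0, "hour": 12, "pm2.5": 170.0},
    {"TEMP": 0.0, "PRES": 1025.0, "hour": 6, "pm2.5": None},
    {"TEMP": 5.0, "PRES": None, "hour": 3, "pm2.5": 130.0},
    {"TEMP": 10.0, "PRES": 1018.0, "hour": 21, "pm2.5": 110.0},
    {"TEMP": 20.0, "PRES": 1012.0, "hour": 9, "pm2.5": 70.0},
    {"TEMP": 25.0, "PRES": 1015.0, "hour": 15, "pm2.5": 50.0},
]

ranking = rank_features(rows, "pm2.5")
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

With a real file, pandas provides the same calculation through `DataFrame.corr()`. A strong correlation does not show that a variable causes pm2.5 to change, and Pearson correlation only captures linear relationships, so the plots from Step 2 are still worth checking for the top-ranked columns.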
