Data wrangling dates

I have a feature containing data creation dates. I have normalized them all to the same format and split them into 'day', 'month' and 'year' columns. Now I have a question: should I apply normalization or standardization to these columns, or does that not make sense for dates?

Topic data-wrangling data-cleaning

Category Data Science


You might want to apply one-hot encoding instead. These are not really continuous features, since their values repeat. If you consider each day of the week or month of the year a category, you can treat them as categorical variables.
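As a rough illustration of the one-hot idea, here is a minimal sketch using only the Python standard library. The sample dates and the helper name `one_hot` are made up for the example and are not from the question; in practice a library routine such as `pandas.get_dummies` would typically do the same job.

```python
from datetime import date

def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]

# Hypothetical sample of creation dates (not from the original question).
dates = [date(2021, 1, 15), date(2021, 3, 2), date(2022, 12, 31)]

MONTHS = list(range(1, 13))   # 1 = January ... 12 = December
WEEKDAYS = list(range(7))     # 0 = Monday ... 6 = Sunday, as in date.weekday()

# One row per date, one column per category.
month_features = [one_hot(d.month, MONTHS) for d in dates]
weekday_features = [one_hot(d.weekday(), WEEKDAYS) for d in dates]

for d, m, w in zip(dates, month_features, weekday_features):
    print(d.isoformat(), m, w)
```

Each date ends up with 12 month columns and 7 weekday columns, exactly one of which is set in each group, so no artificial ordering or distance between categories is implied.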

The year is trickier because it does not repeat. Instead of using the year directly, I would suggest using a date difference (for example, the number of days elapsed since some reference date), which can then be treated as a continuous variable. You can apply any regular scaling to it (standard scaling, max-abs scaling, ...).
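The date-difference suggestion could look roughly like the sketch below. The sample dates are invented, and using the earliest date as the reference point is an assumption made for the example; any fixed reference date would work. The scaling is done by hand here, but it corresponds to what scikit-learn's `StandardScaler` and `MaxAbsScaler` compute.

```python
from datetime import date
from statistics import mean, pstdev

# Hypothetical sample of creation dates (not from the original question).
dates = [date(2021, 1, 15), date(2021, 3, 2), date(2022, 12, 31)]

# Assumption: measure elapsed time from the earliest date in the data.
reference = min(dates)
days_since = [(d - reference).days for d in dates]

# Standard scaling: subtract the mean, divide by the population standard deviation.
# (This would divide by zero if all dates were identical.)
mu = mean(days_since)
sigma = pstdev(days_since)
standardized = [(x - mu) / sigma for x in days_since]

# Max-abs scaling: divide by the largest absolute value, giving values in [-1, 1].
max_abs = max(abs(x) for x in days_since)
max_abs_scaled = [x / max_abs for x in days_since]

print(days_since)
print(standardized)
print(max_abs_scaled)
```

Unlike the raw year, the elapsed-days feature keeps the full ordering and spacing of the dates, which is why ordinary scaling makes sense for it.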
