Are there differences in preprocessing nominal vs ordinal vs interval vs ratio data?
I wonder whether there are significant differences one ought to know about when preprocessing nominal vs ordinal vs interval vs ratio data.
Intuitively, it seems like nominal values should be encoded using one-hot encoding so as not to artificially introduce ordering assumptions, and ordinal data (bad, better, best) using ordinal encoding (1, 2, 3) to preserve the order (although this also introduces a scale, effectively turning ordinal data into interval data, it appears).
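To make this concrete, here is a minimal sketch of the two encodings I have in mind (plain NumPy rather than any particular library; the example categories are made up):

```python
import numpy as np

colors = ["red", "green", "blue", "green"]   # nominal: no inherent order
quality = ["bad", "best", "better", "bad"]   # ordinal: bad < better < best

# One-hot encode the nominal feature: one binary column per category,
# so no artificial ordering or distances are introduced.
categories = sorted(set(colors))             # ['blue', 'green', 'red']
one_hot = np.array([[1 if c == cat else 0 for cat in categories] for c in colors])

# Ordinal-encode the ordered feature: integer codes preserve the order,
# but also impose equal spacing (bad->better looks the same as better->best).
order = {"bad": 1, "better": 2, "best": 3}
codes = np.array([order[q] for q in quality])
```

The equal spacing imposed by `codes` is exactly the ordinal-to-interval leakage mentioned above.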
Also, scaling the data seems problematic: if I were to encode ordinal labels as (1, 2, 3) and then scale/normalize all the features using z-score normalization, it would suggest a distance metric between categories that the original data does not support.
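The scaling concern can be illustrated in a few lines (again a sketch with made-up codes, not a recommendation):

```python
import numpy as np

codes = np.array([1.0, 2.0, 3.0, 2.0, 1.0])   # ordinal codes for bad/better/best
z = (codes - codes.mean()) / codes.std()       # z-score normalization

# The z-scores now look like ordinary continuous values: downstream
# distance-based models (k-NN, k-means, ...) will treat the gaps between
# categories as meaningful magnitudes, which the original labels never claimed.
```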
I would be grateful for extended answers and/or some pointers to the literature.
Topic structured-data one-hot-encoding preprocessing encoding
Category Data Science