Label Encoding for Target Classes: Any Integers or Consecutive Integers from Zero?
I'm handling a very conventional supervised classification task with three mutually exclusive, non-ordinal target categories:
class1
class2
class2
class1
class3
And so on. Actually, in the raw dataset the categories are already represented as integers rather than strings like in my example, but the integers are randomly assigned ones:
5
99
99
5
27
I'm wondering whether it is required/recommended to re-assign zero-based sequential integers to the classes as labels instead of the ones above, like this:
0
1
1
0
2
Does the zero-based version have any advantage over the arbitrary-integer one? (I'm going to apply sklearn's RandomForestClassifier and Keras' sparse_categorical_crossentropy to the task respectively, so no one-hot encoding is needed in my case, only label encoding.)
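For reference, here is roughly the kind of re-encoding I mean: a minimal sketch using sklearn's LabelEncoder, where X is just a placeholder feature matrix for illustration:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier

y_raw = np.array([5, 99, 99, 5, 27])   # labels exactly as they appear in the raw data
X = np.random.rand(len(y_raw), 4)      # placeholder features, purely for illustration

encoder = LabelEncoder()
y = encoder.fit_transform(y_raw)       # -> [0, 2, 2, 0, 1]; classes are sorted: 5, 27, 99

clf = RandomForestClassifier(random_state=0)
clf.fit(X, y)

# inverse_transform maps predictions back to the original label values
print(encoder.inverse_transform(clf.predict(X)))
```

The same zero-based y is also the integer-index format that sparse_categorical_crossentropy consumes on the Keras side.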
Topic: supervised-learning, scikit-learn, classification, python, machine-learning
Category: Data Science