Should I apply a transformation to integer-valued columns to reduce their skewness?
I am performing EDA on a dataset of hotel reservations. The target is categorical and states whether a given customer will cancel the reservation. The dataset has 25 features and 30244 entries.
I have two features stating the number of adults and the number of babies coming with the person who made the reservation.
- Number of adults can be 1, 2, 3, 4, or 5. (Range specifically given in dataset description)
- Number of babies in the train set takes values 0, 1, or 2 (but a range is NOT specified in the dataset description)
When I checked the skewness of the dataset, the number-of-adults and number-of-babies columns both had skewness above 0.75. (I was going to apply a log transformation to every column with |skewness| > 0.75 to bring its distribution closer to normal.)
As these two columns contain only integer values, I am unsure whether to apply a transformation, because the transformation will turn them into floating-point values.
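To make the setup concrete, here is a minimal sketch of the check and the transformation I have in mind, assuming pandas and NumPy. The column names `adults` and `babies` and the sample values are made up for illustration and are not the real dataset. I use `log1p` rather than a plain log only because the babies column contains zeros.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the reservation data; column names are hypothetical.
df = pd.DataFrame({
    "adults": [1, 2, 2, 2, 2, 2, 3, 2, 1, 5],
    "babies": [0, 0, 0, 0, 1, 0, 0, 2, 0, 0],
})

SKEW_THRESHOLD = 0.75  # threshold mentioned in the question

# Sample skewness per column, then keep the columns whose |skewness| exceeds the threshold.
skew_before = df.skew()
skewed_cols = skew_before[skew_before.abs() > SKEW_THRESHOLD].index

# log1p computes log(1 + x), so zero counts (e.g. no babies) stay finite.
transformed = df.copy()
for col in skewed_cols:
    transformed[col] = np.log1p(df[col])

print(skew_before)
print(transformed.skew())
print(transformed.dtypes)  # the transformed columns are no longer integer-typed
```

In this sketch both columns start out as integers and come back as `float64` after the transform, which is exactly the situation I am asking about.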
Should I apply the transformation or not?
Skew: 1.710768, 1.407404, 0.858807
Topic transformation dataset data-cleaning
Category Data Science