What if outliers still exist after variable transformation?
I have a variable with a skewed distribution.
I applied BoxCox transformation and now the variable follows a Gaussian distribution. But, as seen in the image below in the boxplot, outliers still exist.
My question is:
Although after transformation, the variable distribution is nearly Gaussian, if there are still outliers, should we still select this transformation?
Or should we decide to use other techniques such as discretization in order to capture all outliers?
Topic transformation feature-engineering outlier
Category Data Science