Scaling negative and positive variables when performing a k-means cluster analysis
I'm looking to perform a k-means cluster analysis on a data set whose variables include both positive and negative values. Because the ranges vary so much, the data will need to be scaled, but my concern is with the variables that contain negative values. Should I perform some sort of log transformation on all the data so as to scale it to positive values? For example:
Variable A: 3.4, 5.6, 1.3, 7.6, 8.3
Variable B: 1, 2, 3, 2, 1
Variable C: -1.3, -1.4, -2.3, -4.2, -1.3
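To make the scaling concern concrete, here is a minimal sketch using the three example variables above. It assumes that "scaled" could mean z-score standardization (subtract the mean, divide by the standard deviation), which is only one possible choice and not something fixed by the question. The helper name `standardize` is hypothetical. The sketch also shows that a plain log transform cannot be applied to the negative values in Variable C.

```python
import math
import statistics

# Example values copied from the question.
data = {
    "A": [3.4, 5.6, 1.3, 7.6, 8.3],
    "B": [1, 2, 3, 2, 1],
    "C": [-1.3, -1.4, -2.3, -4.2, -1.3],
}


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation.

    Hypothetical helper for illustration; the sign of the input
    values does not matter for this transformation.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(x - mean) / std for x in values]


# Each variable ends up with mean 0 and standard deviation 1,
# whether its original values were positive or negative.
scaled = {name: standardize(values) for name, values in data.items()}

for name, values in scaled.items():
    print(name, [round(v, 3) for v in values])

# By contrast, the logarithm is undefined for negative inputs,
# so a direct log transform of Variable C raises an error.
try:
    math.log(data["C"][0])
except ValueError as exc:
    print("log transform failed for a negative value:", exc)
```

Under this assumption, all three variables end up on a comparable scale without first being forced to be positive.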
Topic k-means
Category Data Science