Using PCA to cluster multidimensional data (RFM variables)

I am performing k-means clustering on RFM variables (Recency, Frequency, Monetary). The RFM variables are in the form of quantile scores (1-4). I ran PCA to obtain the principal components, used the elbow method to find the optimal number of clusters, and then passed that number to the k-means algorithm. Could anyone tell me whether this is a correct method? Also, when I plot the clusters, the axes range from -3 to 3, and I am not sure why.

Topic pca hierarchical-data-format k-means clustering

Category Data Science


Judging from the plot, there are no clusters.

K-means requires continuous variables to work well. Your data takes only discrete values (1-4 on each variable), which is what causes the grid pattern in your plot.
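To make the point about discrete steps concrete, here is a minimal sketch that enumerates every possible RFM triple. It assumes each of the three variables is a quartile score from 1 to 4, as described in the question; the names `scores` and `grid` are only illustrative.

```python
from itertools import product

# Each RFM variable is assumed to be a quartile score from 1 to 4.
scores = (1, 2, 3, 4)

# Every possible (Recency, Frequency, Monetary) combination.
grid = list(product(scores, repeat=3))

# However many customers there are, they can occupy at most this many
# distinct positions, so any 2D projection shows a lattice, not clouds.
print(len(grid))  # 64
```

With at most 64 distinct points, many customers share exactly the same coordinates, and k-means ends up partitioning a lattice rather than finding dense regions.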

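There is no benefit to using PCA here; use it only for visualization. The -3 to 3 range that you don't understand comes from PCA: it centers the data and projects it onto new axes, so the plotted coordinates are component scores rather than the original 1-4 quantile values. It would be worth reviewing how PCA works before relying on it.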
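A rough sketch of where a range of about -3 to 3 can come from is given below. It rests on assumptions not stated in the question: the variables are not standardized before PCA, and each variable has mean 2.5 (which holds when the quartiles are equally populated). A PCA score is the dot product of a centered point with a unit-length direction, so its absolute value cannot exceed the length of the centered point. The `direction` vector is a hypothetical example, not a component computed from real data.

```python
import math
from itertools import product

scores = (1, 2, 3, 4)
mean = sum(scores) / len(scores)  # 2.5, assuming equally populated quartiles

# Center every possible RFM triple, as PCA does before projecting.
centered = [tuple(v - mean for v in p) for p in product(scores, repeat=3)]

# Upper bound on any score: the longest centered point (a corner of the grid).
max_norm = max(math.sqrt(sum(c * c for c in p)) for p in centered)
print(round(max_norm, 3))  # 2.598

# Hypothetical unit-length direction weighting all three variables equally.
direction = (1 / math.sqrt(3),) * 3
projected = [sum(c * d for c, d in zip(p, direction)) for p in centered]
print(round(min(projected), 3), round(max(projected), 3))  # -2.598 2.598
```

Under these assumptions the scores fall within roughly ±2.6, which a plotting library would typically show on axes running from about -3 to 3. Standardizing the variables first would change the exact bound but not the reason for it.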
