Clusterize Spectrum
I have pandas table which contains data about different observations, each one was measured in different wavlength. These observsations are different than each other in the treatment they have gotten. The table looks something like this:
name treatment 410.1 423.2 445.6 477.1 485.2 ....
0 A1 0 0.01 0.02 0.04 0.05 0.87
1 A2 1 0.04 0.05 0.05 0.06 0.04
2 A3 2 0.03 0.02 0.03 0.01 0.03
3 A4 0 0.02 0.02 0.04 0.05 0.91
4 A5 1 0.05 0.06 0.04 0.05 0.02
...
I would like to classify the different observations based on their spectrum (the numerical columns).
I have tried to run PCA and to paint it according to the treatment the observations got, and to compare it to the results of classifications like k-means and Spectral clustering, but i'm not sure that I choose the right methods because is seems all the time like the clusters are too much like euclidean distance and i'm not sure that they take into account the spectrum (I have used all the numerical columns for the prediction).
This is for exampel the comparison between the PCA+Colors compared to the Spectral cllasification:
PCA:
classification( the points located according to PCA1 PCA2 but the colores are according the the classification:
as you can see here, it seems like the classification is based on real distance and I would like something that take into account all the numerical values.
So, i'm looking for any insights regard other methods of classifications that could give me better results or maybe other ideas how I can check if there are clusters inside my data based on the measurments in different columns, like if I could predict the treatment from the clusters
Topic spectral-clustering pca classification k-means clustering
Category Data Science