What is the most effective unsupervised ML algorithm to use when outliers are present in the data set?
I am analyzing a portfolio of about 225 stocks and have collected three attributes for each of them: "Price/Earnings ratio", "Return on Assets", and "Earnings per share growth". I would like to cluster these stocks into 3 or 4 groups based on those attributes. However, there are substantial outliers in the data set, and rather than removing them I would like to keep them in. Which ML algorithm would be best suited for this? I have been told that K-means would not work well here, since the outliers would skew the centroids of whichever cluster they are assigned to. Any and all thoughts welcome!
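To make the concern about skewed centroids concrete, here is a minimal sketch in one dimension. The P/E values are made up for illustration and are not from my actual data. A K-means centroid is the mean of the points in a cluster, so the sketch compares the mean with the median (a more outlier-resistant center, shown only for contrast) before and after one extreme value is added.

```python
from statistics import mean, median

# Hypothetical P/E ratios for stocks in one cluster (illustrative values only).
pe_ratios = [12.0, 14.0, 15.0, 16.0, 18.0]

# The same cluster with one extreme outlier added.
pe_with_outlier = pe_ratios + [300.0]

# A K-means centroid is the mean of the cluster's points.
centroid_clean = mean(pe_ratios)
centroid_skewed = mean(pe_with_outlier)

# The median is shown for contrast as a center that resists outliers.
median_clean = median(pe_ratios)
median_with_outlier = median(pe_with_outlier)

print(f"mean:   {centroid_clean} -> {centroid_skewed}")
print(f"median: {median_clean} -> {median_with_outlier}")
```

With these numbers the mean moves from 15.0 to 62.5, far from every typical stock in the group, while the median only moves from 15.0 to 15.5.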
Topic unsupervised-learning outlier algorithms machine-learning
Category Data Science