How to apply K-Medoids in PySpark?

The PySpark ML library (`pyspark.ml`) does not provide any clustering method for K-Medoids. So my question is: how can one apply K-Medoids in a PySpark context?

Topic: pyspark, apache-spark, python, clustering

Category: Data Science


There is a k-medoids clustering package for Spark at spark-packages.org/package/tdebatty/spark-kmedoids, with the source code at github.com/tdebatty/spark-kmedoids.

It can be pulled in when launching Spark with:

> $SPARK_HOME/bin/spark-shell --packages tdebatty:spark-kmedoids:0.1.2
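
Since spark-kmedoids is a JVM-side Spark library, the same `--packages` flag should also work when launching PySpark (this is an assumption based on how Spark packages are generally loaded, not something the package documents for Python):

> $SPARK_HOME/bin/pyspark --packages tdebatty:spark-kmedoids:0.1.2

As far as I can tell the package does not ship a Python wrapper, so you would then have to call its Java API through the JVM gateway (e.g. via `spark.sparkContext._jvm`), which is awkward. If you would rather stay in pure Python, below is a minimal sketch of a sampled-swap (CLARANS-style) k-medoids on a Spark RDD. Everything in it is an illustrative assumption rather than part of the spark-kmedoids package: Euclidean distance, the hypothetical `assign_cost` and `k_medoids` helpers, and the candidate sample size of 20. It also rescans the data once per candidate swap, so it is meant to show the idea, not to be production code.

```python
# Sketch of k-medoids (PAM with sampled swaps, CLARANS-style) in pure PySpark.
# Assumptions: points are small dense numeric vectors, distance is Euclidean,
# and k is small enough that the medoid set fits in driver memory.
import random
import numpy as np
from pyspark.sql import SparkSession


def assign_cost(points_rdd, medoids):
    """Total cost of assigning every point to its nearest medoid."""
    m = np.array(medoids)  # computed on the driver, shipped in the closure
    return points_rdd.map(
        lambda p: float(np.min(np.linalg.norm(m - np.array(p), axis=1)))
    ).sum()


def k_medoids(points_rdd, k, max_iter=10, seed=42):
    points_rdd = points_rdd.cache()
    random.seed(seed)
    # Start from a random sample of k data points as the initial medoids.
    medoids = [list(p) for p in points_rdd.takeSample(False, k, seed)]
    cost = assign_cost(points_rdd, medoids)
    for _ in range(max_iter):
        improved = False
        # Try swapping each medoid with a sampled candidate point.
        # Each trial is a full pass over the RDD -- fine for a demo only.
        candidates = points_rdd.takeSample(False, 20, random.randint(0, 1 << 30))
        for c in candidates:
            for i in range(k):
                trial = medoids[:i] + [list(c)] + medoids[i + 1:]
                trial_cost = assign_cost(points_rdd, trial)
                if trial_cost < cost:
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:
            break
    return medoids, cost


if __name__ == "__main__":
    spark = SparkSession.builder.appName("kmedoids-sketch").getOrCreate()
    data = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (8.5, 9.0), (0.5, 1.5)]
    rdd = spark.sparkContext.parallelize(data)
    medoids, cost = k_medoids(rdd, k=2)
    print("medoids:", medoids, "cost:", cost)
    spark.stop()
```

Unlike k-means, the medoids here are always actual data points, which is what makes k-medoids usable with arbitrary distance functions; you could swap the Euclidean distance in `assign_cost` for any pairwise dissimilarity.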
