Computing the centroid of high-dimensional data
In scikit-learn or other Python libraries, is there an existing implementation for computing the centroid of a high-dimensional data set?
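You could use np.mean along the axis you care about. The centroid is simply the per-dimension mean of the vectors, so it works the same way regardless of how many dimensions there are. Say you have 100 vectors of 1200 dimensions each, stored as the rows of an array, and you want a centroid vector of dimension 1200. Averaging over axis=0 (the rows) does that: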
>>> import numpy as np
>>> data = np.random.rand(100, 1200)
>>> centroid = np.mean(data, axis=0)
>>> centroid.shape
(1200,)
See the NumPy documentation for np.mean for details on the axis argument.
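If the vectors belong to several groups and you want one centroid per group, the same idea can be applied to each group separately. The sketch below is only an illustration: the helper name centroids_by_label and the labels array are hypothetical and not part of NumPy or scikit-learn, and the labels are made up for the example.

```python
import numpy as np

def centroids_by_label(data, labels):
    """Return the unique labels and the mean vector of the rows sharing each label.

    Hypothetical helper for illustration; not a NumPy or scikit-learn function.
    """
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    unique_labels = np.unique(labels)
    # For each label, select the matching rows and average them over axis 0.
    centroids = np.stack(
        [data[labels == lab].mean(axis=0) for lab in unique_labels]
    )
    return unique_labels, centroids

rng = np.random.default_rng(0)
data = rng.random((100, 1200))   # 100 vectors of 1200 dimensions each
labels = np.arange(100) % 3      # made-up group labels 0, 1, 2

unique_labels, centroids = centroids_by_label(data, labels)
print(centroids.shape)  # (3, 1200)
```

Each row of centroids is the centroid of one group, in the order given by unique_labels.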