Spatially constrained geospatial similarity

Question

Overflow2341313

February 22, 2022 07:01

What's the current methodology for clustering geospatial data by features?

Example: I have a demographic dataset that contains, say, average home price and population density.

An example relationship here would be home price versus population density. The tricky part is how the clusters get formed. For example, an affluent area with high population density isn't the same as an affluent area with low population density. I suspect a basic distance metric wouldn't take this into account, since lows and highs could offset each other and give similar distances. This makes me think some form of weighted clustering might be needed to place the centroids.

I'm not sure which methodology takes this into account.

Topic data-analysis regression geospatial scikit-learn pandas

Category Data Science

Jan Šimbera · Accepted Answer · May 28, 2020 16:22

I assume you are trying to find a suitable distance metric based on the features of different areas (although spatial distances could also easily be plugged in). In that case, I would first make sure the features are on comparable scales, for example by standardizing each one to zero mean and unit variance. Otherwise a feature with large raw values, such as home price, dominates the distance.
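As a rough illustration of the scaling step, here is a minimal sketch using NumPy. The `features` array and its values are made up for the example, and `standardize` is a hypothetical helper, not something from the question. scikit-learn's `StandardScaler` performs the same per-column scaling if you prefer a library call.

```python
import numpy as np

# Hypothetical demographic table (illustrative values only): one row per area,
# columns = [average home price, population density].
features = np.array([
    [850_000.0, 12_000.0],   # affluent, dense
    [900_000.0,    300.0],   # affluent, sparse
    [180_000.0, 11_000.0],   # modest, dense
    [200_000.0,    250.0],   # modest, sparse
])

def standardize(x: np.ndarray) -> np.ndarray:
    """Scale each column to zero mean and unit variance."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)      # population standard deviation (ddof=0)
    std[std == 0] = 1.0      # avoid dividing by zero; constant columns become all zeros
    return (x - mean) / std

scaled = standardize(features)
print(scaled.round(2))
```

After this step both columns contribute on the same scale, so any distance-based clustering sees price and density as equally important unless you weight them deliberately.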

If the result still does not seem right, I would try different distance metrics. A simple alternative is the L1 norm:

$$L_1(a, b) = \sum_x |x_a - x_b|$$
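To make the formula concrete, here is a small sketch that computes the L1 distance next to the Euclidean (L2) distance for comparison. The three area vectors are hypothetical, already standardized values chosen only for illustration, and the function names are my own.

```python
import numpy as np

def l1_distance(a, b) -> float:
    """Manhattan (L1) distance: sum of absolute per-feature differences."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).sum())

def l2_distance(a, b) -> float:
    """Euclidean (L2) distance, shown for comparison."""
    return float(np.sqrt(((np.asarray(a) - np.asarray(b)) ** 2).sum()))

# Hypothetical standardized vectors: [home price, population density].
area_a = [1.0, 1.0]    # high price, high density
area_b = [1.0, -1.0]   # high price, low density
area_c = [0.0, 0.0]    # average on both features

d1_ab = l1_distance(area_a, area_b)  # |1 - 1| + |1 - (-1)| = 2
d1_ac = l1_distance(area_a, area_c)  # |1 - 0| + |1 - 0| = 2
d2_ab = l2_distance(area_a, area_b)  # sqrt(0 + 4) = 2
d2_ac = l2_distance(area_a, area_c)  # sqrt(1 + 1), about 1.41

print(d1_ab, d1_ac, d2_ab, d2_ac)
```

Both metrics take the absolute (or squared) difference feature by feature before summing, so a high value in one feature cannot cancel a low value in another. They differ in how they rank pairs: here L1 treats the two pairs as equally far apart, while L2 puts `area_a` closer to `area_c` than to `area_b`.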
