KNN improvements (python)

I recently had to work on a problem where the best baseline was knn (geolocated data). I have several targets (binary classification, multiclass classification and regression), each with its own metric, so I use knn interchangeably for classification and regression.

This baseline was easy to implement in Python (sklearn). I was wondering how to improve the baseline. I tried tuning the knn hyperparameters: optimising k helped a bit, while changing the distance metric did not (the natural L2 distance worked best by far). Other models gave worse performance or needed significantly more tuning or compute power.
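For reference, the kind of tuning described above can be sketched roughly as follows. This is only a minimal illustration: the synthetic data from `make_classification`, the grid values and the accuracy scoring are placeholders I picked for the example, not my real data or metrics.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): the real problem uses geolocated features.
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=0, random_state=0
)

# Hypothetical search grid: number of neighbours, vote weighting,
# and the Minkowski power (p=1 is L1 / Manhattan, p=2 is L2 / Euclidean).
param_grid = {
    "n_neighbors": [3, 5, 11, 21],
    "weights": ["uniform", "distance"],
    "p": [1, 2],
}

# 5-fold cross-validated grid search over the knn hyperparameters.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

For the regression targets the same pattern should apply with `KNeighborsRegressor` and a regression metric passed to `scoring`.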

I was wondering how to improve that baseline. My main constraint is a relatively easy Python implementation. I did some research on GitHub and didn't find anything convincing.

The Wikipedia page hints at taking local density into account, but I wouldn't know where to start.
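One possible reading of "taking local density into account", and this is my own guess rather than what the Wikipedia page prescribes, is to weight neighbours with a kernel whose bandwidth adapts to how far away the k-th neighbour is, so that dense and sparse regions are treated on a comparable scale. sklearn's knn estimators accept a callable for `weights`, which makes this easy to try. The function name `adaptive_gaussian_weights`, the Gaussian kernel and the synthetic data below are all assumptions made for the sketch.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor


def adaptive_gaussian_weights(distances):
    """Hypothetical weight function: Gaussian kernel with a per-query bandwidth.

    `distances` has shape (n_queries, k). The bandwidth of each query is the
    distance to its farthest (k-th) neighbour, so the kernel widens in sparse
    regions and narrows in dense ones.
    """
    distances = np.asarray(distances, dtype=float)
    bandwidth = distances.max(axis=1, keepdims=True)
    # Guard against a zero bandwidth when all neighbours coincide with the query.
    bandwidth = np.where(bandwidth > 0, bandwidth, 1.0)
    return np.exp(-0.5 * (distances / bandwidth) ** 2)


# Synthetic 2-D data standing in for geolocated points (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.sin(4 * X[:, 0]) + X[:, 1] + rng.normal(scale=0.1, size=200)

# Pass the callable directly as the `weights` argument.
model = KNeighborsRegressor(n_neighbors=10, weights=adaptive_gaussian_weights)
model.fit(X, y)
preds = model.predict(X[:5])
```

The same callable can be passed to `KNeighborsClassifier`. Whether it actually beats `weights="distance"` would have to be checked by cross-validation on the real data.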

I also found some mention of a knn++ algo here, but it seems to need multiple views.

Any ideas / implementations of knn improvements?

