KNN improvements (Python)
I recently had to work on a problem where the best baseline was knn (geolocalised data). I have different targets (binary classification, multiclass classification and regression) with associated metrics, so I use knn for both classification and regression.
This baseline was easy to implement in Python (sklearn). I tried tuning the knn hyperparameters: optimising k helped a little, while changing the distance metric did not (the natural L2 distance worked best by far). Other models gave worse performance or needed significantly more tuning and compute power.
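For reference, the kind of tuning described above can be sketched roughly as follows. This is a minimal sketch and not my actual setup: the data is synthetic, and the grid values, the scoring metric and the choice of `KNeighborsRegressor` are assumptions made for illustration (the classification case would use `KNeighborsClassifier` and a classification metric instead).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for geolocalised data: two coordinates and a smooth target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = np.sin(4 * X[:, 0]) + np.cos(4 * X[:, 1]) + rng.normal(0.0, 0.1, size=300)

# Hypothetical grid: number of neighbours, vote weighting, and Minkowski power.
param_grid = {
    "n_neighbors": [3, 5, 10, 20],
    "weights": ["uniform", "distance"],
    "p": [1, 2],  # 1 = Manhattan (L1), 2 = Euclidean (L2)
}

search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid,
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

# best_score_ is the negated MAE, so flip the sign to report the error.
print(search.best_params_, -search.best_score_)
```

Only k, the weighting scheme and the Minkowski power are searched here, which matches the knobs mentioned above.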
I am wondering how to improve that baseline. My main constraint is a relatively easy Python implementation. I did some research on GitHub and didn't find anything convincing.
The Wikipedia page hints at taking local density into account, but I wouldn't know where to start.
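One possible reading of "taking local density into account" is to give each query point its own bandwidth, set by the distance to its k-th neighbour, so that neighbourhoods shrink in dense areas and grow in sparse ones. The sketch below is only my guess at what that could look like, not something the Wikipedia page prescribes: `adaptive_kernel_weights` is a hypothetical helper, the kernel shape is an arbitrary choice, and the data is synthetic. It relies on sklearn accepting a callable for `weights` that maps an array of distances to an array of weights of the same shape.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor


def adaptive_kernel_weights(distances):
    """Hypothetical helper: weight neighbours relative to the farthest one per query."""
    distances = np.asarray(distances, dtype=float)
    # Per-query bandwidth: distance to the farthest of the k neighbours.
    bandwidth = distances.max(axis=1, keepdims=True)
    # Guard against a zero bandwidth (all neighbours identical to the query).
    bandwidth = np.where(bandwidth > 0, bandwidth, 1.0)
    scaled = distances / bandwidth
    # Quadratic kernel on the scaled distance; the small floor keeps
    # every weight strictly positive so the weighted average is defined.
    return 1.0 - scaled**2 + 1e-6


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = X[:, 0] + X[:, 1]

model = KNeighborsRegressor(n_neighbors=10, weights=adaptive_kernel_weights)
model.fit(X, y)
pred = model.predict(X[:5])
print(pred)
```

The same callable can be passed to `KNeighborsClassifier`, so it would cover the classification targets as well.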
I also found some mention of a knn++ algorithm here, but it seems to need multiple views.
Any ideas for, or implementations of, knn improvements?
Topic k-nn scikit-learn python
Category Data Science