Can I "fit" a k-nearest neighbors classifier without precomputing anything?
I am currently trying to fit a KNeighborsClassifier (scikit-learn implementation) to about a gigabyte of training data. From every resource I've read online, a k-nearest-neighbors classifier is a lazy classifier in that the fitting process is simply storing a copy of the training data. Then when you make a prediction, you search through the training data to make the prediction.
But when I call fit, it fails, telling me it cannot allocate 800 gigabytes of memory. I believe it is trying to precompute a tree or some similar index structure to speed up future queries, but I don't want this behavior because it makes the classifier unusable on my machine. For my application, slow classification is acceptable; I want the truly lazy implementation.
Is there a way to use a KNeighborsClassifier from scikit-learn without having it use large amounts of memory during the fitting?
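For context, here is roughly what I am attempting, shrunk to toy dimensions. My assumption (which I'd like confirmed) is that passing algorithm='brute' forces the lazy behavior, so fit only stores the data and all distance computation is deferred to predict time:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Small stand-in for the real ~1 GB training set (shapes illustrative).
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = rng.integers(0, 2, size=100)

# algorithm='brute' is my assumption for skipping the tree/index build:
# fit should just store X and y, with distances computed at query time.
clf = KNeighborsClassifier(n_neighbors=3, algorithm='brute')
clf.fit(X, y)

pred = clf.predict(X[:10])
print(pred.shape)  # (10,)
```

If this is the right approach, I would also like to know whether the prediction step itself can blow up memory (e.g., by materializing a full pairwise distance matrix) and whether that can be controlled.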
Tags: scikit-learn, classification, python
Category: Data Science