I recently had to work on a problem where the best baseline was KNN (geolocated data). I have different targets (binary classification, multiclass classification, and regression) and associated metrics, so I use KNN interchangeably for classification or regression. This baseline was easy to implement in Python (sklearn). I was wondering how to improve on the baseline. I tried tuning the KNN hyperparameters: optimising k helped a bit, while modifying the distance metric didn't (the natural L2 distance worked best by far). Other models gave …
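A minimal sketch of that kind of hyperparameter search, assuming scikit-learn's GridSearchCV; the grid values and the synthetic data are illustrative stand-ins:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    # Stand-in data; replace with the real geolocated features.
    X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                               n_redundant=0, random_state=0)

    param_grid = {
        "n_neighbors": [1, 3, 5, 11, 21],    # optimising k
        "weights": ["uniform", "distance"],  # plain vs. distance-weighted votes
        "p": [1, 2],                         # L1 vs. L2 Minkowski distance
    }
    search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)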
I have a data set with date features like 01/01/2019 and I would like to use KNN. However, I cannot find a good transformation for dates that yields a meaningful distance for the last feature. For example:

    f1 | 1  | 2 | 3  | 4 | 01/01/2019
    f2 | 10 | 3 | 12 | 1 | 14/01/2019

Does anyone have any recommendations?
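One possible transformation, assuming pandas: map each date to a numeric offset (e.g. days since the earliest date in the column) so that the usual L2 distance becomes meaningful:

    import pandas as pd

    df = pd.DataFrame({"date": ["01/01/2019", "14/01/2019"]})
    df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
    # Days elapsed since the earliest date; 0 and 13 for the rows above.
    df["days_since_start"] = (df["date"] - df["date"].min()).dt.days
    print(df)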
I'm working on a machine learning problem involving inventory (i.e. physical retail stock); through the cleaning (outlier removal) process, however, some of the items (via their corresponding transactions) will be removed. Therefore, I thought of using KNN to group similar items into respective categories. There are 1245 items. The info for each item is: Average Weighted Price, Total Quantity Sold, Total Revenue Achieved, Min Sold per Transaction, Max Sold per Transaction, Min Sell Price, Max Sell Price, Number of Unique …
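As a rough illustration of finding each item's most similar peers, here is a minimal sketch assuming scikit-learn; the 7 feature columns are random stand-ins for the item attributes listed above. Scaling matters because the features live on very different ranges (unit prices vs. total revenue):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    items = rng.random((1245, 7))       # 1245 items x 7 numeric features

    X = StandardScaler().fit_transform(items)
    nn = NearestNeighbors(n_neighbors=6).fit(X)   # 5 neighbours + the item itself
    distances, indices = nn.kneighbors(X)
    print(indices[0, 1:])               # the 5 items most similar to item 0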
I have two separate files for testing and training. In the training data, I am dropping rows that contain too many missing values. But in the test data, I cannot afford to drop rows, so I have chosen to impute the missing values using a KNN approach. My question is: to impute missing values in the test data using KNN, is it enough to consider only the test data? As in, neighbors …
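A minimal sketch of the usual pattern, assuming scikit-learn's KNNImputer: fit the imputer on the training data and apply it to the test data, so the test rows borrow neighbours from the training set rather than only from each other. The arrays are illustrative:

    import numpy as np
    from sklearn.impute import KNNImputer

    X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    X_test = np.array([[np.nan, 2.5], [4.0, np.nan]])

    imputer = KNNImputer(n_neighbors=2)
    imputer.fit(X_train)                # neighbours come from the training set
    X_test_imputed = imputer.transform(X_test)
    print(X_test_imputed)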
I have a study where I want to find users similar to a set of users (SEED). My data looks like a pivot by customer, e.g. a sample of SEED looks like this (note: I drop cust_id):

    cust_id | spend_food | spend_nike | spend_harrods
    1       | 145        | 45         | 32
    2       | 85         | 89         | 0
    4       | 23         | 67         | 1900
    5       | 84         | 12         | 900

So to find users similar …
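A minimal sketch of the "similar to the SEED set" idea, assuming scikit-learn; the candidate rows are illustrative. Each candidate is scored by its distance to the nearest SEED customer:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    from sklearn.preprocessing import StandardScaler

    seed = np.array([[145, 45, 32], [85, 89, 0], [23, 67, 1900], [84, 12, 900]])
    candidates = np.array([[150, 40, 25], [10, 5, 2000], [90, 85, 10]])

    scaler = StandardScaler().fit(seed)
    nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(seed))
    dist, _ = nn.kneighbors(scaler.transform(candidates))
    print(dist.ravel())   # small distance = similar to at least one SEED customer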
Here is a hypothetical, simplified dataframe for my problem, which would be low-dimensional (around 20 features), containing some made-up information about certain dog breeds:

    Breed   Min_Weight  Max_Weight  Min_Height  Max_Height  is_friendly  grp
    Husky   10          20          30          35          True         working
    Poodle  8           17          15          30          False        terrier

The algorithm would receive some information about a dog, and it would need to identify the k closest dog breeds based on the input data. It needs to be high-performance. Example: the algorithm receives an unknown breed …
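A minimal sketch under the assumption that scikit-learn is acceptable: numeric columns go straight into a KD-tree (fast for ~20 low-dimensional features), the boolean becomes 0/1, and the categorical grp column is one-hot encoded. The two rows mirror the made-up table above:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Columns: Min_Weight, Max_Weight, Min_Height, Max_Height,
    #          is_friendly, grp=working, grp=terrier
    breeds = np.array([
        [10, 20, 30, 35, 1, 1, 0],   # Husky
        [8, 17, 15, 30, 0, 0, 1],    # Poodle
    ])
    nn = NearestNeighbors(n_neighbors=1, algorithm="kd_tree").fit(breeds)
    unknown = np.array([[9, 18, 28, 33, 1, 1, 0]])
    dist, idx = nn.kneighbors(unknown)
    print(idx)   # index of the closest known breed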
I am doing hyperparameter tuning with cross-validation, and I keep finding that the optimal leaf size is 1. Should I worry? Is this a sign of overfitting?
I am trying to improve my KNN regression process (I would like to use sklearn / Python, but it doesn't matter). I would like to improve my results and gain insight. Here is an example: I have data measured from an electric motor: an input voltage (U) and current (I), and an output torque (T) and speed (S). The first attempt is a simple approach where I feed those data as-is to a KNN algorithm and I use the …
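A minimal sketch of that plain first approach, assuming scikit-learn; the motor data is synthetic. Scaling the inputs inside a pipeline is usually the first improvement to try, since voltage and current live on different scales:

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.random((200, 2)) * [230.0, 10.0]               # voltage U, current I
    T = 0.05 * X[:, 0] * X[:, 1] + rng.normal(0, 1, 200)   # torque, toy relation

    model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
    model.fit(X, T)
    print(model.predict([[120.0, 4.0]]))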
Let's say I have 100 values in my dataset and split it 80% train / 20% test. When predicting the last value, is the prediction based on the previous 99 values (the 80 train values plus the 19 already-predicted values) or only on the original 80 train values? For example: if a kd-tree is used, is every data point inserted into the tree during prediction? Is it possible to use KNN for the following scenario? I have 20 train values; when I add a new observation I …
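For the scikit-learn implementation at least, predictions only consult the data passed to fit(); already-predicted test points are never inserted into the kd-tree. A minimal sketch with illustrative data:

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    X_train = np.arange(80).reshape(-1, 1)
    y_train = np.arange(80, dtype=float)
    knn = KNeighborsRegressor(n_neighbors=3).fit(X_train, y_train)
    print(knn.predict([[99]]))        # uses only the 80 fitted points

    # "Online" usage means refitting with the new observation included:
    X_new = np.vstack([X_train, [[80]]])
    y_new = np.append(y_train, 80.0)
    knn = KNeighborsRegressor(n_neighbors=3).fit(X_new, y_new)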
I have a question related to parallel work in Python. How can I use processors = 1, 2, 3, ... with the k-nearest-neighbour algorithm when k = 1, 2, 3, ..., to find the change in time spent, the speedup, and the efficiency? What is the appropriate code for that?
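A minimal sketch of one way to measure this, assuming scikit-learn's n_jobs parameter; the data is synthetic, and speedup is t(1 job) / t(n jobs):

    import time
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X, y = rng.random((20000, 20)), rng.integers(0, 2, 20000)

    for n_jobs in [1, 2, 3, 4]:
        knn = KNeighborsClassifier(n_neighbors=5, n_jobs=n_jobs).fit(X, y)
        start = time.perf_counter()
        knn.predict(X[:2000])
        print(n_jobs, "jobs:", round(time.perf_counter() - start, 3), "s")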
I'm trying to build an item-based recommender using k-NN. I have a list of items, all of which have some properties (features) in common.

    item    var_1        var_2  var_3        var_4        var_5
    item_1  0.171547232  a      0.908855471  0.292061808  0.285678293
    item_2  0.131694336  b      0.432665234  0.501300418  0.756824175
    item_3  0.144318764  b      0.238752071  0.487600679  0.203133779
    item_4  0.249241125  b      0.921229689  0.003638622  0.606875991
    item_5  0.414306046  b      0.190824352  0.937412611  0.1789091
    item_6  0.909501131  c      0.847112499  0.548322302  0.060136059
    item_7  0.37469644   c      0.282628025  0.211128351  0.125910578
    item_8  0.308634676  d      0.174650423  0.705026302  0.440098246
    item_9  0.039294192  …
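One way to handle the mixed numeric/categorical features, as a minimal sketch assuming pandas and scikit-learn: one-hot encode var_2, then query nearest neighbours. The values are a small illustrative subset:

    import pandas as pd
    from sklearn.neighbors import NearestNeighbors

    items = pd.DataFrame({
        "var_1": [0.17, 0.13, 0.14, 0.25],
        "var_2": ["a", "b", "b", "b"],
        "var_3": [0.91, 0.43, 0.24, 0.92],
    })
    X = pd.get_dummies(items, columns=["var_2"])   # one-hot encode var_2
    nn = NearestNeighbors(n_neighbors=2).fit(X)
    _, idx = nn.kneighbors(X.iloc[[0]])
    print(idx)   # item 0 itself plus its closest peer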
I have a data set with 6 variables that I'm trying to run the sknn function on, and then output a table of the results to show the k-NN results. I have converted the response variable to a factor to use as row and column headers in the table, and checked the data types of all the other variables to make sure they are compatible (int and num). For some reason, no matter what I try, R freezes trying to pull the …
The KNN algorithm is very handy and particularly suited to some of my problems, but I can't find any resources on how to implement it in production. As a comparison, when I use a neural network, I already have high-level tools at my disposal for applying it to examples (either libraries that let me exploit the hardware of my devices intelligently when I want to go embedded, or infrastructure that lets me use my neural …
I am analyzing a database and I want to perform KNN. I am using the tidymodels library, and when I run the model I get the following error:

    All models failed. See the `.notes` column.
    # Tuning results
    # 10-fold cross-validation repeated 5 times
    There were issues with some computations:
      - Error(s) x1000: Error in `check_outcome()`:
        ! For a classification model, the outcome should be a factor.
    Use `collect_notes(object)` for more information.

The database is composed of the following …
I need to save the results of a fit of the sklearn NearestNeighbors model:

    knn = NearestNeighbors(n_neighbors=10)
    knn.fit(my_data)

How do you save the trained knn to disk using Python?
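A minimal sketch of the usual answer, using joblib (which ships as a scikit-learn dependency); my_data below is a random stand-in for the real data:

    import joblib
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    my_data = np.random.rand(100, 3)               # stand-in for the real data
    knn = NearestNeighbors(n_neighbors=10)
    knn.fit(my_data)

    joblib.dump(knn, "knn_model.joblib")           # save the fitted model
    knn_loaded = joblib.load("knn_model.joblib")   # restore it later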
I am using a book and a video to learn how to use the KNN method to classify movies according to their genres. This is my code:

    import numpy as np
    import pandas as pd

    r_cols = ['user_id', 'movie_id', 'rating']
    # The file is u.data from MovieLens.
    ratings = pd.read_csv('C:/Users/dell/Downloads/DataScience/DataScience-Python3/ml-100k/u.data',
                          sep='\t', engine='python', names=r_cols, usecols=range(3))
    print(ratings.head())

    # Per-movie rating count and mean rating.
    movieProperties = ratings.groupby('movie_id').agg({'rating': [np.size, np.mean]})
    print(movieProperties.head())

    # Min-max normalise the rating counts to [0, 1].
    movieNumRatings = pd.DataFrame(movieProperties['rating']['size'])
    movieNormalizedNumRatings = movieNumRatings.apply(lambda x: (x - np.min(x)) / (np.max(x) - np.min(x)))
    print(movieNormalizedNumRatings.head())

    movieDict = {}
    with open('C:/Users/dell/Downloads/DataScience/DataScience-Python3/ml-100k/u.item') as …
I have a large dataset of the activities performed by multiple staff in a factory over a long period of time (01/01/2017 to present). The activities performed by the different staff are recorded at each point in time (since they interact with software). I have tabulated these to record the number of activities performed by each operator for each day. My table looks something like this:

    Date        Name    Activity   UnitsProcessed  Shift    Team
    01/10/2017  MMouse  Soldering  1000            Shift A  Team …
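A minimal sketch of that tabulation step, assuming pandas; the column names follow the table above and the rows are made up:

    import pandas as pd

    log = pd.DataFrame({
        "Date": ["01/10/2017", "01/10/2017", "02/10/2017"],
        "Name": ["MMouse", "MMouse", "DDuck"],
        "Activity": ["Soldering", "Packing", "Soldering"],
        "UnitsProcessed": [1000, 200, 800],
    })
    # Number of activities and total units per operator per day.
    per_day = (log.groupby(["Date", "Name"])
                  .agg(activities=("Activity", "count"),
                       units=("UnitsProcessed", "sum")))
    print(per_day)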
I am trying to develop a basic book recommender system to get in touch with the field and start learning methods and how to prepare the data. The DataFrame I am using is pretty plain; it has the following structure (this is a simplified example):

       number  type    username   product  publishing_dt  price  genres
    0  34      access  kerrigan   130365   2019-12-10     16.99  fantasy, kids
    1  1       order   kerrigan   76863    2020-01-15     4.66   action, crime
    2  1       order   45michael  76863    2020-01-15     4.66   action, crime
    3  …
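As a sketch of one data-preparation step, assuming pandas: the comma-separated genres column can be expanded into multi-hot columns that a k-NN recommender can consume. The two rows mirror the example above:

    import pandas as pd

    df = pd.DataFrame({
        "product": [130365, 76863],
        "price": [16.99, 4.66],
        "genres": ["fantasy, kids", "action, crime"],
    })
    genre_dummies = df["genres"].str.get_dummies(sep=", ")  # multi-hot encoding
    X = pd.concat([df[["price"]], genre_dummies], axis=1)
    print(X)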
This is my example of a KNN model (written in R):

    library(gmodels)
    library(caret)
    library(class)

    db_class <- iris
    row_train <- sample(nrow(db_class), nrow(db_class) * 0.8)
    db_train_x <- db_class[row_train, -ncol(db_class)]
    db_train_y <- db_class[row_train, ncol(db_class)]
    db_test_x <- db_class[-row_train, -ncol(db_class)]
    db_test_y <- db_class[-row_train, ncol(db_class)]

    model_knn <- knn(db_train_x, db_test_x, db_train_y, 12)
    summary(model_knn)
    CrossTable(x = db_test_y, y = model_knn, prop.chisq = FALSE)
    confusionMatrix(data = factor(model_knn), reference = factor(db_test_y))

So this is a supervised KNN model. How can I classify a new record? I have this new record:

    new_record <- c(5.3, 3.2, 2.0, 0.2)

How can I classify it using the previous model?