Mixed Data Type Classification / Neighbor Algorithm

Here is a simplified, hypothetical dataframe for my problem. The real data would be low-dimensional (about 20 features); this sample contains made-up information about a few dog breeds:

Breed   Min_Weight  Max_Weight  Min_Height  Max_Height  is_friendly  grp
Husky   10          20          30          35          True         working
Poodle  8           17          15          30          False        terrier

The algorithm would receive some information about a dog and would need to identify the k closest dog breeds based on that input. It also needs to be high-performance.

Example: the algorithm receives an unknown breed with this data:

Weight  Height  is_friendly  grp
18      23      1            terrier

Returns: the k closest breeds from our sample dataframe, along with how close each one is.
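To make the request concrete, here is a rough sketch of the kind of lookup described above. It is only an illustration under assumptions, not a settled design: the names `BREEDS`, `SCALES`, `interval_gap`, `breed_distance`, and `k_closest` are hypothetical, the scale factors are made up, and the per-feature distances are averaged in a Gower-style way. A numeric value that falls inside a breed's min/max range counts as distance 0, which avoids generating data to fill in the ranges.

```python
# Hypothetical breed table mirroring the two sample rows above.
BREEDS = [
    {"breed": "Husky", "weight": (10, 20), "height": (30, 35),
     "is_friendly": True, "grp": "working"},
    {"breed": "Poodle", "weight": (8, 17), "height": (15, 30),
     "is_friendly": False, "grp": "terrier"},
]

# Assumed (made-up) scale factors that bring numeric gaps into [0, 1].
SCALES = {"weight": 20.0, "height": 35.0}


def interval_gap(value, lo, hi):
    """Distance from a point to the closed interval [lo, hi]; 0 if inside."""
    if value < lo:
        return lo - value
    if value > hi:
        return value - hi
    return 0.0


def breed_distance(query, breed):
    """Average of per-feature distances, each clipped to [0, 1]."""
    parts = []
    # Range features: scaled gap between the query value and the breed's range.
    for feature in ("weight", "height"):
        lo, hi = breed[feature]
        gap = interval_gap(query[feature], lo, hi)
        parts.append(min(1.0, gap / SCALES[feature]))
    # Boolean and categorical features: 0 on a match, 1 on a mismatch.
    for feature in ("is_friendly", "grp"):
        parts.append(0.0 if query[feature] == breed[feature] else 1.0)
    return sum(parts) / len(parts)


def k_closest(query, breeds, k):
    """Return the k (distance, breed name) pairs with the smallest distance."""
    scored = [(breed_distance(query, b), b["breed"]) for b in breeds]
    return sorted(scored)[:k]


query = {"weight": 18, "height": 23, "is_friendly": True, "grp": "terrier"}
result = k_closest(query, BREEDS, k=2)
```

With only a handful of breeds, a brute-force scan like this is cheap; whether it is fast enough for the real dataset is an open question.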

What sort of algorithm/model makes sense here, given the mix of variable types: numeric ranges (min and max height; I am guessing I will need to generate data to fill in these ranges), categorical values, and Booleans?

Also, is there a way to weight certain characteristics? For example, if we are confident in the measurement of the unknown dog's weight, it should have more influence when choosing a breed; if we are not confident about the height, its influence should be reduced. How should I approach this problem?
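As an illustration of the weighting idea, one possible sketch is a weighted mean of per-feature distances, where a larger weight means more influence. Everything here is an assumption made for the example: the function name `weighted_distance`, the weight values, and the per-feature distances (taken to be already scaled to [0, 1]) are all made up.

```python
def weighted_distance(per_feature, weights):
    """Weighted mean of per-feature distances that each lie in [0, 1]."""
    total_weight = sum(weights[name] for name in per_feature)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * d for name, d in per_feature.items())
    return weighted_sum / total_weight


# Made-up per-feature distances between one unknown dog and two breeds.
husky = {"weight": 0.0, "height": 0.2, "is_friendly": 0.0, "grp": 1.0}
poodle = {"weight": 0.05, "height": 0.0, "is_friendly": 1.0, "grp": 0.0}

# Equal weights: every feature has the same influence.
uniform = {"weight": 1.0, "height": 1.0, "is_friendly": 1.0, "grp": 1.0}

# Hypothetical confidence weights: trust weight more, height less.
confident = {"weight": 3.0, "height": 0.5, "is_friendly": 1.0, "grp": 1.0}

uniform_scores = {
    "Husky": weighted_distance(husky, uniform),
    "Poodle": weighted_distance(poodle, uniform),
}
confident_scores = {
    "Husky": weighted_distance(husky, confident),
    "Poodle": weighted_distance(poodle, confident),
}
```

With these made-up numbers, the Poodle is closer under equal weights, while the Husky becomes closer once weight is trusted more and height less, so the choice of weights can change which breed ranks first.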

