Approaches on grouping/clustering network device data

Question

Tahaga

May 12, 2022, 15:25

I come from more of a computer science background and have recently been trying to solve a data-centered problem. I would like to experiment with different data science methods on my dataset, but first I want to decide which ones are the most promising and, most importantly, why (and why others are not suited to this case). Basically, the more you tell me about your thought process, the better; I'm trying to learn from it!

Let's say I monitor a network and capture different communications. Every communication is between a point A and a point B. For every communication, my dataset stores properties of both A and B. The catch is that, across different communications, the same point can have different n-tuples as values for those observed properties. For example, if we track properties x and y, we can have one exchange where point A has (1, 12) and another where it has (1, 15). The real values are, for example, addresses or identifiers.
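To make the shape of the data concrete, here is a minimal sketch of what I mean. The class names `Observation` and `Communication`, the field names, and the values on the B side are all hypothetical and only there for illustration; only (1, 12) and (1, 15) come from the example above.

```python
from collections import defaultdict
from typing import NamedTuple


class Observation(NamedTuple):
    """Properties observed for one endpoint in one communication (hypothetical names)."""
    x: int  # e.g. an address-like property
    y: int  # e.g. an identifier-like property


class Communication(NamedTuple):
    """One captured exchange between two endpoints."""
    a: Observation
    b: Observation


# Two exchanges in which the same real point A shows up with different tuples.
# The values for side b are made up for illustration.
communications = [
    Communication(a=Observation(1, 12), b=Observation(7, 30)),
    Communication(a=Observation(1, 15), b=Observation(8, 30)),
]

# Every distinct tuple seen on either side of any communication.
observed = {obs for c in communications for obs in (c.a, c.b)}

# Index the y values seen together with each x value.
ys_by_x = defaultdict(set)
for obs in observed:
    ys_by_x[obs.x].add(obs.y)
```

Here `ys_by_x[1]` holds both 12 and 15, which is exactly the situation I want to resolve: two tuples that probably belong to the same real point.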

What kinds of approaches would make sense for grouping those n-tuples into real points? That is, if I have (1, 12) and (1, 15) among other pairs in my dataset, how can I put those two into one group that corresponds to point A? Which methods wouldn't work, and why? What are the drawbacks of the ones you think would fit?
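For reference, the most naive baseline I can think of is to treat two tuples as the same point whenever they share a value in the same position, and to merge transitively with a union-find structure. This is only a sketch under that assumption, not something I believe is correct for real data; the function name `group_tuples` and the sample values other than (1, 12) and (1, 15) are hypothetical.

```python
from collections import defaultdict


def group_tuples(tuples):
    """Group tuples that share a value in the same position (transitively)."""
    parent = {t: t for t in tuples}

    def find(t):
        # Follow parent links to the root, halving the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    def union(s, t):
        root_s, root_t = find(s), find(t)
        if root_s != root_t:
            parent[root_t] = root_s

    # Remember the first tuple seen for each (position, value) pair and
    # merge every later tuple carrying the same pair into its group.
    first_seen = {}
    for t in tuples:
        for position, value in enumerate(t):
            key = (position, value)
            if key in first_seen:
                union(first_seen[key], t)
            else:
                first_seen[key] = t

    groups = defaultdict(set)
    for t in tuples:
        groups[find(t)].add(t)
    return list(groups.values())


# Made-up sample: only (1, 12) and (1, 15) come from the question.
observed = [(1, 12), (1, 15), (2, 40), (3, 40), (9, 99)]
groups = group_tuples(observed)
```

With this sample, (1, 12) and (1, 15) end up in one group because they share x = 1. The drawback I already suspect is that a single value shared by many real points (a common address, say) would chain unrelated tuples into one big group, which is part of why I am asking about statistical or clustering alternatives.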

I have thought of a statistical/Bayesian approach, of fuzzy logic, and of some sort of clustering or even ML. However, for each of them I can't really convince myself that it is suitable. Any input or thoughts would be appreciated, mostly so I can learn the right way of thinking :) Thank you!

Topic data-analysis data data-mining

Category Data Science
