Predicting the amount of new nodes discovered if a known node is extended in a graph

Question

Alec Petridis

December 13, 2021, 00:11

I'm working on a problem involving Discord servers (roughly, group chats within a social media platform). My program crawls them recursively: it joins a server, looks for invites posted there, and then joins the servers those invites point to. As the number of discovered servers has grown, the number of new servers found by searching already-known servers has dropped.

If I pick a server to search at random, each invite I find in it has a fairly low probability of pointing to a server that hasn't been encountered before.
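To make "probability that an invite is new" concrete, here is a minimal sketch of how that hit rate could be measured for one searched server. It assumes servers are identified by string IDs; `novelty_rate` is a hypothetical helper for illustration, not part of the actual crawler.

```python
def novelty_rate(invites, known):
    """Fraction of distinct invite targets that are not yet known.

    invites: iterable of server IDs found as invites in one searched server.
    known:   set of server IDs already discovered before this search.
    """
    targets = set(invites)          # count each target server once
    if not targets:
        return 0.0                  # no invites found, so nothing new
    return len(targets - known) / len(targets)


# Example: three distinct targets, one of which is new.
rate = novelty_rate(["x", "y", "z", "x"], {"x", "y"})
```

This is the quantity I would like to predict before searching a server, rather than measure afterwards.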

My first approach was to prioritize servers with the fewest incoming connections from other servers. This didn't solve the problem: the program still discovered few new servers per search.
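For reference, the heuristic described above can be sketched roughly as follows. This is an illustration under assumptions, not the real implementation: the invite graph is taken to be a list of `(source, target)` pairs of string IDs, and `search_order` is a hypothetical name.

```python
from collections import defaultdict


def search_order(edges, searched):
    """Order discovered-but-unsearched servers by ascending in-degree.

    edges:    iterable of (source_server, target_server) invite links seen so far.
    searched: set of server IDs that have already been searched.
    """
    in_degree = defaultdict(int)
    known = set()
    for src, dst in edges:
        known.add(src)
        known.add(dst)
        in_degree[dst] += 1         # one more server links to dst

    candidates = [s for s in known if s not in searched]
    # Fewest incoming links first; ties broken by ID for a stable order.
    return sorted(candidates, key=lambda s: (in_degree[s], s))


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
order = search_order(edges, searched={"a"})
```

The ordering only uses how often a server is linked to, which says nothing about how many of its own invites lead outside the already-known set.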

I think this is because of the community-like structure of the servers. The Louvain algorithm can identify communities, but it gives no way to tell what proportion of the servers in a community is already known. Is it possible to estimate what proportion of the invites in an unvisited server will lead to new servers?

(neo4j 4.4.0)

Topic graphs neo4j

Category Data Science
