How can I tell how much information I lose when I simplify a graph data structure, relative to the unsimplified graph?

Question


Daniel Wiczew

April 9, 2022, 11:01

I have the following problem. I have some data (which I can't publish here, but it consists of points with XYZ coordinates) that I can represent as a collection of graphs, $Q = \{G_1, G_2, \ldots, G_t\}$. Every node has an associated set of features, e.g. node $u_i$ has feature vector $\mathcal{F}_i$, and the features change between graphs while the graph structure does not. With this approach the resulting graphs are large, so I decided to make them smaller by removing some of the nodes and edges. I would like to calculate how much information I lose when I simplify the graphs, relative to the unsimplified graphs or the original data. Ideally I would get a statement like "This graph explains 77% of the variance in the data" for the full graphs and "This graph explains 55% of the variance in the data" for the truncated ones.

The question is then the following: how can I tell how much information I lose when I simplify the graph data structure?

Edit: The feature vectors can also be replaced with weighted edges. I think that could make the problem a bit simpler to solve.

Topic pca graphs

Category Data Science

Gaurav Koradiya · Accepted Answer · July 3, 2020, 05:04

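How would you simplify a graph? This matters because ultimately you will have to compare graphs. One thing you can do is measure the density of the graph, i.e. the ratio of the number of edges present to the maximum number possible, before and after simplification. This is only a rough, high-level idea, and you may want to read more about it online. In graph-theoretic terms, density can play a role loosely analogous to variance. https://www.quora.com/What-is-graph-density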
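A minimal sketch of the density comparison, assuming an undirected simple graph stored as a list of node pairs. The `density` helper, the toy edge list, and the choice of `kept_nodes` are all made up for illustration and are not from the question.

```python
def density(num_nodes, edges):
    """Density of an undirected simple graph: 2|E| / (|V| * (|V| - 1))."""
    if num_nodes < 2:
        return 0.0
    # Store each edge as a frozenset so (u, v) and (v, u) count once;
    # self-loops are ignored.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    return 2 * len(unique_edges) / (num_nodes * (num_nodes - 1))


# Toy example (hypothetical data): a 5-node graph and a truncated 4-node version.
original_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (1, 4)]
kept_nodes = {0, 1, 2, 3}

# Keep only edges whose endpoints both survive the truncation.
simplified_edges = [(u, v) for u, v in original_edges
                    if u in kept_nodes and v in kept_nodes]

d_original = density(5, original_edges)
d_simplified = density(len(kept_nodes), simplified_edges)
print(f"original density:   {d_original:.3f}")
print(f"simplified density: {d_simplified:.3f}")
```

Note that density summarizes edge structure only. It says nothing about the node features, so on its own it cannot yield a "percent of variance explained" figure.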

Brian Spiering · Accepted Answer · June 2, 2020, 20:45


Graph comparisons can be tricky.

One option is to take an information-theoretic approach, such as the one described in "An information-theoretic, all-scales approach to comparing networks".
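The sketch below does not reproduce the method of that paper. As a much simpler stand-in in the same information-theoretic spirit, it computes the Jensen-Shannon divergence between the degree distributions of the original and the simplified graph. The function names and the toy edge lists are assumptions made for illustration.

```python
import math
from collections import Counter


def degree_distribution(nodes, edges):
    """Return {degree: probability} for an undirected graph."""
    nodes = list(nodes)
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Counter returns 0 for nodes that appear in no edge.
    counts = Counter(degree[n] for n in nodes)
    return {k: c / len(nodes) for k, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with disjoint support."""
    total = 0.0
    for k in set(p) | set(q):
        pk, qk = p.get(k, 0.0), q.get(k, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution
        if pk > 0:
            total += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            total += 0.5 * qk * math.log2(qk / mk)
    return total


# Toy example (hypothetical data): drop node 4 and its incident edges.
original_nodes = range(5)
original_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (1, 4)]
kept_nodes = {0, 1, 2, 3}
simplified_edges = [(u, v) for u, v in original_edges
                    if u in kept_nodes and v in kept_nodes]

p_original = degree_distribution(original_nodes, original_edges)
p_simplified = degree_distribution(kept_nodes, simplified_edges)
jsd = js_divergence(p_original, p_simplified)
print(f"JS divergence between degree distributions: {jsd:.3f} bits")
```

A value near 0 suggests the simplification left the degree distribution largely intact, while a value near 1 suggests it changed substantially. Like density, this looks at structure only and ignores the node features.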
