Can I use low-dimensional node features in graph convolutional networks?

I am trying to understand how GCNs work.

For example, the well-known GraphSAGE algorithm considers a graph $G$ with node features $x_i$ of dimension $n$. It then propagates the node features over the graph by message passing. In a basic implementation of GraphSAGE, message passing first averages the neighbour features and passes the result through a linear transformation $W_2$; the original node features, transformed by another linear map $W_1$, are then added:

$$ x_i' = W_1x_i + W_2\cdot \text{mean}_{j\in\mathcal{N}(i)}x_j. $$

The matrices $W_1$ and $W_2$ are of shape $m \times n$, where $m$ is the output dimensionality.
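
To make the shapes concrete, here is a minimal numpy sketch of this update rule. The class name `SAGEMeanLayer`, the toy path graph, and the adjacency-list input format are my own illustration choices, not anything from the GraphSAGE paper:

```python
import numpy as np

class SAGEMeanLayer:
    """One GraphSAGE-style layer: x_i' = W1 @ x_i + W2 @ mean_{j in N(i)} x_j."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Both maps go from R^n to R^m, so each weight matrix has shape (m, n).
        self.W1 = rng.normal(scale=0.1, size=(out_dim, in_dim))
        self.W2 = rng.normal(scale=0.1, size=(out_dim, in_dim))

    def forward(self, x, neighbors):
        # x: (num_nodes, in_dim) feature matrix; neighbors[i]: indices of N(i).
        out = np.empty((x.shape[0], self.W1.shape[0]))
        for i, nbrs in enumerate(neighbors):
            agg = x[nbrs].mean(axis=0) if nbrs else np.zeros(x.shape[1])
            out[i] = self.W1 @ x[i] + self.W2 @ agg
        return out


# Toy path graph 0 - 1 - 2 with 3-dimensional node features.
x = np.arange(9, dtype=float).reshape(3, 3)
layer = SAGEMeanLayer(in_dim=3, out_dim=2)
print(layer.forward(x, neighbors=[[1], [0, 2], [1]]).shape)  # (3, 2)
```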

So here's my confusion. The learnable parameters here are the entries of the matrices $W_1$ and $W_2$. This means that if the node features are high-dimensional, the matrices are bigger, with more parameters to learn. However, in the original paper the authors say that the algorithm also works with node degrees (scalars!) as feature vectors!

I had previously thought it would make sense to have $m < n$, so that dimensionality reduction can take place, which I assumed was important for learning. However, when $n = m = 1$, the matrices $W_1$ and $W_2$ are just scalars: there are only two parameters to learn and no dimensionality reduction at all.
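
To see this concretely, the scalar case can be plugged into the hypothetical `SAGEMeanLayer` above: with $n = m = 1$ each weight matrix is a single scalar (two parameters in total), while choosing $m > 1$ makes each an $m \times 1$ column that lifts the scalar degree into $\mathbb{R}^m$:

```python
# Degree features for the path graph 0 - 1 - 2: n = 1, one scalar per node.
deg = np.array([[1.0], [2.0], [1.0]])
neighbors = [[1], [0, 2], [1]]

scalar_layer = SAGEMeanLayer(in_dim=1, out_dim=1)   # W1, W2 are 1x1: two parameters total
lifting_layer = SAGEMeanLayer(in_dim=1, out_dim=4)  # W1, W2 are 4x1: eight parameters

print(scalar_layer.forward(deg, neighbors).shape)   # (3, 1)
print(lifting_layer.forward(deg, neighbors).shape)  # (3, 4)
```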

  1. Could someone clarify what I am misunderstanding?
  2. Why does the algorithm work with $n$ small?
  3. Is there a scenario where one would use multiple layers (say two), where the first increases the dimensionality from $n$ to $n'$ and the second reduces it to $m$ (see the sketch after this list)? Doesn't this create information from nothing?
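
For reference, the two-layer configuration described in question 3 would look like the following sketch, continuing the hypothetical `SAGEMeanLayer`, `deg`, and `neighbors` from above (the dimensions $n' = 16$ and $m = 4$ are arbitrary choices of mine):

```python
layer1 = SAGEMeanLayer(in_dim=1, out_dim=16)   # expand: n = 1 -> n' = 16
layer2 = SAGEMeanLayer(in_dim=16, out_dim=4)   # reduce: n' = 16 -> m = 4

h = np.maximum(layer1.forward(deg, neighbors), 0.0)  # nonlinearity (ReLU) between layers
out = layer2.forward(h, neighbors)
print(out.shape)  # (3, 4)
```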

Topic graph-neural-network cnn convolutional-neural-network
