In the node2vec model derivation, what does it mean for node representations to be "Symmetric in Feature Space"?

The main derivation of the probabilistic model in Node2Vec goes as follows (paper available on ArXiv: https://arxiv.org/pdf/1607.00653.pdf):

We formulate feature learning in networks as a maximum likelihood optimization problem. Let $G=(V, E)$ be a given network. Our analysis is general and applies to any (un)directed, (un)weighted network. Let $f: V \rightarrow \mathbb{R}^{d}$ be the mapping function from nodes to feature representations we aim to learn for a downstream prediction task. Here $d$ is a parameter specifying the number of dimensions of our feature representation. Equivalently, $f$ is a matrix of size $|V| \times d$ parameters. For every source node $u \in V$, we define $N_{S}(u) \subset V$ as a network neighborhood of node $u$ generated through a neighborhood sampling strategy $S$.
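(Not from the paper, just to fix notation for myself: a minimal numpy sketch of what $f$ is, with made-up toy sizes and a hard-coded neighborhood standing in for the sampling strategy $S$.)

```python
import numpy as np

# f is just a |V| x d parameter matrix (an embedding lookup table);
# the sizes here are arbitrary toy values for illustration.
num_nodes, d = 5, 3                      # |V| = 5 nodes, d = 3 feature dimensions
rng = np.random.default_rng(0)
f = rng.normal(size=(num_nodes, d))      # f[u] is the feature vector of node u

# N_S(u): a neighborhood of node u produced by some sampling strategy S
# (in node2vec, biased random walks); hard-coded here as a stand-in.
N_S = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
```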

We proceed by extending the Skip-gram architecture to networks [21, 24]. We seek to optimize the following objective function, which maximizes the log-probability of observing a network neighborhood $N_{S}(u)$ for a node $u$ conditioned on its feature representation, given by $f$ : $$ \max _{f} \sum_{u \in V} \log \operatorname{Pr}\left(N_{S}(u) \mid f(u)\right) $$ In order to make the optimization problem tractable, we make two standard assumptions:

  • Conditional independence. We factorize the likelihood by assuming that the likelihood of observing a neighborhood node is independent of observing any other neighborhood node given the feature representation of the source: $$ \operatorname{Pr}\left(N_{S}(u) \mid f(u)\right)=\prod_{n_{i} \in N_{S}(u)} \operatorname{Pr}\left(n_{i} \mid f(u)\right) $$
  • Symmetry in feature space. A source node and neighborhood node have a symmetric effect over each other in feature space. Accordingly, we model the conditional likelihood of every source-neighborhood node pair as a softmax unit parametrized by a dot product of their features: $$ \operatorname{Pr}\left(n_{i} \mid f(u)\right)=\frac{\exp \left(f\left(n_{i}\right) \cdot f(u)\right)}{\sum_{v \in V} \exp (f(v) \cdot f(u))} $$
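(Again not from the paper: putting the two assumptions together, the objective above becomes a sum of per-pair log-softmax terms. A minimal sketch, reusing the toy `f` and `N_S` from the snippet above, of how I understand the resulting quantity:)

```python
import numpy as np

def log_prob_pair(f, n_i, u):
    """log Pr(n_i | f(u)): softmax over all nodes v of the score f(v) . f(u)."""
    scores = f @ f[u]                          # dot products f(v) . f(u) for every v in V
    log_Z = np.log(np.exp(scores).sum())       # log of the partition function Z_u
    return f[n_i] @ f[u] - log_Z

def objective(f, N_S):
    """Skip-gram objective: sum_u log Pr(N_S(u) | f(u)) under both assumptions."""
    total = 0.0
    for u, neighbors in N_S.items():
        # conditional independence: the neighborhood likelihood factorizes per node
        total += sum(log_prob_pair(f, n_i, u) for n_i in neighbors)
    return total
```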

I'm a bit confused by the meaning of symmetry in feature space. I initially understood it as meaning that the probability for node $u$ to appear in the neighbourhood of node $n_i$ should be the same as the probability for node $n_i$ to appear in the neighbourhood of $u$, but I now realize that is not the case: the softmax normalization term $\sum_{v \in V} \exp (f(v) \cdot f(u))$ depends on the conditioning node, so the two conditional probabilities are generally not equal even though the dot product itself is symmetric.
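To make that concrete, here is a small numeric check (with an arbitrary random embedding matrix, same toy setup as above): the score $f(n_i) \cdot f(u)$ is symmetric, but the partition functions $Z_u$ and $Z_{n_i}$ differ, so the two conditionals come out different.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=(5, 3))   # any embedding matrix works for this check

def cond_prob(f, target, source):
    """Pr(target | f(source)) under the softmax model quoted above."""
    scores = f @ f[source]
    return np.exp(f[target] @ f[source]) / np.exp(scores).sum()

u, n_i = 0, 1
print(cond_prob(f, n_i, u))   # Pr(n_i | f(u))
print(cond_prob(f, u, n_i))   # Pr(u | f(n_i)) -- generally a different number,
                              # because the normalizers Z_u and Z_{n_i} differ
```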

Can someone provide some insight?

