How to compute the inner product between two networks' parameters
Consider a neural network $f(x) = w_2^T \sigma(w_1^T x)$, where $\sigma(\cdot)$ is an activation function such as ReLU, and $w_1 \in R^{d \times k}$, $w_2 \in R^{k \times o}$ are two weight matrices (so $x \in R^d$ and $f(x) \in R^o$). I would like to compute the inner product between two initializations of the model's parameters, $\theta = (w_1, w_2)$ and $\theta' = (w'_1, w'_2)$. Should we stack all entries of the network's parameters into a single vector, i.e. $\theta$ and $\theta'$ each become one big vector with $d \times k + k \times o$ entries, and then just compute the inner product between the two vectors? Is that a good way to compute the inner product between two models' parameters to measure how similar they are?
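For concreteness, here is a minimal NumPy sketch of the flatten-and-dot approach I have in mind (the sizes `d`, `k`, `o` and the random initializations are just hypothetical examples); it also computes cosine similarity, since the raw inner product depends on the parameter scale:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, o = 8, 16, 4  # example dimensions (assumed, not fixed by the question)

# Two independent initializations theta = (w1, w2) and theta' = (w1p, w2p)
w1, w2 = rng.standard_normal((d, k)), rng.standard_normal((k, o))
w1p, w2p = rng.standard_normal((d, k)), rng.standard_normal((k, o))

def flatten(*mats):
    """Stack all parameter matrices into one vector of length d*k + k*o."""
    return np.concatenate([m.ravel() for m in mats])

theta = flatten(w1, w2)
theta_p = flatten(w1p, w2p)

inner = theta @ theta_p  # plain Euclidean inner product of the two flattened vectors
cosine = inner / (np.linalg.norm(theta) * np.linalg.norm(theta_p))  # scale-invariant version

print(f"inner product: {inner:.4f}, cosine similarity: {cosine:.4f}")
```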
Topic data-product neural-network
Category Data Science