Matrix multiplication

I have the downstream gradient for every sample (one row per $x_i$):

$$ \begin{bmatrix} 0.0062123 & -0.00360166 & -0.00479891 \\ -0.01928449 & 0.01240768 & 0.01493274 \\ -0.01751177 & 0.01140975 & 0.01469825 \\ 0.0074906 & -0.00531709 & -0.00637952 \end{bmatrix} $$

And I have my inputs (my local gradient):

$$ \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix} $$

I want to calculate the gradient with respect to the weights, and to do this I transpose the downstream gradient matrix and then do a matrix multiplication:

downstream_gradient.T @ local_gradient
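
For concreteness, here is a minimal NumPy sketch of that computation with the matrices above (the variable names are my own):

```python
import numpy as np

# Downstream gradient: one row per sample, one column per output unit
downstream_gradient = np.array([
    [ 0.0062123 , -0.00360166, -0.00479891],
    [-0.01928449,  0.01240768,  0.01493274],
    [-0.01751177,  0.01140975,  0.01469825],
    [ 0.0074906 , -0.00531709, -0.00637952],
])

# Inputs (local gradient): one row per sample, one column per input feature
local_gradient = np.array([
    [0., 0.],
    [0., 1.],
    [1., 0.],
    [1., 1.],
])

# Transpose the downstream gradient, then matrix-multiply: result has shape (3, 2)
weight_gradient = downstream_gradient.T @ local_gradient
print(weight_gradient)
```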

First question: I understand that the output is the SUM of the gradients over each $x_i$. Am I right or wrong?

Second question: Do I need to divide the matrix by len($X$) to get the mean gradient?
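
To make both questions concrete, continuing the sketch above, this is the check I have in mind:

```python
# Question 1: each sample i should contribute the outer product of its
# downstream-gradient row and its input row, and the matrix product
# should be the sum of these per-sample contributions
per_sample = [np.outer(downstream_gradient[i], local_gradient[i])
              for i in range(len(local_gradient))]
print(np.allclose(sum(per_sample), weight_gradient))  # True if my understanding holds

# Question 2: dividing by the number of samples would give the mean gradient
mean_gradient = weight_gradient / len(local_gradient)
```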

Topic: matrix, gradient-descent, machine-learning

Category: Data Science