Backpropagation in neural networks
During the backward pass, which gradients are kept and which are discarded? Why are some gradients discarded? I understand that the forward pass computes the network's output from the inputs and then the loss, and that the backward pass computes the gradient of the loss with respect to each weight.
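For concreteness, here is a minimal PyTorch sketch of the behavior I am asking about (assuming PyTorch's autograd; the tiny two-weight network is just an illustration): after `backward()`, `.grad` is populated for leaf tensors such as weights, while the gradient of an intermediate activation is used during the pass and then discarded by default.

```python
import torch

# Tiny network: h = relu(w1 * x), y = w2 * h, loss = y**2
x = torch.tensor(2.0)
w1 = torch.tensor(0.5, requires_grad=True)  # leaf tensor (a "weight")
w2 = torch.tensor(1.5, requires_grad=True)  # leaf tensor (a "weight")

h = torch.relu(w1 * x)  # intermediate activation (non-leaf)
y = w2 * h
loss = y ** 2
loss.backward()

print(w1.grad)  # populated: gradient of the loss w.r.t. a weight is kept
print(w2.grad)  # populated: kept
print(h.grad)   # None: the intermediate gradient was discarded after use
```

Calling `h.retain_grad()` before `backward()` would keep `h.grad` as well, at the cost of extra memory.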
Topic gradient backpropagation deep-learning neural-network
Category Data Science