Backpropagation: Computing Gradients Using the Chain Rule

You move backwards from the last layer, applying the chain rule of calculus to compute gradients: the partial derivative of the loss function with respect to each weight and bias in the network.
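A minimal sketch of this process for a single sigmoid neuron with a squared-error loss (all variable names and numeric values below are illustrative, not from the original):

```python
import math

# Tiny one-neuron network: y_hat = sigmoid(w*x + b), loss = (y_hat - y)^2
# Illustrative input, target, and parameters (hypothetical values)
x, y = 1.5, 0.0
w, b = 0.8, 0.2

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Forward pass: compute the prediction and the loss
z = w * x + b
y_hat = sigmoid(z)
loss = (y_hat - y) ** 2

# Backward pass: move from the loss back toward the parameters,
# multiplying local derivatives together (the chain rule)
dloss_dyhat = 2.0 * (y_hat - y)          # dL/dy_hat
dyhat_dz = y_hat * (1.0 - y_hat)         # sigmoid'(z) expressed via y_hat
dz_dw, dz_db = x, 1.0                    # dz/dw and dz/db

grad_w = dloss_dyhat * dyhat_dz * dz_dw  # dL/dw
grad_b = dloss_dyhat * dyhat_dz * dz_db  # dL/db
```

The same pattern repeats layer by layer in a deeper network: each layer multiplies the gradient flowing in from above by its own local derivatives and passes the result backwards. A numerical finite-difference check on `grad_w` is a common way to verify such a derivation.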