Backward Pass: First, compute the gradients at the output layer. The error is the difference between the target and the network's prediction:

error = y - output

The gradient of the loss at the output layer scales this error by the derivative of the sigmoid activation applied to the output:

Gradient of Loss = (y - output) * sigmoid_derivative(output)

Now calculate d_W2, the gradient of the loss with respect to W2. It is the dot product of the transposed hidden-layer activations and the output-layer gradient:

d_W2 = hidden_output.T • Gradient of Loss
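To make this step concrete, here is a minimal NumPy sketch of the output-layer backward pass. The toy shapes, random data, learning rate, and the sigmoid / sigmoid_derivative helpers are illustrative assumptions, not part of the original network; only error, grad_loss (the "Gradient of Loss" term above), and d_W2 follow the formulas in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # Derivative of the sigmoid expressed in terms of its output a = sigmoid(x).
    return a * (1.0 - a)

# Toy shapes (assumption): 4 samples, 3 hidden units, 1 output unit.
rng = np.random.default_rng(0)
hidden_output = sigmoid(rng.normal(size=(4, 3)))    # activations from the hidden layer
W2 = rng.normal(size=(3, 1))                        # hidden-to-output weights
y = rng.integers(0, 2, size=(4, 1)).astype(float)   # targets

# Forward pass through the output layer.
output = sigmoid(hidden_output @ W2)

# Backward pass for the output layer, matching the formulas above.
error = y - output                                  # (y - output)
grad_loss = error * sigmoid_derivative(output)      # "Gradient of Loss"
d_W2 = hidden_output.T @ grad_loss                  # gradient w.r.t. W2, shape (3, 1)

# One weight update with d_W2 (assumed learning rate); equivalent to gradient
# descent on the squared error 0.5 * (y - output)**2.
learning_rate = 0.1
W2 += learning_rate * d_W2
```

Note that d_W2 has the same shape as W2, so it can be applied directly in the update; the transpose on hidden_output is what makes the shapes line up.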