2. How the backpropagation algorithm works
Michael Nielsen, Jan 2016
The backpropagation algorithm computes the gradient of the cost function far faster than the naive approach of estimating each partial derivative separately (for example, by perturbing one weight at a time, which needs a forward pass per weight). Backpropagation computes the partial derivative of the cost function with respect to every weight and bias in the network in a single backward pass. These expressions also reveal how quickly the cost changes when a weight or bias changes, which in turn determines how the overall behavior of the network changes during learning. To compute the partial derivatives of the cost with respect to the weights and biases, we first compute an intermediate quantity, the error of each neuron, to which those partial derivatives are directly related.
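Concretely, Nielsen defines the error of neuron j in layer l as the rate of change of the cost with respect to that neuron's weighted input:

```latex
\delta^l_j \equiv \frac{\partial C}{\partial z^l_j},
\qquad
z^l_j = \sum_k w^l_{jk} \, a^{l-1}_k + b^l_j
```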
The backpropagation algorithm is built from four basic equations that compute the error and, from it, the gradient of the cost function:
- An equation for the error in the output layer (BP1).
- An equation for the error in layer l in terms of the error in the next layer l+1 (BP2).
- An equation for the rate of change of the cost with respect to any bias in the network (BP3).
- An equation for the rate of change of the cost with respect to any weight in the network (BP4).
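In Nielsen's notation, with sigma the activation function, delta^l the error vector in layer l, and ⊙ the elementwise (Hadamard) product, the four equations are:

```latex
\begin{align}
\delta^L &= \nabla_a C \odot \sigma'(z^L) && \text{(BP1: error in the output layer)}\\
\delta^l &= \bigl((w^{l+1})^T \delta^{l+1}\bigr) \odot \sigma'(z^l) && \text{(BP2: error in terms of the next layer)}\\
\frac{\partial C}{\partial b^l_j} &= \delta^l_j && \text{(BP3: cost w.r.t. any bias)}\\
\frac{\partial C}{\partial w^l_{jk}} &= a^{l-1}_k \, \delta^l_j && \text{(BP4: cost w.r.t. any weight)}
\end{align}
```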
# If the output neuron is saturated (activation close to 0 or 1), learning is slow, since sigma'(z) is small there.
# Weights feeding into a saturated neuron learn slowly.
A weight will learn slowly if either its input neuron has low activation or its output neuron is saturated, i.e. has very high or very low activation; both follow from BP4, where the gradient is the product a_in * delta_out.
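A quick numerical check of the saturation point, assuming sigmoid activations as in Nielsen's chapter: sigma'(z) multiplies the error at each layer, and it collapses as |z| grows.

```python
# Illustration (assumes sigmoid activations): sigma'(z) multiplies the
# error delta at each layer, and it collapses once the neuron saturates.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

for z in [0.0, 2.0, 5.0, 10.0]:
    print(f"z = {z:5.1f}   sigma'(z) = {sigmoid_prime(z):.6f}")
# sigma'(0) = 0.25, while sigma'(10) is about 0.000045: roughly 5500x
# smaller, so a saturated output neuron makes its weights learn very slowly.
```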
Steps:
- Input: set the activation a^1 for the input layer to the training input x.
- Feedforward: for each subsequent layer, compute the weighted input z^l = w^l a^(l-1) + b^l and the activation a^l = sigma(z^l).
- Output error: compute delta^L in the output layer (BP1).
- Backpropagate the error: compute delta^l for each earlier layer from delta^(l+1) (BP2).
- Output: read off the gradient of the cost function via BP3 and BP4 (a code sketch follows this list).
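A minimal NumPy sketch of these steps for a fully connected network. The sigmoid activation, quadratic cost, and the 2-3-1 layer sizes in the usage example at the end are assumptions for illustration, not part of the notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, weights, biases):
    """Return (nabla_b, nabla_w), the gradient of the quadratic cost
    C = 0.5 * ||a^L - y||^2 for a single training example (x, y)."""
    # 1. Input: set the activation for the input layer.
    activation = x
    activations = [x]   # activations a^l, layer by layer
    zs = []             # weighted inputs z^l = w a + b, layer by layer

    # 2. Feedforward: compute z^l and a^l for each layer.
    for w, b in zip(weights, biases):
        z = w @ activation + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)

    # 3. Output error (BP1): delta^L = (a^L - y) * sigma'(z^L),
    #    using nabla_a C = a^L - y for the quadratic cost.
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    nabla_b[-1] = delta                       # BP3
    nabla_w[-1] = delta @ activations[-2].T   # BP4

    # 4. Backpropagate the error (BP2) through the earlier layers.
    for l in range(2, len(weights) + 1):
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        nabla_b[-l] = delta                          # BP3
        nabla_w[-l] = delta @ activations[-l - 1].T  # BP4

    # 5. Output: the gradient of the cost function.
    return nabla_b, nabla_w

# Usage on a tiny 2-3-1 network with random parameters (sizes assumed).
rng = np.random.default_rng(0)
sizes = [2, 3, 1]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal((m, 1)) for m in sizes[1:]]
x = rng.standard_normal((2, 1))
y = np.array([[1.0]])
nabla_b, nabla_w = backprop(x, y, weights, biases)
```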