How Backpropagation Works

"I struggled here for long time"

Posted by dliu on January 15, 2020

“Where there is life, there is hope”

Backpropagation Objectives

The backpropagation algorithm is commonly used to update the parameters of neural networks, which greatly helps improve model performance during training.

Popular deep learning frameworks such as PyTorch and TensorFlow already have the backpropagation algorithm built into their neural network modules.
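For instance, here is a minimal PyTorch sketch (the network shape and the random data are made up for illustration): calling `loss.backward()` runs backpropagation automatically, and the optimizer then applies the computed gradients.

```python
import torch
import torch.nn as nn

# A tiny illustrative network: 4 inputs -> 8 hidden units -> 1 output
model = nn.Sequential(nn.Linear(4, 8), nn.Sigmoid(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 4)   # dummy inputs
y = torch.randn(16, 1)   # dummy targets

loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()          # backpropagation: computes d(loss)/d(parameter)
optimizer.step()         # gradient descent update using those gradients
```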

The core mechanisms of backpropagation are the chain rule and partial derivatives.

Backpropagation aims to find which weights, biases, and activations have the relatively larger influence on minimizing the cost, and to update them accordingly.
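Once those influences (the partial derivatives) are known, each parameter is nudged against its gradient. A standard gradient descent update looks like this, where the learning rate \(\eta\) is a user-chosen value not defined in this post:

\[w \leftarrow w - \eta\,\frac{\partial C}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial C}{\partial b}\]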

Detailed Explanation

Suppose an MLP with two hidden layers:

For example (the last layer):

\[C = (a^{L} - y)^{2}\]
\[a^{L} = \sigma(z^{L})\]
\[z^{L} = w^{L}a^{L-1} + b^{L}\]

These equations relate the cost to the weights and biases of hidden layer 2, the layer feeding the output. From them we can calculate the ratio between a small change in each weight (or bias) and the resulting change in the cost. The parameters with the largest ratio have the greatest impact on the cost function and will give us 'the most bang for our buck'.
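That ratio is exactly a partial derivative, and the chain rule splits it into factors we already know from the definitions above:

\[\frac{\partial C}{\partial w^{L}} = \frac{\partial z^{L}}{\partial w^{L}}\,\frac{\partial a^{L}}{\partial z^{L}}\,\frac{\partial C}{\partial a^{L}}\]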

Ex. Suppose we want to increase the probability that the network classifies an input as the digit 2. We would like to know which neurons' weights, biases, and activations have the larger influence, so that we can adjust them to push the output toward the desired class.

Each training example asks for its own adjustments, so we will have to average the desired changes over all training examples to get the gradient actually used for the update.
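Written out, with \(n\) training examples and \(C_{k}\) denoting the cost of the \(k\)-th example:

\[\frac{\partial C}{\partial w} = \frac{1}{n}\sum_{k=0}^{n-1}\frac{\partial C_{k}}{\partial w}\]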

Math Formulas

Gradient between the output layer and the last hidden layer:
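Evaluating each factor of the chain rule from the definitions of \(C\), \(a^{L}\), and \(z^{L}\) (assuming the quadratic cost above):

\[\frac{\partial C}{\partial w^{L}} = a^{L-1}\,\sigma'(z^{L})\,2(a^{L} - y)\]
\[\frac{\partial C}{\partial b^{L}} = \sigma'(z^{L})\,2(a^{L} - y)\]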

With multiple layers and neurons:
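With several neurons per layer the only change is bookkeeping. Using a common index convention (assumed here, since the original figure is gone): \(j\) indexes neurons in layer \(L\) and \(k\) indexes neurons in layer \(L-1\). The cost sums over the output neurons, and an activation in layer \(L-1\) influences the cost through every neuron it feeds:

\[C = \sum_{j}\left(a^{L}_{j} - y_{j}\right)^{2}\]
\[\frac{\partial C}{\partial w^{L}_{jk}} = a^{L-1}_{k}\,\sigma'(z^{L}_{j})\,2\left(a^{L}_{j} - y_{j}\right)\]
\[\frac{\partial C}{\partial a^{L-1}_{k}} = \sum_{j} w^{L}_{jk}\,\sigma'(z^{L}_{j})\,2\left(a^{L}_{j} - y_{j}\right)\]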

Calculating the gradient

Formula for hidden layer 1:
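The same chain rule moves one layer back; the only new factor is \(\partial C / \partial a^{L-1}\), which was just computed above:

\[\frac{\partial C}{\partial w^{L-1}} = \frac{\partial z^{L-1}}{\partial w^{L-1}}\,\frac{\partial a^{L-1}}{\partial z^{L-1}}\,\frac{\partial C}{\partial a^{L-1}} = a^{L-2}\,\sigma'(z^{L-1})\,\frac{\partial C}{\partial a^{L-1}}\]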

With extra hidden layers:
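Each additional hidden layer repeats the same recursion: the sensitivity of the cost to a layer's activations is pushed back through that layer's weights and \(\sigma'\). For any layer \(l\):

\[\frac{\partial C}{\partial a^{l-1}_{k}} = \sum_{j} w^{l}_{jk}\,\sigma'(z^{l}_{j})\,\frac{\partial C}{\partial a^{l}_{j}}\]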

Summary
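The whole procedure can be compressed into four equations. The shorthand \(\delta^{l}\) (the "error" of layer \(l\)) is not used elsewhere in this post but is the standard textbook notation; with the quadratic cost above, \(\nabla_{a}C = 2(a^{L} - y)\), and \(\odot\) denotes elementwise multiplication:

\[\delta^{L} = \nabla_{a}C \odot \sigma'(z^{L})\]
\[\delta^{l} = \left((w^{l+1})^{T}\delta^{l+1}\right) \odot \sigma'(z^{l})\]
\[\frac{\partial C}{\partial b^{l}_{j}} = \delta^{l}_{j}\]
\[\frac{\partial C}{\partial w^{l}_{jk}} = a^{l-1}_{k}\,\delta^{l}_{j}\]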

Implementation

Multi-layer Perceptron with Backpropagation
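No code listing survived here, so below is a minimal NumPy sketch of a two-hidden-layer MLP trained with the equations derived above. It is an illustration under assumptions, not the original implementation: the class name `MLP`, the layer sizes, the learning rate, and the toy data are all made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

class MLP:
    """MLP with sigmoid activations and quadratic cost C = (a^L - y)^2."""

    def __init__(self, sizes, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix (out x in) and bias column vector per layer.
        self.weights = [rng.standard_normal((m, n))
                        for n, m in zip(sizes[:-1], sizes[1:])]
        self.biases = [rng.standard_normal((m, 1)) for m in sizes[1:]]

    def forward(self, x):
        # Store every z^l and a^l; backprop reuses them.
        a, zs, activations = x, [], [x]
        for w, b in zip(self.weights, self.biases):
            z = w @ a + b          # z^l = w^l a^{l-1} + b^l
            a = sigmoid(z)         # a^l = sigma(z^l)
            zs.append(z)
            activations.append(a)
        return zs, activations

    def backprop(self, x, y):
        zs, activations = self.forward(x)
        # Output-layer error: sigma'(z^L) * dC/da^L, with dC/da^L = 2(a^L - y).
        delta = 2.0 * (activations[-1] - y) * sigmoid_prime(zs[-1])
        grad_w = [None] * len(self.weights)
        grad_b = [None] * len(self.biases)
        grad_w[-1] = delta @ activations[-2].T
        grad_b[-1] = delta
        # Push the error back one layer at a time (the recursion above).
        for l in range(2, len(self.weights) + 1):
            delta = (self.weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
            grad_w[-l] = delta @ activations[-l - 1].T
            grad_b[-l] = delta
        return grad_w, grad_b

    def train_step(self, batch, lr=0.5):
        # Average per-example gradients, then take one gradient descent step.
        sum_w = [np.zeros_like(w) for w in self.weights]
        sum_b = [np.zeros_like(b) for b in self.biases]
        for x, y in batch:
            gw, gb = self.backprop(x, y)
            sum_w = [s + g for s, g in zip(sum_w, gw)]
            sum_b = [s + g for s, g in zip(sum_b, gb)]
        n = len(batch)
        self.weights = [w - lr * s / n for w, s in zip(self.weights, sum_w)]
        self.biases = [b - lr * s / n for b, s in zip(self.biases, sum_b)]
```

A quick sanity check on XOR (the data and hyperparameters are illustrative, and convergence depends on the seed and iteration count):

```python
net = MLP([2, 4, 4, 1], seed=1)
data = [(np.array([[a], [b]], dtype=float), np.array([[float(a ^ b)]]))
        for a in (0, 1) for b in (0, 1)]
for _ in range(5000):
    net.train_step(data, lr=1.0)
print([round(net.forward(x)[1][-1].item(), 2) for x, _ in data])
```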

Resources:

Backpropagation ideas and formulas

Intuitive explanation