4. The Computational Graph
1. The Chain Rule: Visualized
How does a weight in the first layer affect the loss at the output? The chain rule answers this: we propagate the error backwards through the network, layer by layer.
$$ \frac{dL}{dw} = \frac{dL}{dy} \cdot \frac{dy}{dw} $$
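As a quick sanity check, here is a minimal numeric sketch of this rule for $y = x \cdot w$ with a squared-error loss; the values and variable names are illustrative, not part of the course code:

```python
# Numeric check of the chain rule (illustrative values).
# Model: y = x * w, loss L = (y - t)**2.
x, w, t = 3.0, 2.0, 10.0

y = x * w              # forward pass: y = 6.0
L = (y - t) ** 2       # loss: 16.0

dL_dy = 2 * (y - t)    # local derivative of the loss: -8.0
dy_dw = x              # local derivative of the product: 3.0
dL_dw = dL_dy * dy_dw  # chain rule: -24.0

# Compare with a finite-difference estimate of dL/dw.
eps = 1e-6
L_nudged = (x * (w + eps) - t) ** 2
print(dL_dw, (L_nudged - L) / eps)  # both ~ -24.0
```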
The Graph
Think of the calculation as a tree. We walk from the root (Loss) back down to the leaves (Weights).
      (Loss)
        |
      (Pred)
      /    \
   (w2)    (h)
           /  \
        (w1)  (x)
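The tree does not label its operations, so assume it encodes $h = w_1 \cdot x$ and $\mathrm{Pred} = w_2 \cdot h$. Walking from the root to $w_1$ then chains three local derivatives:

$$ \frac{dL}{dw_1} = \frac{dL}{d\mathrm{Pred}} \cdot \frac{d\mathrm{Pred}}{dh} \cdot \frac{dh}{dw_1} = \frac{dL}{d\mathrm{Pred}} \cdot w_2 \cdot x $$

Each factor is local to one node of the tree, which is what makes the walk mechanical.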
2. Autograd
We will build a graph engine that records every operation during the forward pass and replays them in reverse to compute these derivatives automatically. For a single multiplication node, the rules are (see the sketch after this list):
- Forward: $y = x \cdot w$
- Backward: $x.grad += y.grad \cdot w$ and, symmetrically, $w.grad += y.grad \cdot x$
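To make this concrete, here is a minimal sketch of such an engine, handling multiplication only. The names (`Value`, `grad`, `_backward`) are illustrative choices, not an API fixed by the course:

```python
# A minimal autograd sketch for multiplication only (illustrative names).
class Value:
    def __init__(self, data, inputs=()):
        self.data = data
        self.grad = 0.0
        self._inputs = inputs          # Values this node was computed from
        self._backward = lambda: None  # how to pass our gradient to inputs

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # y = x * w  =>  x.grad += y.grad * w and w.grad += y.grad * x
            self.grad += out.grad * other.data
            other.grad += out.grad * self.data
        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so each node is processed before its inputs,
        # then walk from the root (this node) back to the leaves.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for inp in node._inputs:
                    visit(inp)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

# Usage: y = x * w, then ask for dy/dx and dy/dw.
x, w = Value(3.0), Value(2.0)
y = x * w
y.backward()
print(x.grad, w.grad)  # 2.0 3.0
```

The `+=` in the backward rules matters: when a value feeds into several operations, each path through the graph contributes to its gradient.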