TensorLearn
Neural Networks: From Scratch
Module 11 of 12

11. Deep Learning Cheatsheet

Derivatives

  • Power Rule: $\frac{d}{dx} x^n = n x^{n-1}$
  • Chain Rule: $\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$
  • ReLU: $\frac{d}{dx}\,\mathrm{ReLU}(x) = 1$ if $x > 0$ else $0$
  • Sigmoid: $\frac{d}{dx}\,\sigma(x) = \sigma(x)\,(1 - \sigma(x))$
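These derivatives can be checked numerically with a central-difference approximation. A minimal sketch (function names are illustrative, not from the course):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # d/dx ReLU(x) = 1 if x > 0 else 0
    return 1.0 if x > 0 else 0.0

def numeric_grad(f, x, eps=1e-6):
    # Central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)
```

For example, `numeric_grad(lambda x: x**3, 2.0)` should land close to `3 * 2**2 = 12` by the power rule, and `numeric_grad(sigmoid, 0.5)` should match `sigmoid_grad(0.5)`.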

Shapes

  • Input (Batch): $(N, D_{in})$
  • Weights: $(D_{in}, D_{out})$
  • Bias: $(D_{out},)$
  • Output: $(N, D_{out})$
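The shapes above compose in a single linear-layer forward pass; the bias broadcasts over the batch dimension. A minimal NumPy sketch (the concrete dimensions are example values):

```python
import numpy as np

N, D_in, D_out = 4, 3, 2          # batch size, input dim, output dim (example values)

x = np.random.randn(N, D_in)      # input batch: (N, D_in)
W = np.random.randn(D_in, D_out)  # weights:     (D_in, D_out)
b = np.zeros(D_out)               # bias:        (D_out,)

y = x @ W + b                     # output:      (N, D_out); b broadcasts over rows
assert y.shape == (N, D_out)
```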

Update Rule

$$ W \leftarrow W - \alpha \frac{\partial L}{\partial W} $$
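The update rule is one line of code: subtract the gradient scaled by the learning rate $\alpha$. A sketch of plain gradient descent, no momentum (names are illustrative):

```python
import numpy as np

def sgd_step(W, grad_W, alpha=0.01):
    # W <- W - alpha * dL/dW
    return W - alpha * grad_W
```

For instance, with loss $L = \frac{1}{2}\lVert W \rVert^2$ the gradient is $W$ itself, so each step shrinks the weights by a factor of $(1 - \alpha)$.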

Autograd Algo

  1. Topological Sort of the Graph.
  2. Visit nodes in reverse topological order, calling each node's backward() to propagate its gradient to its parents.
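The two steps above can be sketched as a minimal scalar autograd engine (a micrograd-style sketch; class and attribute names are illustrative, not from the course):

```python
class Value:
    """A scalar node in the computation graph, tracking data, grad, and parents."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # propagates out.grad to parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # 1. Topological sort of the graph.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        # 2. Propagate gradients in reverse topological order.
        self.grad = 1.0
        for v in reversed(order):
            v._backward()
```

Usage: for `c = a * b + a` with `a = 2, b = 3`, calling `c.backward()` gives `a.grad = b + 1 = 4` and `b.grad = a = 2` by the chain rule.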


TensorLearn - AI Engineering for Professionals