
3. Gradient Descent

1. Rolling Downhill

We cannot see the whole mountain (the loss surface). We can only feel the slope under our feet. Two quantities govern each move:

  • Slope: the gradient, the derivative of the loss with respect to each weight.
  • Step size: the learning rate, which scales how far we move (see the sketch after this list).
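To make the metaphor concrete, here is a minimal sketch of rolling downhill on a one-dimensional mountain, $f(w) = w^2$. The function, starting point, and learning rate are all illustrative choices, not part of the lesson:

```python
# A 1-D "mountain": f(w) = w**2, whose slope (derivative) is 2*w.
def slope(w):
    return 2 * w

w = 5.0                # start somewhere on the mountainside
learning_rate = 0.1    # how far each step moves us

for _ in range(25):
    w -= learning_rate * slope(w)   # feel the slope, step downhill

print(w)  # about 0.02, nearly at the bottom of the valley
```

Each pass shrinks $w$ by a factor of $1 - 0.1 \times 2 = 0.8$, so the walker settles into the valley at $w = 0$.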

2. The Update Rule

We move against the gradient to go down:

$$ w_{\text{new}} = w_{\text{old}} - \eta \cdot \nabla_w L $$

where $\eta$ is the learning rate and $\nabla_w L$ is the gradient of the loss $L$ with respect to the weight $w$.
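As a quick numerical check, take some made-up values: a weight $w_{\text{old}} = 2.0$, a gradient of $4.0$, and a learning rate $\eta = 0.1$:

$$ w_{\text{new}} = 2.0 - 0.1 \times 4.0 = 1.6 $$

The gradient is positive (the loss rises as $w$ rises), so the update pushes $w$ down.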

```python
# The essence of learning, in one line: step the weights against the gradient.
weights -= learning_rate * weights.grad
```
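The line above assumes a framework has already filled in `weights.grad`. Here is a minimal end-to-end sketch of one full step, assuming PyTorch; the weight value, toy loss, and learning rate are illustrative, not from the lesson:

```python
import torch

w = torch.tensor([2.0], requires_grad=True)  # a single trainable weight
learning_rate = 0.1

loss = (w ** 2).sum()   # toy loss; its gradient at w = 2.0 is 4.0
loss.backward()         # autograd populates w.grad with dloss/dw

with torch.no_grad():   # keep the update out of the autograd graph
    w -= learning_rate * w.grad
w.grad.zero_()          # clear the gradient before the next step

print(w)  # tensor([1.6000], requires_grad=True), matching the arithmetic above
```

Note that the update runs under `torch.no_grad()`: we want to change the weight, not record the change as something to differentiate through.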

