Neural Networks: From Scratch
Module 8 of 12

8. Stochastic Gradient Descent

1. Full Batch vs Mini-Batch

  • Full Batch: stable but slow. Every update computes the gradient over the entire dataset for a single step.
  • Mini-Batch (SGD): noisy but fast. Every update uses a small random subset, commonly 32 or 64 samples; the gradient noise can help the optimizer escape shallow local minima (see the sketch after this list).
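
A minimal NumPy sketch of the contrast, using a toy one-weight linear regression; the data, learning rate, and batch size here are illustrative assumptions, not values from the lesson:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + noise (illustrative only)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

w = 0.0          # single weight to learn
lr = 0.1         # assumed learning rate
batch_size = 32  # a typical mini-batch size, as noted above

def grad(w, Xb, yb):
    # Gradient of the MSE loss 0.5 * mean((w*x - y)^2) with respect to w
    return np.mean((w * Xb[:, 0] - yb) * Xb[:, 0])

# Full batch: one stable step that touches all 1000 samples.
w_full = w - lr * grad(w, X, y)

# Mini-batch (SGD): one noisy but cheap step that touches only 32 samples.
idx = rng.choice(len(X), size=batch_size, replace=False)
w_sgd = w - lr * grad(w, X[idx], y[idx])

print(f"full-batch step: w={w_full:.4f}  mini-batch step: w={w_sgd:.4f}")
```

Repeating the mini-batch step over many random batches is what a full SGD training loop does; each step here is roughly 30x cheaper than a full-batch step, which is where the speed advantage comes from.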

2. Iterations

One epoch is one complete pass over the entire dataset; one iteration is one gradient update on one batch. If you have 1000 items and the batch size is 100, one epoch = 10 iterations.
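
To make the bookkeeping concrete, here is a sketch of a standard epoch/iteration loop; the shuffle-then-slice pattern and the epoch count of 3 are common conventions assumed here, not part of the lesson:

```python
import numpy as np

n_samples, batch_size = 1000, 100
rng = np.random.default_rng(0)
indices = np.arange(n_samples)

for epoch in range(3):          # arbitrary number of epochs
    rng.shuffle(indices)        # reshuffle each epoch (common practice)
    iterations = 0
    for start in range(0, n_samples, batch_size):
        batch = indices[start:start + batch_size]
        # a real loop would compute the gradient on `batch` and update weights here
        iterations += 1
    print(f"epoch {epoch + 1}: {iterations} iterations")  # prints 10 each epoch
```

Reshuffling at the start of each epoch keeps the batches from repeating in the same order, which preserves the helpful randomness of SGD across epochs.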
