
12. Transformers from Scratch

1. Key, Query, Value

We implement scaled dot-product self-attention using only NumPy. The core operation is:

python
# The heart of modern AI: scaled dot-product attention
attention = softmax((Q @ K.T) / sqrt(d_k)) @ V
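Below is a minimal, runnable sketch of that formula as a single attention head in NumPy. The shapes and names (seq_len, d_model, W_q, W_k, W_v, the softmax helper) are illustrative assumptions for this lesson, not a fixed API.

python
# A minimal single-head self-attention sketch in pure NumPy.
# Variable names and sizes are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (seq_len, d_model) input embeddings.
    Q = X @ W_q                          # (seq_len, d_k) queries
    K = X @ W_k                          # (seq_len, d_k) keys
    V = X @ W_v                          # (seq_len, d_v) values
    d_k = Q.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d_k)    # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # (seq_len, d_v) weighted sum of values

# Example: 4 tokens, model width 8, head width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8)

Each row of the output is a weighted average of the value vectors, with weights given by how strongly that token's query matches every key.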

2. Positional Encoding

Transformers have no built-in sense of token order, so we inject position information by adding sine and cosine waves of varying frequencies to the input embeddings.
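As a sketch, the standard sinusoidal scheme from "Attention Is All You Need" uses PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). The helper name positional_encoding and the sizes below are illustrative assumptions.

python
# A minimal sketch of sinusoidal positional encoding in NumPy.
import numpy as np

def positional_encoding(seq_len, d_model):
    # pos: token position, i: embedding dimension index.
    pos = np.arange(seq_len)[:, None]       # (seq_len, 1)
    i = np.arange(d_model)[None, :]         # (1, d_model)
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])    # even dimensions: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])    # odd dimensions: cosine
    return pe

# Add the encoding to the token embeddings so the model can see order.
embeddings = np.zeros((4, 8))               # placeholder (seq_len=4, d_model=8)
inputs = embeddings + positional_encoding(4, 8)
print(inputs.shape)  # (4, 8)

Because each dimension oscillates at a different frequency, every position gets a distinct pattern, and nearby positions get similar ones.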

