TensorLearn
Back to Course
NLP Specialist: BERT & Beyond
Module 3 of 8

3. The Math of Attention

1. The Query, Key, Value

$$ Attention(Q, K, V) = softmax( rac{QK^T}{sqrt{d_k}})V $$

  • Query: What am I looking for?
  • Key: What do I have?
  • Value: What is the information?

Mark as Completed

TensorLearn - AI Engineering for Professionals