What mechanism allows the Transformer model to weigh the importance of different words in a sequence?
Answer options
A. Encoding Mechanism
B. Recurrent Mechanism
C. None of the given options
D. Decoding Mechanism
E. Self-Attention Mechanism
Correct answer: Self-Attention Mechanism
Explanation
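Self-attention lets each position in a sequence score its relevance against every other position, normalizes those scores with a softmax, and uses the resulting weights to combine the sequence's representations. That weighting is how the Transformer decides how much each word matters to every other word. The Transformer has no recurrence, so "Recurrent Mechanism" does not apply, and its encoder and decoder are stacks of layers that themselves rely on attention rather than being weighting mechanisms in their own right.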
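
To make the idea of weighing words concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under stated assumptions, not a full Transformer layer: the function names `softmax` and `self_attention` and the toy `embeddings` are hypothetical, and the queries, keys, and values are taken to be the input vectors themselves, whereas a real model applies learned linear projections first.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys, and values are the input vectors themselves;
    a real Transformer layer would apply learned projections first.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this position against every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for a three-word sequence.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of `weights` sums to 1 and says how much one word attends to every word in the sequence. In this toy input the first word places more weight on the second word, whose vector is similar, than on the third, whose vector is orthogonal.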