An attention mechanism is a technique used in neural networks that allows the model to focus on specific parts of an input sequence when making predictions. Instead of treating all parts of the input equally, the model learns to assign an 'attention weight' to each part, giving more importance to the most relevant information. This is often described using a 'query-key-value' model: a 'query' (the current focus) is scored against a set of 'keys' (one per part of the input) to determine each part's relevance, the scores are normalized into weights (typically with a softmax), and the corresponding 'values' are combined as a weighted sum to produce the output.
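The query-key-value computation described above can be sketched in a few lines of NumPy. This is a minimal illustration of scaled dot-product attention (the variant used in Transformers), not production code; the function name and the toy shapes are chosen here for clarity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax -> attention weights
    return weights @ V, weights                           # weighted sum of values

# Toy example: one query attending over three input positions, dimension 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

The weights in each row are non-negative and sum to 1, so the output is a convex combination of the value vectors, with the most query-relevant positions contributing most.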
Attention mechanisms were first introduced in the context of machine translation in a 2014 paper by Bahdanau et al. to help recurrent neural networks (RNNs) better handle long sentences. However, their full potential was realized with the introduction of the Transformer architecture in the 2017 paper 'Attention Is All You Need' by Vaswani et al., which relied entirely on self-attention, a form of attention that relates different positions of a single sequence.
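Self-attention differs from the original encoder-decoder attention only in where the queries, keys, and values come from: all three are projections of the same sequence, so every position attends to every other position. A minimal sketch, assuming a single head and random (rather than learned) projection matrices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4                               # embedding dimension (illustrative)
X = rng.normal(size=(5, d))         # one sequence of 5 token embeddings

# Projection matrices; in a real model these are learned parameters.
W_q = rng.normal(size=(d, d))
W_k = rng.normal(size=(d, d))
W_v = rng.normal(size=(d, d))

# Queries, keys, and values all derive from the SAME sequence X.
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d)                       # (5, 5): every position vs every position
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
output = weights @ V                                # each position mixes information from all positions
```

The resulting 5x5 weight matrix is what "relates different positions of a single sequence": row i tells you how much position i draws on every other position when computing its new representation.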
The attention mechanism, and particularly self-attention, has become the cornerstone of modern AI, especially in natural language processing. It is the core component of the Transformer architecture, which powers virtually all state-of-the-art large language models (LLMs) like GPT, BERT, and LLaMA. The concept has also been successfully applied to other domains, including computer vision, with the development of Vision Transformers (ViTs).