RoPE (Rotary Position Embedding)

What is RoPE (Rotary Position Embedding)?

A method that encodes token position by rotating pairs of dimensions in the query and key vectors, with a rotation angle proportional to the token's position. Because the dot product of two rotated vectors depends only on the difference of their positions, attention scores become a function of relative position, which generalizes better to longer sequence lengths than absolute positional embeddings.
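
A minimal NumPy sketch of the rotation may help. The function name and shapes are illustrative; it follows the interleaved-pair formulation from the RoFormer paper, where each consecutive dimension pair is rotated by an angle that depends on the token position and the pair's frequency:

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embedding to query/key vectors.

    x:         (seq_len, dim) array; dim must be even.
    positions: (seq_len,) integer token positions.
    """
    dim = x.shape[-1]
    # One rotation frequency per dimension pair: theta_i = base^(-2i/dim).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)    # (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]     # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)

    x1, x2 = x[:, 0::2], x[:, 1::2]                     # split into pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                  # 2D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Usage: rotate queries and keys, then compute attention scores as usual.
# The dot product rope(q, m) . rope(k, n) depends only on the offset m - n.
q = np.random.randn(8, 64)
q_rot = rope(q, np.arange(8))
```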

Where did the term "RoPE (Rotary Position Embedding)" come from?

Introduced by Su et al. in the 2021 paper "RoFormer: Enhanced Transformer with Rotary Position Embedding".

How is "RoPE (Rotary Position Embedding)" used today?

It has replaced absolute positional embeddings (APE) in almost all modern LLMs and is standard in models such as Llama, PaLM, and Mistral.

Related Terms