Top-k sampling is a text decoding strategy that restricts the model's choices to the k most probable next tokens at each step. By cutting off the long tail of low-probability tokens, it prevents the model from choosing irrelevant or nonsensical ones, improving the coherence of generated text.
An early and still-standard technique in Natural Language Processing for controlling sampling randomness.
Widely available in almost all LLM APIs and libraries (like Hugging Face Transformers) as a basic parameter.
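The procedure above can be sketched in a few lines of NumPy: mask all but the k largest logits, renormalize with a softmax, and sample from what survives. The function name `top_k_sample` is illustrative, not from any particular library.

```python
import numpy as np

def top_k_sample(logits, k, rng=None):
    """Sample one token id from the k highest-logit candidates.

    Illustrative sketch of top-k sampling; `top_k_sample` is not a
    real library API.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    # Find the indices of the k largest logits.
    top_k_idx = np.argpartition(logits, -k)[-k:]
    # Mask everything outside the top k with -inf so softmax zeroes it out.
    masked = np.full_like(logits, -np.inf)
    masked[top_k_idx] = logits[top_k_idx]
    # Softmax over the surviving logits (max-subtraction for stability).
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    # Sample a token index according to the truncated distribution.
    return int(rng.choice(len(logits), p=probs))
```

With `k=1` this degenerates to greedy decoding (always the argmax); larger k trades determinism for diversity while still excluding the low-probability tail.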