A token is the fundamental unit of text processing for Large Language Models (LLMs). Rather than operating on whole words or individual characters, models break text into tokens, which can represent words, sub-words, or punctuation marks. For English text, 1,000 tokens correspond to roughly 750 words.
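The 1,000-tokens-to-750-words rule of thumb can be turned into a quick back-of-the-envelope estimator. This is a minimal sketch, not a real tokenizer; the function name is illustrative, and actual token counts vary by model and tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, using the heuristic
    that ~750 words correspond to ~1,000 tokens (about 1.33 tokens
    per word). Real tokenizers (e.g. BPE-based) will differ."""
    words = len(text.split())
    return round(words * 1000 / 750)

# 10 words -> roughly 13 tokens under this heuristic
print(estimate_tokens("Large Language Models process text as tokens rather than words."))
```

For exact counts you would use the model's own tokenizer, since sub-word splitting depends on the vocabulary each model was trained with.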
The term derives from lexical analysis in computer science and linguistics, where a tokenizer splits a stream of characters into meaningful units.
Tokens serve as the universal currency for measuring LLM context windows, computational cost, and API pricing.
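Because billing is token-based, API cost is typically a simple linear function of input and output token counts. The sketch below assumes hypothetical per-1,000-token prices purely for illustration; the function and parameter names are not from any real provider's SDK:

```python
def api_cost(prompt_tokens: int, completion_tokens: int,
             in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Estimate an API bill, assuming the common scheme where input
    (prompt) and output (completion) tokens are priced separately,
    per 1,000 tokens. Prices here are caller-supplied assumptions."""
    return (prompt_tokens / 1000) * in_price_per_1k \
         + (completion_tokens / 1000) * out_price_per_1k

# Hypothetical prices, for illustration only.
cost = api_cost(1200, 300, in_price_per_1k=0.0005, out_price_per_1k=0.0015)
print(f"${cost:.4f}")
```

The same token counts also determine whether a request fits in a model's context window, which is why tokens, not characters or words, are the unit that matters in practice.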