Tiktoken is a fast open-source Byte Pair Encoding (BPE) tokenizer developed by OpenAI. It is designed to be highly efficient, handling the tokenization of text for models like GPT-3.5, GPT-4, and their successors. It ensures that text is broken down into the exact token sequences the model was trained on.
Released by OpenAI as a Python and Rust library.
The standard tool for developers to estimate token usage and costs before sending requests to the OpenAI API.