Large Language Model (LLM)

What is a Large Language Model (LLM)?

A Large Language Model (LLM) is a type of artificial intelligence trained on vast amounts of text data to understand, generate, and manipulate human language. Built on the Transformer architecture, LLMs learn statistical relationships between tokens (words or word fragments) in order to predict the next token in a sequence. The "large" refers to both the massive size of their training datasets and the billions of parameters (weights) they contain, scale that gives rise to emergent capabilities like reasoning, coding, and translation.
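The core idea of next-token prediction can be illustrated with a deliberately tiny sketch. A real LLM uses a Transformer with billions of parameters; the toy model below stands in for that with simple bigram counts over a hypothetical corpus, purely to show what "predict the next token" means.

```python
# Toy illustration of next-token prediction. A real LLM learns these
# conditional probabilities with a Transformer; here we approximate
# them with bigram counts over a tiny made-up corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count how often each token follows each preceding token.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent next token after `token`, or None."""
    counts = bigrams[token]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

An LLM does the same thing at vastly greater scale: instead of counting pairs, it learns a parameterized function that conditions on the entire preceding context, not just the last token.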

Where did the term "Large Language Model (LLM)" come from?

The modern era of LLMs began with Google's introduction of the Transformer architecture in the 2017 paper "Attention Is All You Need" and scaled up with OpenAI's GPT series, starting in 2018.

How is "Large Language Model (LLM)" used today?

LLMs have revolutionized AI, powering widely used tools like ChatGPT, Claude, and GitHub Copilot, and transforming industries from software development to content creation.

Related Terms