Prompt Injection

What is Prompt Injection?

A security attack where a user inputs malicious instructions to bypass a model's safety guardrails or hijack its behavior.

Where did the term "Prompt Injection" come from?

Adversarial Machine Learning.

How is "Prompt Injection" used today?

Major security risk for AI agents.

Related Terms