A security attack where a user inputs malicious instructions to bypass a model's safety guardrails or hijack its behavior.
Adversarial Machine Learning.
Major security risk for AI agents.