Glossary · Technical concept
Prompt Injection
A class of attack in which input processed by a model contains instructions that override or manipulate the model's intended behaviour. It takes two forms: direct injection, where the user types adversarial instructions, and indirect injection, where instructions are hidden in content the model retrieves and processes, such as web pages, documents, or emails. Mitigations include input filtering, instruction hierarchy enforcement, and output validation.
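The three mitigations named above can be sketched in a few lines. This is a minimal illustration, not a complete defence: the pattern list, the `Message` type, the `"data"` role, and the `ALLOWED_ACTIONS` set are all hypothetical names invented for this example, and keyword filters of this kind are easy to evade, so they are best treated as one layer among several.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical patterns for illustration; real attacks vary far more widely.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?(system|developer) (prompt|message)", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]

# Hypothetical allow-list of actions the application accepts from the model.
ALLOWED_ACTIONS = {"summarize", "answer", "refuse"}


def looks_like_injection(text: str) -> bool:
    """Input filtering: flag text that matches a known adversarial phrase."""
    return any(pattern.search(text) for pattern in SUSPICIOUS_PATTERNS)


@dataclass
class Message:
    role: str      # "system", "user", or "data" (assumed lowest-trust role)
    content: str


def build_messages(system_prompt: str, user_input: str,
                   retrieved_docs: List[str]) -> List[Message]:
    """Instruction hierarchy: keep trusted instructions and untrusted
    retrieved content in separate, explicitly labelled messages."""
    messages = [Message("system", system_prompt), Message("user", user_input)]
    for doc in retrieved_docs:
        if looks_like_injection(doc):
            continue  # drop retrieved content that trips the filter
        messages.append(Message("data", doc))
    return messages


def validate_output(action: str) -> bool:
    """Output validation: accept only actions on the allow-list."""
    return action in ALLOWED_ACTIONS
```

For example, `build_messages("Be helpful.", "Summarise this page.", docs)` would pass along only the retrieved documents that do not trip the filter, and `validate_output("send_email")` would reject an action the application never intended to permit. Labelling retrieved text with a separate role does not by itself stop a model from following it; the sketch only shows where that separation would sit in application code.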
Framework references
- OWASP Top 10 for LLM Applications (LLM01: Prompt Injection)
- NIST AI 600-1 (Generative AI Profile of the AI Risk Management Framework)