Instructions
AI systems can be manipulated through several attack vectors:
- Prompt injection — when an attacker crafts inputs that override or subvert an AI system’s intended instructions.
- Model poisoning — when malicious data is inserted into a model’s training set to alter its behavior.
- Data tampering — when attackers modify data used by the AI system, causing incorrect outputs or decisions.
Safeguards include input validation, sandboxing, dataset integrity checks, model monitoring, cryptographic signing, and human‑in‑the‑loop review.





