# Automated Safety Guards
Azure AI Foundry provides multi-layered content safety powered by Azure AI Content Safety:
## Content Filter Categories
| Category | What It Detects | Severity Levels |
| --- | --- | --- |
| Hate | Hate speech, discrimination | Low / Medium / High |
| Sexual | Explicit or suggestive content | Low / Medium / High |
| Violence | Violent content or threats | Low / Medium / High |
| Self-Harm | Self-harm instructions or promotion | Low / Medium / High |
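To make the severity levels concrete, here is a minimal sketch (plain Python, not the Azure SDK) of how a filter policy might map per-category severity thresholds to allow/block decisions. The category names, the implicit `"safe"` level below Low, and the `policy` shape are all illustrative assumptions, not the service's schema.

```python
# Illustrative sketch only -- not the Azure Content Safety API.
# Assumes an implicit "safe" level below the Low/Medium/High levels in the table.
SEVERITIES = ["safe", "low", "medium", "high"]

def is_blocked(detected: str, threshold: str) -> bool:
    """Block when the detected severity meets or exceeds the configured threshold."""
    return SEVERITIES.index(detected) >= SEVERITIES.index(threshold)

# Hypothetical policy: strictest on hate and self-harm, looser elsewhere.
policy = {"hate": "low", "sexual": "medium", "violence": "medium", "self_harm": "low"}

# Hypothetical analysis result for one piece of content.
result = {"hate": "safe", "sexual": "low", "violence": "high", "self_harm": "safe"}

blocked = [cat for cat, sev in result.items() if is_blocked(sev, policy[cat])]
print(blocked)  # only "violence" meets or exceeds its threshold
```

Lowering a category's threshold makes that filter stricter: with `"sexual": "low"`, the `"low"` detection above would also be blocked.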
## Advanced Protections (Updated 2026)
- Prompt Shields — Detects and blocks prompt injection and cross-domain jailbreak attacks before they reach the model.
- Groundedness Detection & Correction — Identifies ungrounded responses and (new in preview) can automatically rewrite text to align with the provided source documents.
- Protected Material — Detects copyrighted text and, with the new Code integration, flags output matching public GitHub repositories (including citation capabilities).
- Task Adherence (Preview) — Monitors agentic workflows to identify discrepancies between the LLM's actions and the intended task (e.g., misaligned tool invocations).
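The groundedness idea above can be illustrated with a toy heuristic: flag response sentences whose content words rarely appear in the supplied source documents. This is a deliberately naive sketch for intuition only; the actual service uses dedicated models, not lexical overlap, and the function name and 0.5 cutoff are my own assumptions.

```python
# Toy heuristic illustrating the *idea* of groundedness detection.
# Not the Azure Groundedness Detection API.
import re

def ungrounded_sentences(response: str, sources: list[str]) -> list[str]:
    """Return response sentences with little lexical overlap with the sources."""
    source_words = set(re.findall(r"\w+", " ".join(sources).lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = set(re.findall(r"\w+", sentence.lower()))
        # Flag as ungrounded if fewer than half of the sentence's distinct
        # words occur anywhere in the source documents (arbitrary cutoff).
        if words and len(words & source_words) / len(words) < 0.5:
            flagged.append(sentence)
    return flagged

sources = ["The refund policy allows returns within 30 days of purchase."]
resp = "Returns are allowed within 30 days. Shipping is always free worldwide."
print(ungrounded_sentences(resp, sources))  # flags the unsupported second sentence
```

The correction capability mentioned above goes one step further: instead of merely flagging the second sentence, it would rewrite it (or remove the unsupported claim) so the response aligns with the source documents.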
🚧 Important: Content filters are applied to both inputs (prompts) and outputs (completions). You can configure different thresholds for each, or create custom filter policies per deployment.
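Since thresholds can differ between inputs and outputs, a per-deployment policy might look something like the following sketch. The field names and deployment name are hypothetical, shown only to illustrate the shape of such a configuration, not the Azure API schema.

```python
# Hypothetical per-deployment filter policy with separate thresholds for
# inputs (prompts) and outputs (completions). Field names are illustrative.
policy = {
    "deployment": "gpt-4o-prod",  # assumed deployment name
    "input":  {"hate": "low", "sexual": "medium", "violence": "medium", "self_harm": "low"},
    "output": {"hate": "low", "sexual": "low",    "violence": "medium", "self_harm": "low"},
}

def threshold_for(direction: str, category: str) -> str:
    """Look up the configured threshold for one direction and category."""
    return policy[direction][category]

# This example is stricter on completions than on prompts for one category.
print(threshold_for("input", "sexual"), threshold_for("output", "sexual"))
```

Keeping the two directions separate lets you, for example, tolerate borderline user prompts while holding the model's own output to a stricter standard.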