Content Filtering & Prompt Shields

🛡️ Evaluation & Safety 9 min 80 BASE XP

Automated Safety Guards

Azure AI Foundry provides multi-layered content safety powered by Azure AI Content Safety:

Content Filter Categories

Category  | What It Detects                     | Severity Levels
Hate      | Hate speech, discrimination         | Low / Medium / High
Sexual    | Explicit or suggestive content      | Low / Medium / High
Violence  | Violent content or threats          | Low / Medium / High
Self-Harm | Self-harm instructions or promotion | Low / Medium / High
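
To make the severity levels concrete, here is a minimal sketch of how numeric severity scores could be bucketed into the labels in the table. It assumes a 0 to 7 numeric scale in which each pair of values maps to one label (0-1 safe, 2-3 low, 4-5 medium, 6-7 high); that scale, the `Severity` enum, the `to_severity` helper, and the sample scores are all illustrative assumptions, not part of the lesson.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Labelled severity buckets (assumed scale: 0-7, two scores per bucket)."""
    SAFE = 0
    LOW = 2
    MEDIUM = 4
    HIGH = 6


def to_severity(score: int) -> Severity:
    """Round a 0-7 score down to the even value that names its bucket."""
    if not 0 <= score <= 7:
        raise ValueError(f"severity score out of range: {score}")
    return Severity(score - score % 2)


# Hypothetical per-category scores, made up for illustration.
scores = {"Hate": 0, "Sexual": 0, "Violence": 5, "SelfHarm": 2}
labels = {category: to_severity(score).name for category, score in scores.items()}
print(labels)
```

Keeping the mapping in one small function means the rest of an application can reason about named levels rather than raw numbers.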

Advanced Protections (Updated 2026)

  • Prompt Shields — Detects and blocks prompt injection and cross-domain jailbreak attacks before they reach the model.
  • Groundedness Detection & Correction — Identifies ungrounded responses and (new in preview) can automatically rewrite text to align with the provided source documents.
  • Protected Material — Detects copyrighted text and, with the new Code integration, flags output matching public GitHub repositories (including citation capabilities).
  • Task Adherence (Preview) — Monitors agentic workflows to identify discrepancies between the LLM's actions and the intended task (e.g., misaligned tool invocations).
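
As a rough illustration of the Prompt Shields item above, the sketch below builds a request body and interprets a response without making any network call. The field names (`userPrompt`, `documents`, `userPromptAnalysis`, `documentsAnalysis`, `attackDetected`) are recalled from the Prompt Shields REST API and should be checked against the current documentation; the helper names and the sample response are hypothetical.

```python
import json


def build_shield_request(user_prompt: str, documents: list[str]) -> str:
    """Serialize the user prompt and any attached documents as a JSON body."""
    return json.dumps({"userPrompt": user_prompt, "documents": documents})


def attack_detected(response: dict) -> bool:
    """Return True if the prompt or any document was flagged as an attack."""
    in_prompt = response.get("userPromptAnalysis", {}).get("attackDetected", False)
    in_documents = any(
        doc.get("attackDetected", False)
        for doc in response.get("documentsAnalysis", [])
    )
    return bool(in_prompt or in_documents)


# Hypothetical response: the prompt is clean, but one document carries an injection.
sample_response = {
    "userPromptAnalysis": {"attackDetected": False},
    "documentsAnalysis": [{"attackDetected": False}, {"attackDetected": True}],
}

body = build_shield_request("Summarize the attached files.", ["file one", "file two"])
if attack_detected(sample_response):
    print("Blocked before reaching the model")
```

Checking the prompt and the documents separately reflects the idea that an attack can arrive either directly from the user or indirectly through content the model is asked to read.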
🚧 Important: Content filters are applied to both inputs (prompts) and outputs (completions). You can configure different thresholds for each, or create custom filter policies per deployment.
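
The idea of separate input and output thresholds can be sketched as a small policy object. This is an illustrative model only: the `FilterPolicy` class, its field names, and the deployment name are hypothetical, and it assumes a threshold means "block content at this severity or higher".

```python
from dataclasses import dataclass

# Ordering used to compare severity labels (assumption: higher is more severe).
LEVELS = {"safe": 0, "low": 1, "medium": 2, "high": 3}


@dataclass(frozen=True)
class FilterPolicy:
    """Hypothetical per-deployment policy with separate prompt and completion thresholds."""
    name: str
    input_thresholds: dict[str, str]
    output_thresholds: dict[str, str]

    def blocks(self, direction: str, category: str, severity: str) -> bool:
        """Return True if content at this severity meets or exceeds the threshold."""
        if direction not in ("input", "output"):
            raise ValueError(f"unknown direction: {direction}")
        thresholds = self.input_thresholds if direction == "input" else self.output_thresholds
        return LEVELS[severity] >= LEVELS[thresholds[category]]


categories = ["Hate", "Sexual", "Violence", "Self-Harm"]

# Example: tolerate low-severity prompts, but hold completions to a stricter bar.
policy = FilterPolicy(
    name="support-bot-policy",  # hypothetical deployment policy name
    input_thresholds={c: "medium" for c in categories},
    output_thresholds={c: "low" for c in categories},
)

print(policy.blocks("input", "Violence", "low"))
print(policy.blocks("output", "Violence", "low"))
```

With this policy, the same low-severity violent text passes as a prompt but is blocked as a completion, which is the kind of asymmetry that separate thresholds make possible.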
FOUNDRY VERIFICATION
QUERY 1 // 2
What does the Task Adherence feature monitor?
User login times
Discrepancies between an agent's actions/tool use and its intended task
Network latency
Token consumption limits
Content Filtering & Prompt Shields | Evaluation & Safety — Azure Foundry Academy