
Document Intelligence


Extracting Structure from Documents

Document Intelligence (formerly Form Recognizer) uses AI to extract text, tables, key-value pairs, and structure from PDFs, images, and scanned documents.

Pre-Built Models

| Model | Extracts | Use Case |
| --- | --- | --- |
| Read | Text and structure from any document | General OCR, digitization |
| Layout | Tables, figures, sections, paragraphs | Complex document parsing |
| Invoice | Vendor, amounts, line items, dates | Accounts payable automation |
| Receipt | Merchant, total, items, tax | Expense management |
| ID Document | Name, DOB, document number | Identity verification |
| Custom | Your defined fields | Industry-specific forms |
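
Each pre-built model is selected by passing a model ID string to the service. The lookup below is a minimal sketch, assuming the `prebuilt-*` IDs used by recent Document Intelligence API versions. The `pick_model` helper and the dictionary keys are hypothetical names chosen for illustration. Custom models are omitted because their ID is whatever you assign when you train them.

```python
# Model IDs for the pre-built models listed above.
# The keys ("read", "layout", ...) are illustrative labels, not service names.
PREBUILT_MODEL_IDS = {
    "read": "prebuilt-read",
    "layout": "prebuilt-layout",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id_document": "prebuilt-idDocument",
}


def pick_model(kind: str) -> str:
    """Return the model ID for a document kind, falling back to Layout."""
    key = kind.strip().lower()
    return PREBUILT_MODEL_IDS.get(key, PREBUILT_MODEL_IDS["layout"])
```

Falling back to Layout for unknown document kinds is a design choice for this sketch: it is the general-purpose option that still returns tables and sections.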

Integration with RAG

Document Intelligence is crucial for RAG pipelines: it converts unstructured PDFs into structured text that can be chunked, embedded, and indexed in Azure AI Search.
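
The chunking step can be sketched in a few lines. The example assumes the extracted text arrives as Markdown with `#` headings, and it splits on those headings so that no chunk crosses a section boundary. `Chunk` and `chunk_by_heading` are hypothetical names, the `max_chars` limit is an arbitrary default, and embedding and indexing are left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest heading above the text ("" if none)
    text: str     # body text belonging to that heading


def chunk_by_heading(markdown: str, max_chars: int = 1000) -> list[Chunk]:
    """Split Markdown into chunks that never cross a heading boundary."""
    chunks: list[Chunk] = []
    heading = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered lines as one chunk under the current heading.
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(Chunk(heading, body))
        buffer.clear()

    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            flush()  # close the previous section before switching headings
            heading = match.group(1).strip()
            continue
        # Start a new chunk when adding this line would exceed max_chars.
        buffered = sum(len(item) + 1 for item in buffer)
        if buffer and buffered + len(line) > max_chars:
            flush()
        buffer.append(line)
    flush()
    return chunks
```

Each `Chunk` keeps its heading, which can be stored next to the embedding as metadata so that search results show which section a passage came from.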

💡 Key Insight: For RAG systems, use the Layout model rather than the Read model. Layout preserves table structure and section hierarchy, producing much better chunks for embedding.
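
Choosing Layout comes down to the model ID in the analyze call. The sketch below builds a REST request for the Layout model without sending it. It assumes the v4.0 REST path and the `2024-11-30` API version, and it asks for Markdown output so that headings and tables survive as text. `build_layout_request` is a hypothetical helper, and the endpoint, key, and document URL are placeholders you supply.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a GA REST API version; check which versions your resource supports.
API_VERSION = "2024-11-30"


def build_layout_request(
    endpoint: str, key: str, document_url: str
) -> urllib.request.Request:
    """Build (but do not send) an analyze request for the Layout model."""
    query = urllib.parse.urlencode(
        {"api-version": API_VERSION, "outputContentFormat": "markdown"}
    )
    url = (
        f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
        f"prebuilt-layout:analyze?{query}"
    )
    # The document is referenced by URL rather than uploaded inline.
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
```

Analysis is asynchronous: the service accepts the request and returns an `Operation-Location` header that you poll for the result. Swapping `prebuilt-layout` for `prebuilt-read` in the URL would select the Read model instead.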
FOUNDRY VERIFICATION
Query 1 of 2: Why should you use the Layout model instead of Read for RAG pipelines?
A. Layout is cheaper
B. Layout preserves table structure and section hierarchy for better chunking
C. Read doesn't support PDFs
D. Layout is faster
Document Intelligence | Foundry Tools (AI Services) — Azure Foundry Academy