Retrieval-Augmented Generation
RAG grounds AI model responses in your private data, reducing hallucination and enabling domain-specific answers.
The Foundry RAG Pipeline
- Ingest — Upload documents (PDFs, Word, web pages)
- Process — Azure AI Document Intelligence extracts text and structure
- Chunk — Split into semantically meaningful segments
- Embed — Convert chunks to vectors using an embedding model
- Index — Store in Azure AI Search
- Retrieve — When a user asks a question, find relevant chunks
- Generate — Feed retrieved chunks to the LLM as grounding context (see the sketch after this list)
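To make these stages concrete, here is a minimal Python sketch of steps 3 through 7. It assumes an Azure AI Search index with `id`, `content`, and `content_vector` fields already exists (the portal wizard below, or the index snippet at the end of this page, can create one); the endpoints, keys, deployment names, and the fixed-size chunker are illustrative placeholders, not Foundry defaults.

```python
# Minimal RAG pipeline sketch: chunk -> embed -> index -> retrieve -> generate.
# All endpoints, keys, and deployment names below are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from openai import AzureOpenAI

openai_client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)
search_client = SearchClient(
    "https://<your-search>.search.windows.net",
    "docs-index",
    AzureKeyCredential("<search-key>"),
)

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    # Chunk: naive fixed-size split with overlap; production pipelines often
    # split on semantic boundaries (headings, paragraphs) instead.
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def embed(texts: list[str]) -> list[list[float]]:
    # Embed: convert chunks to vectors with a deployed embedding model.
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    return [item.embedding for item in resp.data]

# Index: store chunks and their vectors in Azure AI Search.
chunks = chunk(open("manual.txt", encoding="utf-8").read())
vectors = embed(chunks)
search_client.upload_documents([
    {"id": str(i), "content": c, "content_vector": v}
    for i, (c, v) in enumerate(zip(chunks, vectors))
])

# Retrieve: hybrid search (keyword + vector) for chunks relevant to the question.
question = "How do I reset the device to factory settings?"
results = search_client.search(
    search_text=question,
    vector_queries=[VectorizedQuery(
        vector=embed([question])[0],
        k_nearest_neighbors=3,
        fields="content_vector",
    )],
)
context = "\n\n".join(doc["content"] for doc in results)

# Generate: feed the retrieved chunks to the chat model as grounding context.
completion = openai_client.chat.completions.create(
    model="gpt-4o",  # your chat deployment name
    messages=[
        {"role": "system",
         "content": f"Answer using only the following context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(completion.choices[0].message.content)
```

In practice you would batch the uploads and reuse the question embedding, but the stage boundaries are exactly the ones the portal wizard automates for you.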
Quick Setup via Portal
The easiest way to set up RAG is through the Chat Playground in the Azure AI Foundry portal:
- Open the Chat Playground
- Click "Add your data"
- Select Azure AI Search as the data source
- Upload your documents or connect an existing index
- The system automatically chunks, embeds, and indexes your data
💡 Key Insight: The portal's "Add your data" wizard handles the entire pipeline automatically. For production, use the SDK to customize chunking strategy, embedding model, and index configuration.
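For instance, here is a hedged sketch of that customization using the azure-search-documents Python SDK to define the index schema and vector configuration yourself. The index name, field names, vector dimensions (1536 matches text-embedding-3-small), and HNSW profile are illustrative choices, not required values.

```python
# Sketch: define a custom Azure AI Search index with a vector field,
# instead of accepting the wizard's generated configuration.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchableField,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchProfile,
)

index_client = SearchIndexClient(
    "https://<your-search>.search.windows.net",  # placeholder endpoint
    AzureKeyCredential("<admin-key>"),
)

index = SearchIndex(
    name="docs-index",
    fields=[
        # Document key for each chunk.
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        # Full-text searchable chunk content (keyword side of hybrid search).
        SearchableField(name="content", type=SearchFieldDataType.String),
        # Vector field holding the chunk embedding.
        SearchField(
            name="content_vector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=1536,  # matches text-embedding-3-small
            vector_search_profile_name="default-profile",
        ),
    ],
    vector_search=VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="hnsw")],
        profiles=[VectorSearchProfile(
            name="default-profile",
            algorithm_configuration_name="hnsw",
        )],
    ),
)
index_client.create_or_update_index(index)
```

This is the same schema the pipeline sketch above uploads into; adjust chunk size and overlap, swap the embedding deployment, or tune the vector configuration as your documents demand.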