The Vertex AI RAG Engine was upgraded in 2026 to support Serverless RAG Mode, which removes the need to manually provision and manage a vector database such as Pinecone or Vertex AI Vector Search.
The new AsyncRetrieveContexts API lets a single generative agent query several distinct corpora concurrently. For example, an agent can retrieve technical specs from a codebase corpus and pricing data from a PDF corpus in a single operation.
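
The fan-out pattern behind a multi-corpus query can be sketched in plain Python. This is only an illustration of the idea, not the AsyncRetrieveContexts API itself: `retrieve_from_corpus`, `retrieve_contexts`, the `Context` type, and the in-memory `FAKE_CORPORA` data are all hypothetical stand-ins, and the only real library call is `asyncio.gather`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Context:
    corpus: str
    text: str
    score: float


# Hypothetical in-memory stand-in for two corpora.
# A real agent would call the retrieval service here instead.
FAKE_CORPORA = {
    "codebase": [Context("codebase", "Connection pool size defaults to 10.", 0.91)],
    "pricing-pdfs": [Context("pricing-pdfs", "Pro tier: $49 per seat per month.", 0.87)],
}


async def retrieve_from_corpus(corpus: str, query: str) -> list[Context]:
    """Hypothetical per-corpus retrieval; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return FAKE_CORPORA.get(corpus, [])


async def retrieve_contexts(corpora: list[str], query: str, top_k: int = 5) -> list[Context]:
    # Fan out one request per corpus and wait for all of them together.
    per_corpus = await asyncio.gather(
        *(retrieve_from_corpus(corpus, query) for corpus in corpora)
    )
    # Flatten the per-corpus results and rank them as a single list.
    merged = [ctx for results in per_corpus for ctx in results]
    merged.sort(key=lambda ctx: ctx.score, reverse=True)
    return merged[:top_k]


results = asyncio.run(
    retrieve_contexts(["codebase", "pricing-pdfs"], "pool size and price")
)
for ctx in results:
    print(f"[{ctx.corpus}] {ctx.score:.2f} {ctx.text}")
```

The sketch merges by raw score for simplicity. Scores from different corpora may not be directly comparable, so a real agent might instead keep the results grouped by corpus or rerank them.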
You can now enforce a strict schema on document metadata, which lets agents narrow a vector search with SQL-like filter conditions that are applied before the semantic search runs.
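
A rough sketch of the validate-then-pre-filter flow, assuming a simple in-memory store. Everything here is hypothetical and chosen for illustration: the `SCHEMA` fields, the `(field, operator, value)` tuples used in place of a SQL-like filter string, and the `validate`, `matches`, and `search` helpers. None of it reflects the RAG Engine's actual filter syntax.

```python
import math
import operator

# Hypothetical metadata schema: field name -> required Python type.
SCHEMA = {"product": str, "year": int}

# Comparison operators allowed in a filter condition.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def validate(metadata: dict) -> None:
    """Reject metadata with missing, extra, or wrongly typed fields."""
    if set(metadata) != set(SCHEMA):
        raise ValueError(f"fields {sorted(metadata)} do not match schema {sorted(SCHEMA)}")
    for field, expected in SCHEMA.items():
        if not isinstance(metadata[field], expected):
            raise TypeError(f"{field} must be {expected.__name__}")


def matches(metadata: dict, conditions: list[tuple]) -> bool:
    """True when the metadata satisfies every (field, op, value) condition."""
    return all(OPS[op](metadata[field], value) for field, op, value in conditions)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search(docs: list[dict], query_vec: list[float], conditions: list[tuple], top_k: int = 3) -> list[dict]:
    # Step 1: metadata filter, applied before any similarity is computed.
    candidates = [d for d in docs if matches(d["metadata"], conditions)]
    # Step 2: semantic ranking over the surviving candidates only.
    candidates.sort(key=lambda d: cosine(d["vector"], query_vec), reverse=True)
    return candidates[:top_k]


docs = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"product": "widget", "year": 2024}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"product": "widget", "year": 2026}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"product": "gadget", "year": 2026}},
]
for doc in docs:
    validate(doc["metadata"])  # schema is enforced at ingestion time

# Roughly: WHERE product = 'widget' AND year >= 2025
hits = search(docs, [1.0, 0.0], [("product", "=", "widget"), ("year", ">=", 2025)])
print([d["id"] for d in hits])
```

Because the schema is checked when documents are ingested, the filter step can compare fields without guarding against missing keys or mismatched types.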