Choosing Your Engine
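The Gemini 3.1 family introduces a Mixture-of-Experts (MoE) architecture, in which each token is routed to a small subset of expert sub-networks instead of passing through the full model. Because only a fraction of the parameters is active per token, inference is more efficient for a given model size.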
- Gemini 3.5 Flash (NEW, May 2026): Released at Google I/O 2026 on May 19. The fastest model in the family, optimized for agentic throughput and coding. It offers a 1M-token context window, up to 65,536 output tokens, and dynamic thinking that adjusts compute to problem complexity. It is natively multimodal (text, images, audio, video, code). Pricing: approximately $1.50/$9.00 (input/output) per million tokens (MTok).
- Gemini 3.1 Pro: The heavy lifter. Optimized for complex reasoning, agentic workflows, and massive document analysis.
- Gemini 3.1 Flash Image: Specialized for creating and analyzing visual assets at scale.
- Gemini 3.1 Flash-Lite: The most cost-efficient model in the family, optimized for high-volume, low-latency use cases where cost per token is critical.
🔮 Coming Soon: Gemini 3.5 Pro is expected in June 2026, bringing the next generation of deep reasoning capabilities to the Gemini family.
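To make the per-MTok pricing above concrete, here is a minimal cost-estimation sketch. It assumes the quoted ~$1.50/$9.00 figures are the input and output rates for Gemini 3.5 Flash; the `Pricing` class, the `estimate_cost_usd` helper, and the example token counts are hypothetical names and values chosen for illustration, not part of any Google SDK.

```python
from dataclasses import dataclass

TOKENS_PER_MTOK = 1_000_000  # "MTok" means one million tokens


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in USD (hypothetical helper type)."""
    input_usd_per_mtok: float
    output_usd_per_mtok: float


# Approximate rates quoted in this article for Gemini 3.5 Flash.
# Assumption: the first figure is the input rate, the second the output rate.
GEMINI_3_5_FLASH = Pricing(input_usd_per_mtok=1.50, output_usd_per_mtok=9.00)


def estimate_cost_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * pricing.input_usd_per_mtok
        + output_tokens * pricing.output_usd_per_mtok
    )
    return total / TOKENS_PER_MTOK


# Example: a 200,000-token document summarized into 4,000 output tokens.
cost = estimate_cost_usd(GEMINI_3_5_FLASH, input_tokens=200_000, output_tokens=4_000)
print(f"${cost:.3f}")  # prints "$0.336"
```

Because the output rate is six times the input rate in this example, long generations can dominate the bill even when prompts are large, so it is worth estimating both sides before picking a model for a high-volume workload.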