
Gemini Reasoning Models

🧠 Thinking & Deep Think (15 min)

Configurable Reasoning in Gemini

Gemini models have supported configurable thinking since Gemini 2.5, and the 3.x models expose it through a thinking_level parameter that controls how much internal reasoning (chain-of-thought) the model performs before answering. This lets developers trade speed and cost against reasoning depth.

ThinkingConfig

The ThinkingConfig object controls reasoning behavior via the thinking_level parameter:

| Level | Behavior | Best For |
|---|---|---|
| LOW | Minimal internal reasoning, fastest response | Simple lookups, classification, quick answers |
| MEDIUM | Balanced reasoning depth | Standard coding, analysis, summarization |
| HIGH | Maximum reasoning, longest latency | Complex math, multi-step logic, research |

```python
from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-3.5-flash")

# Use LOW thinking for fast classification
fast_response = model.generate_content(
    "Classify this email as spam or not spam: 'You won a prize!'",
    generation_config={"thinking_config": {"thinking_level": "LOW"}}
)

# Use HIGH thinking for complex reasoning
deep_response = model.generate_content(
    "Prove that there are infinitely many prime numbers.",
    generation_config={"thinking_config": {"thinking_level": "HIGH"}}
)
```
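Because the level is passed as a plain string, a typo such as "HGIH" may only surface at request time. The sketch below is one way to validate the value before it is sent. The helper name `build_generation_config` and the `VALID_LEVELS` tuple are hypothetical, and the returned dict simply mirrors the shape used in the example above rather than any SDK-defined type.

```python
from typing import Dict

# Levels named in this lesson's table (assumed to be the full set).
VALID_LEVELS = ("LOW", "MEDIUM", "HIGH")


def build_generation_config(level: str) -> Dict[str, Dict[str, str]]:
    """Return a generation_config dict for the given thinking level.

    Hypothetical helper: it normalizes case and whitespace, then rejects
    any level that is not listed in VALID_LEVELS.
    """
    normalized = level.strip().upper()
    if normalized not in VALID_LEVELS:
        raise ValueError(
            f"unknown thinking_level {level!r}; expected one of {VALID_LEVELS}"
        )
    return {"thinking_config": {"thinking_level": normalized}}
```

The result can be passed as the `generation_config` argument in the calls shown above, for example `generation_config=build_generation_config("low")`.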

Deep Think

Deep Think is Gemini 3.5 Pro's advanced reasoning mode. It significantly extends the model's internal chain-of-thought for the most complex tasks, such as mathematical proofs, multi-file code refactoring, and scientific analysis. Deep Think automatically engages when thinking_level is set to HIGH on capable models.

Thought Token Metering

Thinking tokens (the model's internal reasoning) are billed as output tokens. When using HIGH thinking, the model may generate thousands of internal tokens before producing a visible answer. Monitor your usage carefully: setting thinking_level appropriately for each task is essential for cost control.
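The billing rule above can be made concrete with a little arithmetic. This is a minimal sketch: the function name `estimate_output_cost` is hypothetical, the price in the usage comment is a made-up figure rather than a published rate, and the token counts are assumed to come from whatever usage reporting your SDK provides.

```python
def estimate_output_cost(
    thinking_tokens: int,
    answer_tokens: int,
    price_per_million_output: float,
) -> float:
    """Estimate the output-side cost of one request.

    Thinking tokens are billed as output tokens, so they are added to the
    visible answer tokens before the per-million price is applied.
    """
    if thinking_tokens < 0 or answer_tokens < 0:
        raise ValueError("token counts must be non-negative")
    billed_tokens = thinking_tokens + answer_tokens
    return billed_tokens * price_per_million_output / 1_000_000


# Hypothetical numbers: 4,000 thinking tokens, 500 answer tokens, and an
# illustrative price of 10.0 per million output tokens.
cost = estimate_output_cost(4_000, 500, 10.0)
```

With these illustrative numbers, the thinking tokens account for eight ninths of the billed output, which is why the level chosen per task matters.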

💡 Cost Tip: Use LOW for the majority of requests (roughly 80%, such as simple Q&A and classification). Reserve MEDIUM for standard coding tasks. Only use HIGH/Deep Think for genuinely complex reasoning: math competitions, architectural decisions, multi-step research.
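One way to apply this tip is a small routing table that maps a task category to a thinking level. The sketch below is illustrative only: the category names, the `TASK_LEVELS` mapping, and the `pick_thinking_level` function are hypothetical, and the assignments follow the "Best For" column of the lesson's table.

```python
# Hypothetical task categories mapped to the levels suggested in this lesson.
TASK_LEVELS = {
    "classification": "LOW",
    "simple_qa": "LOW",
    "coding": "MEDIUM",
    "summarization": "MEDIUM",
    "math_proof": "HIGH",
    "research": "HIGH",
}


def pick_thinking_level(task_type: str, default: str = "LOW") -> str:
    """Return the thinking level for a task category.

    Unknown categories fall back to the cheapest level, so HIGH is only
    used for tasks that were deliberately opted in.
    """
    return TASK_LEVELS.get(task_type, default)
```

Defaulting to LOW is a design choice that favors cost control; a service where wrong answers are more expensive than latency might reasonably default to MEDIUM instead.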
SYNAPSE VERIFICATION
QUERY 1 // 3
What does the thinking_level parameter control in Gemini?
A. The model's temperature
B. How much internal chain-of-thought reasoning the model performs before answering
C. The maximum output length
D. The safety filter strictness
Gemini Reasoning Models | Thinking & Deep Think — Vertex AI Academy