Which AI models does the calculator support?

CostForge supports the latest April 2026 API pricing for Anthropic Claude (Opus 4.7, Sonnet 4.6, Haiku 3.5), OpenAI (GPT-4o, o3, o4-mini), Google Gemini (2.0 Flash, 2.0 Pro), DeepSeek (R2, V3), Meta Llama 4 Maverick, and Mistral Large 2.

How is the monthly API cost calculated?

The calculator prices your estimated daily input and output tokens at each provider's published per-million-token rates, then multiplies the daily total by 30 to get the monthly cost.
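As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `api_cost` and its parameters are hypothetical, not CostForge's actual code. The example uses the GPT-4.1 Nano rates listed on this page ($0.1 input, $0.4 output per 1M tokens). The 2,000 input / 500 output token split per request is an assumption, chosen because it reproduces the figures shown for 1,000 requests per day.

```python
def api_cost(input_tokens, output_tokens, requests_per_day,
             input_price_per_m, output_price_per_m, days=30):
    """Return (per_request, daily, monthly) cost in USD.

    Prices are expressed in USD per 1 million tokens.
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    daily = per_request * requests_per_day
    monthly = daily * days
    return per_request, daily, monthly


# Assumed workload: 2,000 input + 500 output tokens, 1,000 requests/day.
per_request, daily, monthly = api_cost(2_000, 500, 1_000, 0.10, 0.40)
print(f"${per_request:.4f}/request, ${daily:.2f}/day, ${monthly:.2f}/month")
```

Under these assumptions the result matches the $0.0004 per request, $0.400 per day, and $12.00 per month listed for GPT-4.1 Nano.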

Are prompt caching costs included?

Yes. For models that support prompt caching, such as Anthropic Claude, we factor in the significantly reduced cost of cached prompt tokens when calculating your total API spend.

// FREE TOOL · COMPARE AI API COSTS

CostForge

AI API Cost Calculator

Compare Claude vs GPT vs Gemini API costs side-by-side. Calculate daily and monthly spend, find the cheapest model for your use case. Free, no sign-up.

Use Case Presets

Input Tokens per Request

Range: 100 to 100K

Output Tokens per Request

Range: 50 to 32K

Requests per Day

Range: 1 to 100K


💰 Cheapest: GPT-4.1 Nano, $12.00/mo

🔥 Most Expensive: $1,200/mo

📊 Potential Savings: 99% by picking the right model

📈 Total Tokens/Day: 2.5M across 1,000 requests
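The savings figure appears to be the relative gap between the cheapest and the most expensive monthly cost. A small sketch of that calculation follows. The function name is hypothetical, and the formula is an assumption that happens to match the numbers shown.

```python
def potential_savings(cheapest_monthly, most_expensive_monthly):
    """Fraction of spend saved by choosing the cheapest model
    instead of the most expensive one (assumed formula)."""
    return 1 - cheapest_monthly / most_expensive_monthly


savings = potential_savings(12.00, 1_200.00)
print(f"{savings:.0%}")
```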

| Model | Provider | Context | Input $/1M | Output $/1M | Per Request | Daily Cost | Monthly Cost |
|---|---|---|---|---|---|---|---|
| GPT-4.1 Nano (BEST) | OpenAI | 1M | $0.1 | $0.4 | $0.0004 | $0.400 | $12.00 |
| Gemini 2.5 Flash-Lite | Google | 1M | $0.1 | $0.4 | $0.0004 | $0.400 | $12.00 |
| DeepSeek V4-Flash | DeepSeek | 1M | $0.14 | $0.28 | $0.0004 | $0.420 | $12.60 |
| Mistral Small 4 | Mistral | 128K | $0.15 | $0.6 | $0.0006 | $0.600 | $18.00 |
| DeepSeek V4-Pro | DeepSeek | 1M | $0.435 | $0.87 | $0.0013 | $1.30 | $39.15 |
| GPT-4.1 Mini | OpenAI | 1M | $0.4 | $1.6 | $0.0016 | $1.60 | $48.00 |
| Mistral Large 3 | Mistral | 128K | $0.5 | $1.5 | $0.0018 | $1.75 | $52.50 |
| Gemini 2.5 Flash | Google | 1M | $0.3 | $2.5 | $0.0019 | $1.85 | $55.50 |
| Grok 4.3 | xAI | 131K | $1.25 | $2.5 | $0.0037 | $3.75 | $113 |
| Claude Haiku 4.5 | Anthropic | 200K | $1 | $5 | $0.0045 | $4.50 | $135 |
| Mistral Medium 3.5 | Mistral | 256K | $1.5 | $7.5 | $0.0067 | $6.75 | $203 |
| Gemini 2.5 Pro | Google | 1M | $1.25 | $10 | $0.0075 | $7.50 | $225 |
| GPT-4.1 | OpenAI | 1M | $2 | $8 | $0.0080 | $8.00 | $240 |
| GPT-4o | OpenAI | 128K | $2.5 | $10 | $0.010 | $10.00 | $300 |
| Claude Sonnet 4.6 | Anthropic | 200K | $3 | $15 | $0.013 | $13.50 | $405 |
| GPT-5.5 | OpenAI | 200K | $5 | $20 | $0.020 | $20.00 | $600 |
| Claude Opus 4.7 | Anthropic | 200K | $5 | $25 | $0.022 | $22.50 | $675 |
| (name missing) | OpenAI | 200K | $10 | $40 | $0.040 | $40.00 | $1,200 |


📝 Token Estimator

Estimates token count from character count at roughly 4 characters per token.

💰 Prompt Caching Savings

Enable Prompt Caching above to see savings per model. Claude offers up to 90% input cost reduction on cached system prompts.
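To make the caching effect concrete, here is a minimal sketch under simplifying assumptions: cached input tokens are billed at 10% of the normal input rate (the "up to 90%" figure above), and any surcharge for writing to the cache is ignored. The function name, the 1,500-token cached system prompt, and the 2,000-token request size are hypothetical. The $3 per 1M input rate is the Claude Sonnet 4.6 figure listed on this page.

```python
def daily_input_cost(input_tokens, cached_tokens, requests_per_day,
                     input_price_per_m, cache_discount=0.90):
    """Daily input-token cost in USD when part of each prompt is cached.

    Assumes cached tokens cost (1 - cache_discount) of the normal
    input price and ignores cache-write surcharges.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    cached_price_per_m = input_price_per_m * (1 - cache_discount)
    per_request = (uncached_tokens * input_price_per_m
                   + cached_tokens * cached_price_per_m) / 1_000_000
    return per_request * requests_per_day


# Hypothetical workload: 2,000 input tokens per request, 1,500 of them a
# cached system prompt, 1,000 requests/day, at $3 per 1M input tokens.
without_cache = daily_input_cost(2_000, 0, 1_000, 3.00)
with_cache = daily_input_cost(2_000, 1_500, 1_000, 3.00)
print(f"${without_cache:.2f}/day -> ${with_cache:.2f}/day")
```

In this example the daily input cost drops from $6.00 to $1.95. Output tokens are not affected by caching, so the saving on the total bill is smaller than the saving on input alone.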

💡 Cost Optimization Tips

Use Batch API

Most providers offer 50% discounts for batch/async requests. Toggle 'Batch Pricing' above to see savings.

Prompt Caching

Claude's prompt caching can reduce input costs by up to 90% for repeated system prompts. Cache your system prompts and reuse them across requests.

Right-Size Your Model

Use smaller models (Haiku, Mini, Flash) for simple tasks. Reserve flagship models for complex reasoning.

Reduce Token Usage

Compress prompts, use structured outputs (JSON mode), and limit max_tokens. Every token saved reduces cost.

Monitor & Set Budgets

Set hard spend limits in your API dashboard. Use rate limiting to prevent runaway costs during development.

Hybrid Routing

Route simple queries to cheap models and complex ones to flagship models. MCPlex Gateway supports this.
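The effect of hybrid routing on a monthly bill can be estimated with a weighted average. The sketch below is illustrative only: the function name and the 80/20 traffic split are hypothetical, and it says nothing about how MCPlex Gateway or any other router classifies queries. The per-request costs are derived from the daily costs listed on this page for Gemini 2.5 Flash-Lite ($0.400) and Claude Opus 4.7 ($22.50) at 1,000 requests per day.

```python
def blended_monthly_cost(cheap_per_request, flagship_per_request,
                         requests_per_day, cheap_share, days=30):
    """Monthly cost in USD when a share of requests goes to a cheap model
    and the remainder goes to a flagship model."""
    if not 0 <= cheap_share <= 1:
        raise ValueError("cheap_share must be between 0 and 1")
    per_request = (cheap_share * cheap_per_request
                   + (1 - cheap_share) * flagship_per_request)
    return per_request * requests_per_day * days


# Assumed split: 80% of traffic is simple enough for the cheap model.
all_flagship = blended_monthly_cost(0.0004, 0.0225, 1_000, cheap_share=0.0)
hybrid = blended_monthly_cost(0.0004, 0.0225, 1_000, cheap_share=0.8)
print(f"${all_flagship:.2f}/mo -> ${hybrid:.2f}/mo")
```

With these numbers the bill falls from $675.00 to $144.60 per month. The real saving depends on how much traffic a cheaper model can handle at acceptable quality.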

Why CostForge?

20 Models Compared

Side-by-side pricing for Claude, GPT-5.5, Gemini, DeepSeek V4, Grok 4.3, and Mistral — 6 providers in one view.

Real-Time Calculations

Drag sliders or type exact numbers. See per-request, daily, and monthly costs update instantly across all models.

6 Use Case Presets

Pre-configured token counts for Chatbot, Code Assistant, Data Processing, RAG Pipeline, Content Generation, and Summarization.

Batch vs Standard

Toggle between standard and batch pricing. See exactly how much you save with async/batch processing for each provider.

Visual Cost Bars

Horizontal bars make it easy to compare relative costs at a glance. The cheapest model is highlighted with a BEST badge.

Optimization Tips

Expert advice on reducing API costs: prompt caching, model routing, token compression, and budget monitoring strategies.

Always Up-to-Date

Pricing data is based on the latest published rates from Anthropic, OpenAI, Google, DeepSeek, xAI, and Mistral as of May 2026.

100% Private

All calculations happen in your browser. No data is sent anywhere. No account needed, no tracking.

Frequently Asked Questions

How accurate are these prices?

Prices are based on the latest published pricing from Anthropic, OpenAI, Google, DeepSeek, xAI, and Mistral as of May 2026. We update regularly, but always verify against the official pricing pages for the most current rates. Actual costs may vary with prompt caching, commitments, or volume discounts.

What's the difference between standard and batch pricing?

Batch/async pricing is typically 50% cheaper but has higher latency (responses delivered within hours, not seconds). Use batch for non-real-time tasks like data processing, analysis, and bulk content generation.
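A minimal sketch of how a batch discount changes a monthly estimate, assuming a flat 50% discount on both input and output tokens. Actual discounts vary by provider and model, so treat the default as a placeholder. The function name and the 60% batch share are hypothetical. The $240 baseline is the GPT-4.1 monthly cost listed on this page.

```python
def monthly_cost_with_batch(standard_monthly, batch_share,
                            batch_discount=0.50):
    """Monthly cost in USD when a share of the workload moves to batch.

    batch_share is the fraction of spend that can tolerate batch latency.
    """
    if not 0 <= batch_share <= 1:
        raise ValueError("batch_share must be between 0 and 1")
    return standard_monthly * (1 - batch_share * batch_discount)


# Assumed: 60% of a $240/month workload is not real-time.
mixed = monthly_cost_with_batch(240.00, batch_share=0.6)
all_batch = monthly_cost_with_batch(240.00, batch_share=1.0)
print(f"${mixed:.2f}/mo mixed, ${all_batch:.2f}/mo fully batched")
```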

How do I estimate my token usage?

A rough rule: 1 token ≈ 4 characters in English. A typical user message is 50-500 tokens, a code file is 1,000-5,000 tokens, and a long document is 10,000+ tokens. Output tokens are typically 20-50% of input tokens.
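The rule of thumb translates into a few lines of Python. This is only an approximation: real tokenizers differ by provider and by language, and both function names are hypothetical.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Approximate token count using the ~4 characters per token heuristic."""
    return math.ceil(len(text) / chars_per_token)


def estimate_output_range(input_tokens, low=0.20, high=0.50):
    """Rough (min, max) output tokens, assuming 20-50% of input tokens."""
    return round(input_tokens * low), round(input_tokens * high)


prompt = "Summarize the attached quarterly report in three bullet points."
prompt_tokens = estimate_tokens(prompt)
output_range = estimate_output_range(2_000)
print(prompt_tokens, output_range)
```

For billing-grade numbers, count tokens with the provider's own tokenizer instead of this heuristic.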

Which model should I choose?

For simple Q&A and classification: use DeepSeek V4-Flash/Gemini Flash-Lite/Mistral Small (cheapest). For coding and analysis: use Sonnet 4.6/GPT-4.1/Grok 4.3 (best value). For complex reasoning: use Opus 4.7/GPT-5.5/o3/Gemini Pro (most capable). Always benchmark on your specific use case.

Does this include hidden costs like network, storage, etc.?

No. CostForge calculates pure API token costs only — what the providers charge per input/output token. Other costs like network bandwidth, compute for pre/post-processing, and database storage are separate and depend on your infrastructure.