Updated February 2026 — 40+ models

AI API Cost Comparison Calculator

Compare pricing across 40+ AI models from OpenAI, Anthropic, Google, Meta, Mistral, and more. Adjust your usage below to estimate monthly costs instantly.

Potential Savings

Switching from GPT-5.4 Pro to Llama 3.2 3B

$359.73/month

58 models | 1K input tokens / 500 output tokens per request | 100 requests/day

Model	Tier	Provider	Input $/1M	Output $/1M	Monthly Cost	Context	Latency
Llama 3.2 3B (Best Value)	budget	Meta	$0.06	$0.06	$0.270	131K	fast
Ministral 3 3B	budget	Mistral	$0.10	$0.10	$0.450	128K	fast
Mistral NeMo	budget	Mistral	$0.15	$0.15	$0.675	128K	fast
Ministral 3 8B	budget	Mistral	$0.15	$0.15	$0.675	128K	fast
Devstral Small 2	budget	Mistral	$0.10	$0.30	$0.750	131K	fast
Mistral Small 3.2	budget	Mistral	$0.10	$0.30	$0.750	128K	fast
Llama 3.2 90B Vision	mid	Meta	$0.18	$0.18	$0.810	131K	fast
Llama 3.2 11B Vision	budget	Meta	$0.18	$0.18	$0.810	131K	fast
DeepSeek V4 Flash	budget	DeepSeek	$0.14	$0.28	$0.840	128K	fast
Gemini 2.5 Flash-Lite	budget	Google	$0.10	$0.40	$0.900	1.0M	fast
Ministral 3 14B	budget	Mistral	$0.20	$0.20	$0.900	128K	fast
Qwen 3.5 Flash	budget	Qwen	$0.10	$0.40	$0.900	1.0M	fast
Mistral Small 4	mid	Mistral	$0.15	$0.60	$1.35	128K	fast
Llama 4 Scout	mid	Meta	$0.18	$0.59	$1.43	328K	fast
Grok 3 Mini	mid	xAI	$0.30	$0.50	$1.65	131K	fast
Llama 4 Maverick	flagship	Meta	$0.27	$0.85	$2.08	1.0M	fast
Codestral	mid	Mistral	$0.30	$0.90	$2.25	256K	fast
GPT-5.4 Nano	budget	OpenAI	$0.20	$1.25	$2.48	400K	fast
Claude 3 Haiku	budget	Anthropic	$0.25	$1.25	$2.63	200K	fast
Gemini 3.1 Flash-Lite Preview	budget	Google	$0.25	$1.50	$3.00	1.0M	fast
Qwen 3 Coder Flash	mid	Qwen	$0.30	$1.50	$3.15	33K	fast
Mistral Large 3	flagship	Mistral	$0.50	$1.50	$3.75	128K	fast
Magistral Small	mid	Mistral	$0.50	$1.50	$3.75	128K	fast
Llama 3.3 70B	mid	Meta	$0.88	$0.88	$3.96	128K	fast
Mistral Medium 3	flagship	Mistral	$0.40	$2.00	$4.20	131K	fast
Devstral 2	mid	Mistral	$0.40	$2.00	$4.20	131K	fast
Gemini 2.5 Flash	mid	Google	$0.30	$2.50	$4.65	1.0M	fast
Amazon Nova 2 Omni	mid	Amazon	$0.30	$2.50	$4.65	1.0M	fast
Amazon Nova 2 Lite	mid	Amazon	$0.30	$2.50	$4.65	1.0M	fast
Qwen 3.5 Plus	mid	Qwen	$0.40	$2.40	$4.80	262K	fast
Gemini 3 Flash Preview	mid	Google	$0.50	$3.00	$6.00	1.0M	fast
QwQ Plus	flagship	Qwen	$0.80	$2.40	$6.00	131K	medium
Claude Haiku 3.5	budget	Anthropic	$0.80	$4.00	$8.40	200K	fast
GPT-5.4 Mini	mid	OpenAI	$0.75	$4.50	$9.00	400K	fast
DeepSeek V4 Pro	mid	DeepSeek	$1.74	$3.48	$10.44	128K	medium
Claude Haiku 4.5	budget	Anthropic	$1.00	$5.00	$10.50	200K	fast
Qwen 3 Coder Plus	flagship	Qwen	$1.00	$5.00	$10.50	33K	medium
Qwen 3 Max	flagship	Qwen	$1.20	$6.00	$12.60	33K	medium
Magistral Medium	flagship	Mistral	$2.00	$5.00	$13.50	128K	medium
Pixtral Large	flagship	Mistral	$2.00	$6.00	$15.00	128K	medium
Gemini 2.5 Pro	flagship	Google	$1.25	$10.00	$18.75	1.0M	medium
Amazon Nova 2 Pro	flagship	Amazon	$1.25	$10.00	$18.75	1.0M	medium
Grok 4.20	flagship	xAI	$2.00	$10.00	$21.00	256K	fast
Grok 4	flagship	xAI	$2.00	$10.00	$21.00	256K	medium
Gemini 3.1 Pro Preview	flagship	Google	$2.00	$12.00	$24.00	1.0M	medium
GPT-5.3 Chat	flagship	OpenAI	$1.75	$14.00	$26.25	200K	medium
GPT-5.3 Codex	flagship	OpenAI	$1.75	$14.00	$26.25	200K	medium
GPT-5.4	flagship	OpenAI	$2.50	$15.00	$30.00	400K	medium
Claude Sonnet 4.6	mid	Anthropic	$3.00	$15.00	$31.50	1.0M	fast
Claude Sonnet 4.5	mid	Anthropic	$3.00	$15.00	$31.50	200K	fast
Claude Sonnet 4	mid	Anthropic	$3.00	$15.00	$31.50	200K	fast
Grok 3	flagship	xAI	$3.00	$15.00	$31.50	131K	medium
Claude Opus 4.7	flagship	Anthropic	$5.00	$25.00	$52.50	1.0M	medium
Claude Opus 4.6	flagship	Anthropic	$5.00	$25.00	$52.50	1.0M	medium
Claude Opus 4.5	flagship	Anthropic	$5.00	$25.00	$52.50	200K	medium
Claude Opus 4.1	flagship	Anthropic	$15.00	$75.00	$157.50	200K	slow
Claude Opus 4	flagship	Anthropic	$15.00	$75.00	$157.50	200K	slow
GPT-5.4 Pro	flagship	OpenAI	$30.00	$180.00	$360.00	400K	slow

Providers Covered

Pricing data sourced directly from official documentation and verified monthly.

OpenAI, Anthropic, Google, DeepSeek, Mistral, Meta, xAI, Amazon, Qwen

Understanding AI API Pricing in 2026

As large language models become integral to software products, understanding the cost of AI APIs is critical for engineering teams and product managers. Every major provider, including OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, xAI, Amazon, and Qwen, prices its API per token, but the rates vary dramatically depending on model capability, latency, and whether you use standard or batch endpoints.

Token-based pricing means you pay separately for the text you send to the model (input tokens) and the text it generates (output tokens). Output tokens usually cost more than input tokens because they require more computation: in the table above, output rates range from equal to the input rate (several Llama and Ministral models) to about 8 times higher (Gemini 2.5 Pro, GPT-5.3), with 5 to 6 times common among flagship models. Batch APIs, offered by providers like OpenAI and Anthropic, let you queue requests for asynchronous processing at roughly half the standard rate, which is ideal for offline workloads such as data labeling, summarization pipelines, and evaluation runs.

Choosing the right model requires balancing cost against quality, latency, and feature support. A chatbot handling millions of short messages needs a different model than a coding assistant working with long context windows. Batch processing can cut costs by 50% for non-real-time workloads, and prompt caching further reduces input token costs for providers that support it. Read our complete guide to AI API pricing in 2026 for a deep dive into how token pricing works across every provider.

For most production applications, the best strategy is to route different tasks to different models: use a flagship model like Claude Opus 4, GPT-4.1, or Gemini 2.5 Pro for complex reasoning, and a budget model like GPT-4.1 Nano, Gemini 2.0 Flash, or Amazon Nova Micro for simpler classification or extraction tasks. This tiered approach can reduce your monthly API bill by 80% or more without sacrificing quality where it matters. Explore our 5 proven strategies to reduce AI API costs to learn more.
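As a rough illustration of the tiered approach, the sketch below blends two monthly costs taken from the table above: Claude Opus 4 ($157.50) as the flagship and Gemini 2.5 Flash-Lite ($0.900) as the budget model. The function name `routed_monthly_cost` and the 85% budget share are hypothetical, and the sketch assumes every request has the same token profile regardless of which model handles it, which real traffic rarely does.

```python
def routed_monthly_cost(flagship_cost: float, budget_cost: float,
                        budget_share: float) -> float:
    """Blended monthly cost when `budget_share` of requests go to the budget model.

    Both cost arguments are the monthly cost of sending ALL traffic to that model.
    """
    if not 0.0 <= budget_share <= 1.0:
        raise ValueError("budget_share must be between 0 and 1")
    return flagship_cost * (1 - budget_share) + budget_cost * budget_share


flagship = 157.50  # Claude Opus 4, monthly cost from the table
budget = 0.90      # Gemini 2.5 Flash-Lite, monthly cost from the table

# Hypothetical split: 85% of requests are simple enough for the budget model.
blended = routed_monthly_cost(flagship, budget, budget_share=0.85)
savings_pct = (1 - blended / flagship) * 100

print(f"Blended cost: ${blended:.2f}/month")
print(f"Savings vs. flagship only: {savings_pct:.1f}%")
```

The actual savings depend heavily on how much traffic can be routed to the cheaper model without hurting quality, so the share is worth measuring rather than guessing.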

Frequently Asked Questions

How is the monthly cost calculated?

Monthly cost is calculated as: (input tokens per request × requests per day × 30 days / 1,000,000 × input rate) + (output tokens per request × requests per day × 30 days / 1,000,000 × output rate), where both rates are in dollars per 1M tokens. For example, at 1K input tokens, 500 output tokens, and 100 requests per day, GPT-5.4 costs (3 × $2.50) + (1.5 × $15.00) = $30.00 per month. When batch pricing is enabled, the batch input and output rates are used instead of standard rates for models that support it. This gives you a realistic estimate of what your monthly bill will look like at steady-state usage.
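The formula above can be sketched in a few lines of Python. The function name `monthly_cost` and its parameter names are illustrative, not part of the calculator itself; the rates are the table's figures for Llama 3.2 3B and GPT-5.4 Pro under the default scenario (1K input tokens, 500 output tokens, 100 requests per day).

```python
def monthly_cost(input_tokens: int, output_tokens: int, requests_per_day: int,
                 input_rate: float, output_rate: float, days: int = 30) -> float:
    """Estimated monthly cost in dollars; rates are dollars per 1M tokens."""
    input_cost = input_tokens * requests_per_day * days / 1_000_000 * input_rate
    output_cost = output_tokens * requests_per_day * days / 1_000_000 * output_rate
    return input_cost + output_cost


# Default scenario: 1K in / 500 out, 100 requests per day.
llama = monthly_cost(1000, 500, 100, input_rate=0.06, output_rate=0.06)
gpt_pro = monthly_cost(1000, 500, 100, input_rate=30.00, output_rate=180.00)

print(f"Llama 3.2 3B: ${llama:.3f}/month")
print(f"GPT-5.4 Pro:  ${gpt_pro:.2f}/month")
print(f"Savings:      ${gpt_pro - llama:.2f}/month")
```

With these inputs the sketch reproduces the table's monthly costs for both models and the $359.73/month figure shown under Potential Savings.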

What is the difference between input and output tokens?

Input tokens are the tokens in the text you send to the model -- this includes your system prompt, user message, and any context you provide. Output tokens are the tokens the model generates in its response. Output tokens are more expensive because they require sequential computation (each token depends on the previous one), while input tokens can be processed in parallel. A typical English word is roughly 1.3 tokens.
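For quick budgeting, the 1.3 tokens-per-word figure can be turned into a rough estimator. This is only a heuristic sketch: `estimate_tokens` is a hypothetical helper that counts whitespace-separated words, and real tokenizers vary by provider and language, so actual counts will differ.

```python
import math

TOKENS_PER_WORD = 1.3  # rough average for English text, per the figure above


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (not a real tokenizer)."""
    word_count = len(text.split())
    return math.ceil(word_count * TOKENS_PER_WORD)


prompt = "Summarize the following support ticket in two sentences."
print(f"{len(prompt.split())} words, about {estimate_tokens(prompt)} tokens")
```

For billing-grade numbers, the token counts reported in each provider's API response are the reliable source.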

What is batch pricing and when should I use it?

Batch pricing lets you submit a collection of API requests that are processed asynchronously, typically within a 24-hour window. In return, you get a significant discount -- usually 50% off standard pricing. This is ideal for workloads that do not require real-time responses, such as bulk document processing, dataset annotation, evaluation benchmarks, and nightly content generation pipelines. Not all providers offer batch endpoints; OpenAI and Anthropic are the most prominent supporters.
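To see what a batch discount might do to a bill, the sketch below applies a flat discount to the share of traffic that can tolerate asynchronous processing. It assumes the 50% figure applies equally to input and output tokens and that the model in question supports batch processing. The function name `batch_monthly_cost`, the $30.00 baseline, and the 60% batch share are hypothetical.

```python
def batch_monthly_cost(standard_monthly_cost: float, batch_fraction: float,
                       batch_discount: float = 0.5) -> float:
    """Monthly cost when `batch_fraction` of traffic moves to a discounted batch endpoint."""
    if not 0.0 <= batch_fraction <= 1.0:
        raise ValueError("batch_fraction must be between 0 and 1")
    realtime = standard_monthly_cost * (1 - batch_fraction)
    batched = standard_monthly_cost * batch_fraction * (1 - batch_discount)
    return realtime + batched


# Hypothetical workload: $30.00/month at standard rates, 60% of it batchable.
cost = batch_monthly_cost(30.00, batch_fraction=0.6)
print(f"With batching: ${cost:.2f}/month")
```

Only the batchable share earns the discount, so the overall saving is the discount multiplied by that share (here 50% of 60%, or 30%).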

How often is the pricing data updated?

We verify pricing data directly from each provider's official pricing page on a regular basis. AI API pricing changes frequently -- providers often lower prices as they optimize infrastructure, and new models launch with different price points. Each model entry includes the date it was last verified. If you notice a discrepancy, please let us know so we can update it promptly.

Which model should I choose for my project?

It depends on your priorities. If you need the highest quality reasoning and can afford it, flagship models like Claude Opus 4, o3, or Gemini 2.5 Pro deliver the best results. For latency-sensitive applications (chatbots, autocomplete), look at fast-tier models like GPT-4.1 Mini, Claude Sonnet 4, or Gemini 2.5 Flash. For high-volume, cost-sensitive workloads where quality can be slightly lower, budget models like GPT-4.1 Nano, Amazon Nova Micro, or Mistral Small can cut costs by 90% or more. Many teams use a routing strategy that sends easy tasks to cheap models and hard tasks to premium ones.
