LLM Cost Calculator
Free token cost calculator for every major LLM API. Compare pricing across OpenAI, Anthropic, and Google models — or browse the full pricing comparison.
How Token Pricing Works
LLM APIs charge per token — a unit of text roughly equal to 4 characters or 0.75 words. Every API call has two cost components: input tokens (your prompt) and output tokens (the model's response). Output tokens almost always cost more, typically 2-8x the input price. Use this token cost calculator to estimate your spend before committing to a provider.
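The arithmetic behind the calculator is simple: per-request cost is input tokens times the input rate plus output tokens times the output rate, with rates quoted per million tokens. A minimal Python sketch, using the gpt-4o rates from the table below ($2.50 input / $10.00 output per 1M tokens) as an example:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one API call; rates are quoted per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# gpt-4o: $2.50 per 1M input tokens, $10.00 per 1M output tokens
cost = request_cost(input_tokens=2_000, output_tokens=500,
                    input_rate=2.50, output_rate=10.00)
print(f"${cost:.4f}")  # prints $0.0100
```

Note that output tokens dominate here despite being only a quarter of the volume, which is why trimming verbose completions often saves more than trimming prompts.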
LLM Pricing Comparison Table
| Model | Input $/1M tokens | Output $/1M tokens | Type | Notes |
|---|---|---|---|---|
| claude-3-7-sonnet | $3.00 | $15.00 | Flat | |
| claude-haiku-3 | $0.250 | $1.25 | Flat | |
| claude-haiku-3-5 | $0.800 | $4.00 | Flat | |
| claude-haiku-4-5 | $1.00 | $5.00 | Flat | |
| claude-opus-3 | $15.00 | $75.00 | Flat | |
| claude-opus-4-0 | $15.00 | $75.00 | Flat | |
| claude-opus-4-1 | $15.00 | $75.00 | Flat | |
| claude-opus-4-5 | $5.00 | $25.00 | Flat | |
| claude-opus-4-6 | $5.00 / $10.00 | $25.00 / $37.50 | Breakpoint | Threshold: 200K tokens |
| claude-sonnet-4-0 | $3.00 / $6.00 | $15.00 / $22.50 | Breakpoint | Threshold: 200K tokens |
| claude-sonnet-4-5 | $3.00 / $6.00 | $15.00 / $22.50 | Breakpoint | Threshold: 200K tokens |
| gemini-2.0-flash | $0.100 | $0.400 | Flat | |
| gemini-2.0-flash-lite | $0.075 | $0.300 | Flat | |
| gemini-2.5-computer-use | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-flash | $0.300 | $2.50 | Flat | |
| gemini-2.5-flash-image | $0.300 | $2.50 | Multimodal | text, image |
| gemini-2.5-flash-lite | $0.100 | $0.400 | Flat | |
| gemini-2.5-flash-native-audio | $0.500 | $2.00 | Multimodal | text, audio |
| gemini-2.5-flash-preview-tts | $0.500 | $10.00 | Flat | |
| gemini-2.5-pro | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-pro-preview-tts | $1.00 | $20.00 | Flat | |
| gemini-3-flash | $0.500 | $3.00 | Flat | |
| gemini-3-pro-image-preview | $2.00 | $12.00 | Multimodal | text, image |
| gemini-3-pro-preview | $2.00 / $4.00 | $12.00 / $18.00 | Breakpoint | Threshold: 200K tokens |
| gemini-3.1-pro-preview | $2.00 / $4.00 | $12.00 / $18.00 | Breakpoint | Threshold: 200K tokens |
| gpt-4.1 | $2.00 | $8.00 | Flat | |
| gpt-4.1-mini | $0.400 | $1.60 | Flat | |
| gpt-4.1-nano | $0.100 | $0.400 | Flat | |
| gpt-4o | $2.50 | $10.00 | Flat | |
| gpt-4o-mini | $0.150 | $0.600 | Flat | |
| gpt-5 | $1.25 | $10.00 | Flat | |
| gpt-5-codex | $1.25 | $10.00 | Flat | |
| gpt-5-mini | $0.250 | $2.00 | Flat | |
| gpt-5-nano | $0.050 | $0.400 | Flat | |
| gpt-5-pro | $15.00 | $120.00 | Flat | |
| gpt-5.1 | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex-max | $1.25 | $10.00 | Flat | |
| gpt-5.2 | $1.75 | $14.00 | Flat | |
| gpt-5.2-codex | $1.75 | $14.00 | Flat | |
| gpt-5.2-pro | $21.00 | $168.00 | Flat | |
| gpt-5.3-codex | $1.75 | $14.00 | Flat | |
| gpt-5.4 | $2.50 / $5.00 | $15.00 / $22.50 | Breakpoint | Threshold: 272K tokens |
| gpt-5.4-pro | $30.00 / $60.00 | $180.00 / $270.00 | Breakpoint | Threshold: 272K tokens |
| o1 | $15.00 | $60.00 | Flat | |
| o1-mini | $1.10 | $4.40 | Flat | |
| o1-pro | $150.00 | $600.00 | Flat | |
| o3 | $2.00 | $8.00 | Flat | |
| o3-deep-research | $10.00 | $40.00 | Flat | |
| o3-mini | $1.10 | $4.40 | Flat | |
| o3-pro | $20.00 | $80.00 | Flat | |
| o4-mini | $1.10 | $4.40 | Flat | |
| o4-mini-deep-research | $2.00 | $8.00 | Flat | |
How LLM Pricing Works
LLM APIs charge based on tokens — units of text that roughly correspond to ~4 characters or ~0.75 words in English. Pricing is split between input tokens (your prompt) and output tokens (the model's response), with output typically costing more.
Some models use breakpoint pricing, where rates increase above a certain context length (e.g., 200K tokens). Multimodal models may also have different rates for text, image, and audio modalities.
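Breakpoint pricing can be sketched as a rate lookup keyed on prompt length. The example below assumes the higher tier applies to the entire request once the prompt exceeds the threshold; some providers bill this way for long-context models, but billing rules vary, so verify against each provider's documentation. Rates are the gemini-2.5-pro entries from the table above.

```python
def breakpoint_cost(input_tokens: int, output_tokens: int, *,
                    threshold: int, low_rates: tuple, high_rates: tuple) -> float:
    """Cost in USD. low_rates/high_rates are (input, output) per 1M tokens.

    Assumption: the higher tier applies to the whole request once the
    prompt exceeds the threshold -- check your provider's billing docs.
    """
    in_rate, out_rate = high_rates if input_tokens > threshold else low_rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# gemini-2.5-pro: $1.25 / $2.50 input, $10.00 / $15.00 output, 200K threshold
RATES = dict(threshold=200_000, low_rates=(1.25, 10.00), high_rates=(2.50, 15.00))

short = breakpoint_cost(100_000, 2_000, **RATES)  # below threshold: $0.145
long = breakpoint_cost(300_000, 2_000, **RATES)   # above threshold: $0.780
```

The jump is steep: tripling the prompt here quintuples the bill, because the larger prompt also flips every token into the higher tier.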
Tips to Reduce LLM Costs
- Right-size your model: Use smaller models (Haiku, GPT-5 Nano) for simple tasks like classification or extraction.
- Minimize prompt length: Remove unnecessary context and examples from system prompts.
- Cache responses: Store and reuse results for identical or similar queries.
- Use model routing: Route simple queries to cheap models and only escalate to expensive models when needed.
- Monitor usage: Track costs per endpoint and model to identify optimization opportunities.
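The routing tip above can be sketched with a crude length heuristic: short prompts go to a cheap model, longer ones escalate. The model names come from the pricing table, but the character threshold and the heuristic itself are illustrative assumptions, not a recommendation; production routers usually classify by task difficulty, not length alone.

```python
# Illustrative routing sketch -- the 2,000-character cutoff is an
# arbitrary assumption (~500 tokens at ~4 chars/token), not a tuned value.
CHEAP_MODEL = "gpt-5-nano"   # $0.05 per 1M input tokens (see table)
STRONG_MODEL = "gpt-5"       # $1.25 per 1M input tokens

def pick_model(prompt: str, max_cheap_chars: int = 2_000) -> str:
    """Route simple (short) prompts to the cheap model, escalate the rest."""
    return CHEAP_MODEL if len(prompt) <= max_cheap_chars else STRONG_MODEL

print(pick_model("Classify this ticket: 'refund request'"))  # prints gpt-5-nano
```

Even a heuristic this naive can cut spend sharply when most traffic is short classification-style queries, since the input-rate gap between the two models here is 25x.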
Frequently Asked Questions
How do LLM APIs charge for usage?
LLM APIs charge per token, with separate rates for input (prompt) tokens and output (completion) tokens. Prices are typically quoted per million tokens.
What is the cheapest LLM API?
The cheapest LLM APIs include GPT-5 Nano ($0.05/1M input), Gemini 2.0 Flash Lite ($0.075/1M input), and GPT-4.1 Nano ($0.10/1M input).
How can I reduce LLM API costs?
Key strategies include: choosing the right model size for your task, minimizing prompt length, caching frequent responses, batching requests, and using cheaper models for routing/classification before calling expensive models.
Automate Your Cost Estimation
Get programmatic access to real-time pricing for every model with our API. Or explore our LLM pricing comparison to find the best model for your budget.
Get Started Free