LLM Cost Calculator
Free token cost calculator for every major LLM API. Compare pricing across OpenAI, Anthropic, and Google models — or browse the full pricing comparison.
How Token Pricing Works
LLM APIs charge per token — a unit of text roughly equal to 4 characters or 0.75 words. Every API call has two cost components: input tokens (your prompt) and output tokens (the model's response). Output tokens almost always cost more, typically 2-8x the input price. Use this token cost calculator to estimate your spend before committing to a provider.
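The arithmetic behind the calculator is simple: per-request cost is input tokens times the input rate plus output tokens times the output rate, with rates quoted per million tokens. A minimal Python sketch, using the gpt-4o rates from the table below ($2.50 input / $10.00 output per 1M tokens) as an example:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one API call; rates are quoted per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# gpt-4o: $2.50 per 1M input tokens, $10.00 per 1M output tokens
cost = request_cost(input_tokens=2_000, output_tokens=500,
                    input_rate=2.50, output_rate=10.00)
print(f"${cost:.4f}")  # prints $0.0100
```

Note that output tokens dominate here despite being only a quarter of the volume, which is why trimming verbose completions often saves more than trimming prompts.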
LLM Pricing Comparison Table
| Model | Input $/1M tokens | Output $/1M tokens | Type | Notes |
|---|---|---|---|---|
| claude-3-7-sonnet | $3.00 | $15.00 | Flat | |
| claude-haiku-3 | $0.250 | $1.25 | Flat | |
| claude-haiku-3-5 | $0.800 | $4.00 | Flat | |
| claude-haiku-4-5 | $1.00 | $5.00 | Flat | |
| claude-opus-3 | $15.00 | $75.00 | Flat | |
| claude-opus-4-0 | $15.00 | $75.00 | Flat | |
| claude-opus-4-1 | $15.00 | $75.00 | Flat | |
| claude-opus-4-5 | $5.00 | $25.00 | Flat | |
| claude-opus-4-6 | $5.00 / $10.00 | $25.00 / $37.50 | Breakpoint | Threshold: 200K tokens |
| claude-sonnet-4-0 | $3.00 / $6.00 | $15.00 / $22.50 | Breakpoint | Threshold: 200K tokens |
| claude-sonnet-4-5 | $3.00 / $6.00 | $15.00 / $22.50 | Breakpoint | Threshold: 200K tokens |
| gemini-2.0-flash | $0.100 | $0.400 | Flat | |
| gemini-2.0-flash-lite | $0.075 | $0.300 | Flat | |
| gemini-2.5-computer-use | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-flash | $0.300 | $2.50 | Flat | |
| gemini-2.5-flash-image | $0.300 | $2.50 | Multimodal | text, image |
| gemini-2.5-flash-lite | $0.100 | $0.400 | Flat | |
| gemini-2.5-flash-native-audio | $0.500 | $2.00 | Multimodal | text, audio |
| gemini-2.5-flash-preview-tts | $0.500 | $10.00 | Flat | |
| gemini-2.5-pro | $1.25 / $2.50 | $10.00 / $15.00 | Breakpoint | Threshold: 200K tokens |
| gemini-2.5-pro-preview-tts | $1.00 | $20.00 | Flat | |
| gemini-3-flash | $0.500 | $3.00 | Flat | |
| gemini-3-pro-image-preview | $2.00 | $12.00 | Multimodal | text, image |
| gemini-3-pro-preview | $2.00 / $4.00 | $12.00 / $18.00 | Breakpoint | Threshold: 200K tokens |
| gemini-3.1-pro-preview | $2.00 / $4.00 | $12.00 / $18.00 | Breakpoint | Threshold: 200K tokens |
| gpt-4.1 | $2.00 | $8.00 | Flat | |
| gpt-4.1-mini | $0.400 | $1.60 | Flat | |
| gpt-4.1-nano | $0.100 | $0.400 | Flat | |
| gpt-4o | $2.50 | $10.00 | Flat | |
| gpt-4o-mini | $0.150 | $0.600 | Flat | |
| gpt-5 | $1.25 | $10.00 | Flat | |
| gpt-5-codex | $1.25 | $10.00 | Flat | |
| gpt-5-mini | $0.250 | $2.00 | Flat | |
| gpt-5-nano | $0.050 | $0.400 | Flat | |
| gpt-5-pro | $15.00 | $120.00 | Flat | |
| gpt-5.1 | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex | $1.25 | $10.00 | Flat | |
| gpt-5.1-codex-max | $1.25 | $10.00 | Flat | |
| gpt-5.2 | $1.75 | $14.00 | Flat | |
| gpt-5.2-codex | $1.75 | $14.00 | Flat | |
| gpt-5.2-pro | $21.00 | $168.00 | Flat | |
| gpt-5.3-codex | $1.75 | $14.00 | Flat | |
| gpt-5.4 | $2.50 / $5.00 | $15.00 / $22.50 | Breakpoint | Threshold: 272K tokens |
| gpt-5.4-pro | $30.00 / $60.00 | $180.00 / $270.00 | Breakpoint | Threshold: 272K tokens |
| o1 | $15.00 | $60.00 | Flat | |
| o1-mini | $1.10 | $4.40 | Flat | |
| o1-pro | $150.00 | $600.00 | Flat | |
| o3 | $2.00 | $8.00 | Flat | |
| o3-deep-research | $10.00 | $40.00 | Flat | |
| o3-mini | $1.10 | $4.40 | Flat | |
| o3-pro | $20.00 | $80.00 | Flat | |
| o4-mini | $1.10 | $4.40 | Flat | |
| o4-mini-deep-research | $2.00 | $8.00 | Flat | |
How LLM Pricing Works
LLM APIs charge based on tokens — units of text that roughly correspond to ~4 characters or ~0.75 words in English. Pricing is split between input tokens (your prompt) and output tokens (the model's response), with output typically costing more.
Some models use breakpoint pricing, where rates increase above a certain context length (e.g., 200K tokens). Multimodal models may also have different rates for text, image, and audio modalities.
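Breakpoint pricing can be sketched as a rate lookup keyed on prompt length. The example below assumes the higher tier applies to the entire request once the prompt exceeds the threshold; some providers bill this way for long-context models, but billing rules vary, so verify against each provider's documentation. Rates are the gemini-2.5-pro entries from the table above.

```python
def breakpoint_cost(input_tokens: int, output_tokens: int, *,
                    threshold: int, low_rates: tuple, high_rates: tuple) -> float:
    """Cost in USD. low_rates/high_rates are (input, output) per 1M tokens.

    Assumption: the higher tier applies to the whole request once the
    prompt exceeds the threshold -- check your provider's billing docs.
    """
    in_rate, out_rate = high_rates if input_tokens > threshold else low_rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# gemini-2.5-pro: $1.25 / $2.50 input, $10.00 / $15.00 output, 200K threshold
RATES = dict(threshold=200_000, low_rates=(1.25, 10.00), high_rates=(2.50, 15.00))

short = breakpoint_cost(100_000, 2_000, **RATES)  # below threshold: $0.145
long = breakpoint_cost(300_000, 2_000, **RATES)   # above threshold: $0.780
```

The jump is steep: tripling the prompt here quintuples the bill, because the larger prompt also flips every token into the higher tier.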
Tips to Reduce LLM Costs
- Right-size your model: Use smaller models (Haiku, GPT-5 Nano) for simple tasks like classification or extraction.
- Minimize prompt length: Remove unnecessary context and examples from system prompts.
- Cache responses: Store and reuse results for identical or similar queries.
- Use model routing: Route simple queries to cheap models and only escalate to expensive models when needed.
- Monitor usage: Track costs per endpoint and model to identify optimization opportunities.
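The routing tip above can be sketched with a crude length heuristic: short prompts go to a cheap model, longer ones escalate. The model names come from the pricing table, but the character threshold and the heuristic itself are illustrative assumptions, not a recommendation; production routers usually classify by task difficulty, not length alone.

```python
# Illustrative routing sketch -- the 2,000-character cutoff is an
# arbitrary assumption (~500 tokens at ~4 chars/token), not a tuned value.
CHEAP_MODEL = "gpt-5-nano"   # $0.05 per 1M input tokens (see table)
STRONG_MODEL = "gpt-5"       # $1.25 per 1M input tokens

def pick_model(prompt: str, max_cheap_chars: int = 2_000) -> str:
    """Route simple (short) prompts to the cheap model, escalate the rest."""
    return CHEAP_MODEL if len(prompt) <= max_cheap_chars else STRONG_MODEL

print(pick_model("Classify this ticket: 'refund request'"))  # prints gpt-5-nano
```

Even a heuristic this naive can cut spend sharply when most traffic is short classification-style queries, since the input-rate gap between the two models here is 25x.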
Frequently Asked Questions
How do LLM APIs charge for usage?
LLM APIs charge per token, with separate rates for input (prompt) tokens and output (completion) tokens. Prices are typically quoted per million tokens.
What is the cheapest LLM API?
The cheapest LLM APIs include GPT-5 Nano ($0.05/1M input), Gemini 2.0 Flash Lite ($0.075/1M input), and GPT-4.1 Nano ($0.10/1M input).
How can I reduce LLM API costs?
Key strategies include: choosing the right model size for your task, minimizing prompt length, caching frequent responses, batching requests, and using cheaper models for routing/classification before calling expensive models.
Automate Your Cost Estimation
Get programmatic access to real-time pricing for every model with our API. Or explore our LLM pricing comparison to find the best model for your budget.
Get Started Free