Welcome to the March 2026 LLM Pricing Index. This month saw Anthropic solidify its position with Claude Sonnet 4.6, DeepSeek maintaining its aggressive budget pricing, and OpenAI holding steady with GPT-4o. Track all costs here.
The Price Per Million Tokens
Prices are listed in USD per 1 Million tokens. Sorted from least expensive to most expensive input cost.
| Model | Provider | Input / 1M | Output / 1M |
|---|
| Ministral 3B | Mistral | $0.040 | $0.04 |
| GPT-5 Nano | OpenAI | $0.050 | $0.40 |
| Gemini 1.5 Flash | Google | $0.075 | $0.30 |
| GPT-4.1 Nano | OpenAI | $0.10 | $0.40 |
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 |
| Mistral Small 3 | Mistral | $0.10 | $0.30 |
| Ministral 8B | Mistral | $0.10 | $0.10 |
| Llama 4 Scout | Meta | $0.11 | $0.34 |
| GPT-4o Mini | OpenAI | $0.15 | $0.60 |
| Mistral Nemo | Mistral | $0.15 | $0.15 |
| Pixtral 12B | Mistral | $0.15 | $0.15 |
| GPT-5.4 Nano | OpenAI | $0.20 | $1.25 |
| Llama 4 Maverick | Meta | $0.20 | $0.60 |
| Codestral | Mistral | $0.20 | $0.60 |
| Sonar Small | Perplexity | $0.20 | $0.20 |
| Grok 4.1 Fast | xAI | $0.20 | $0.50 |
| Qwen 2.5 72B | Qwen | $0.23 | $0.40 |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 |
| Claude Haiku 3 | Anthropic | $0.25 | $1.25 |
| Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 |
| DeepSeek V3 | DeepSeek | $0.28 | $0.42 |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 |
| GPT-4.1 Mini | OpenAI | $0.40 | $1.60 |
| GPT-3.5 Turbo | OpenAI | $0.50 | $1.50 |
| Gemini 3 Flash | Google | $0.50 | $3.00 |
| Qwen 3.5 Plus | Qwen | $0.50 | $2.00 |
| DeepSeek R1 | DeepSeek | $0.55 | $2.19 |
| LLaMA 3.3 70B | Meta | $0.59 | $0.79 |
| GPT-5.4 Mini | OpenAI | $0.75 | $4.50 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 |
| Sonar Large | Perplexity | $1.00 | $1.00 |
| o4-mini | OpenAI | $1.10 | $4.40 |
| o3-mini | OpenAI | $1.10 | $4.40 |
| o1-mini | OpenAI | $1.10 | $4.40 |
| GPT-5.1 | OpenAI | $1.25 | $10.00 |
| GPT-5 | OpenAI | $1.25 | $10.00 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 |
| Gemini 1.5 Pro | Google | $1.25 | $5.00 |
| Grok 4.3 | xAI | $1.25 | $2.50 |
| Grok 4.20 | xAI | $1.25 | $2.50 |
| GPT-5.2 | OpenAI | $1.75 | $14.00 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 |
| o3 | OpenAI | $2.00 | $8.00 |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 |
| Mistral Large 3 | Mistral | $2.00 | $6.00 |
| Pixtral Large | Mistral | $2.00 | $6.00 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 |
| GPT-4o | OpenAI | $2.50 | $10.00 |
| Qwen 3.7 Max | Qwen | $2.50 | $7.50 |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 |
| Claude Sonnet 3.7 | Anthropic | $3.00 | $15.00 |
| Sonar Pro | Perplexity | $3.00 | $15.00 |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 |
| Sonar Huge | Perplexity | $5.00 | $5.00 |
| GPT-5 Pro | OpenAI | $15.00 | $120.00 |
| o1 | OpenAI | $15.00 | $60.00 |
| Claude Opus 4.1 | Anthropic | $15.00 | $75.00 |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 |
| Claude Opus 3 | Anthropic | $15.00 | $75.00 |
| o3-pro | OpenAI | $20.00 | $80.00 |
| GPT-5.2 Pro | OpenAI | $21.00 | $168.00 |
| GPT-5.4 Pro | OpenAI | $30.00 | $180.00 |
| o1-pro | OpenAI | $150.00 | $600.00 |
Key Insights for March 2026
1. The Gap Between Flagship and "Mini" Models Has Widened
GPT-4o Mini ($0.15 input) is now a staggering 94% cheaper than GPT-4o ($2.50 input). For 80% of pipeline tasks (classification, basic extraction), developers are abandoning flagship models entirely.
2. DeepSeek Remains the Price/Performance King
At just $0.27 per 1M input tokens, DeepSeek V3 costs less than 10% of Claude Sonnet and GPT-4o while often matching their performance on coding and translation tasks.
3. Context Windows Determine RAG Costs
Filling Gemini 1.5 Pro's 2M context window costs exactly $2.50 per request. While powerful, doing this across 1,000 queries heavily outweighs the cost of setting up a proper vector database for RAG.