A token is the basic unit of text that AI language models process. Tokens can be whole words, parts of words, or punctuation. In English, 1 token ≈ 4 characters or 0.75 words. "Hello world" = 2 tokens. "Tokenization." = 3 tokens: "Token", "ization", and ".". API usage is priced per token.
What exactly is a token?
When you type a prompt into ChatGPT, Claude, or any other large language model (LLM), the model doesn't see words the way a human does. Instead, a tokenizer first breaks the text into tokens, typically using an algorithm such as Byte Pair Encoding (BPE).
Why don't models use words directly? Because languages are complex: there are millions of words, conjugations, misspellings, and names. By breaking text into frequently recurring subwords (tokens), a model can represent virtually any text with a vocabulary of only about 100,000 to 200,000 distinct tokens.
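To make the idea of merging frequent subwords concrete, here is a minimal sketch of how BPE merges can be learned from a tiny corpus. It is a toy, character-level illustration only: production tokenizers work on bytes, use far larger corpora, and differ in many details. The function names (`most_frequent_pair`, `merge_pair`, `learn_bpe`) are hypothetical and not taken from any library.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a whitespace-separated toy corpus."""
    # Each word starts as a tuple of single characters.
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:  # every word is already a single symbol
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words


merges, words = learn_bpe("low low low lower lowest", num_merges=2)
print(merges)
print(words)
```

After two merges on this corpus, the frequent sequence "low" has become a single symbol, while the rarer endings "er" and "est" are still split into characters. That is the same effect, at a much smaller scale, that lets common words become one token and rare words become several.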
Visual Example
"Tokenization." → "Token" | "ization" | "."
Why does token count matter?
API pricing is per token: You don't pay per API call. You pay for the number of input tokens you submit plus the number of output tokens the model generates.
Context windows are measured in tokens: The model's "memory limit" (e.g., 128K tokens for GPT-4o, 1M for Gemini 2.5 Pro) determines how large a document you can submit at once.
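Output length is limited: Models cap the maximum generation length separately from the context window. The cap varies by model: roughly 4,000 to 8,000 output tokens for many older models, and considerably more for newer ones.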
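The context-window and output limits above can be turned into a simple pre-flight check. The sketch below assumes that the context window has to hold both the prompt and the generated output, which is how many providers count it but should be confirmed in the provider's documentation. The function names are hypothetical, and the 128K window is taken as 128,000 tokens for illustration.

```python
def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """True if the prompt plus the reserved output budget fits in the window.

    Assumption: input and output tokens share one context window.
    """
    return input_tokens + max_output_tokens <= context_window


def remaining_input_budget(context_window: int, max_output_tokens: int) -> int:
    """Tokens left for the prompt after reserving room for the response."""
    return max(context_window - max_output_tokens, 0)


CONTEXT_WINDOW = 128_000  # illustrative value for a "128K" model
MAX_OUTPUT = 8_000        # illustrative output cap

print(fits_context(120_000, MAX_OUTPUT, CONTEXT_WINDOW))   # True
print(fits_context(125_000, MAX_OUTPUT, CONTEXT_WINDOW))   # False
print(remaining_input_budget(CONTEXT_WINDOW, MAX_OUTPUT))  # 120000
```

Running a check like this before sending a request avoids paying for a call that the API would reject or truncate.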
How do tokens differ between models?
Tokens are not universal. Because OpenAI, Anthropic, and Google each trained their own tokenizer, each uses its own vocabulary. The same text can map to a different number of tokens depending on the model.
Model | Tokenizer | Vocab size | "Hello, how are you?"
GPT-4o / GPT-4.1 | o200k_base | 200,000 | 6 tokens
GPT-3.5 | cl100k_base | 100,000 | 6 tokens
Claude Sonnet 4.6 | Anthropic BPE | ~100K | ~6 tokens
Gemini 2.5 Pro | SentencePiece | ~256K | ~5 tokens
Tokens in different languages
Most modern tokenizers are highly optimized for English, meaning English text is very cost-efficient (about 1 token per 4 characters).
However, for languages like Hindi, Arabic, or Korean, the same meaning requires significantly more tokens because those scripts appear less frequently in the tokenizer's training data, so their text is split into smaller pieces. This makes LLMs more expensive to use in non-English contexts.
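One contributing factor, in addition to training-data frequency, is that byte-level BPE tokenizers start from UTF-8 bytes, and non-Latin scripts need more bytes per character before any merges are applied. The sketch below only measures UTF-8 bytes per character for a few greetings. It does not compute real token counts, and the helper name `utf8_bytes_per_char` is hypothetical.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in `text`."""
    if not text:
        raise ValueError("text must not be empty")
    return len(text.encode("utf-8")) / len(text)


# Short greetings, chosen only for illustration.
samples = {
    "English": "Hello",
    "Arabic": "مرحبا",
    "Hindi": "नमस्ते",
    "Korean": "안녕하세요",
}

for language, greeting in samples.items():
    n_chars = len(greeting)
    n_bytes = len(greeting.encode("utf-8"))
    ratio = utf8_bytes_per_char(greeting)
    print(f"{language:8} chars={n_chars} bytes={n_bytes} bytes/char={ratio:.1f}")
```

English letters take one byte each, while the Arabic, Hindi, and Korean samples take two or three bytes per character. A tokenizer with few learned merges for those scripts therefore starts from a longer sequence and ends with more tokens.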
How to count tokens for free
You don't need to write code to calculate tokens. You can use our real-time interactive calculator right now to see exactly how your prompt is tokenized before you spend any money on API calls.
Monthly Cost Projector
Usage assumed: 1,000 requests/day, with 1,000 input tokens and 500 output tokens per request.
Model | Monthly cost | Annual cost
Llama 4 Scout | $8.40 | $100.80
GPT-4.1 Nano | $9.00 | $108.00
Gemini 2.5 Flash-Lite | $9.00 | $108.00
GPT-4o Mini | $13.50 | $162.00
DeepSeek V3 | $14.70 | $176.40
Llama 4 Maverick | $15.00 | $180.00
GPT-4.1 Mini | $36.00 | $432.00
Gemini 2.5 Flash | $46.50 | $558.00
DeepSeek R1 | $49.35 | $592.20
o4-mini | $99.00 | $1,188.00
Claude Haiku 4.5 | $105.00 | $1,260.00
Gemini 1.5 Pro | $112.50 | $1,350.00
GPT-4.1 | $180.00 | $2,160.00
o3 | $180.00 | $2,160.00
Gemini 2.5 Pro | $187.50 | $2,250.00
GPT-4o | $225.00 | $2,700.00
Claude Sonnet 4.6 | $315.00 | $3,780.00
Claude Opus 4.7 | $525.00 | $6,300.00
Claude Opus 4.6 | $525.00 | $6,300.00
o3-pro | $1,800.00 | $21,600.00
* Annual cost is the monthly cost × 12.
Best value for this usage: Llama 4 Scout ($8.40/mo)
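The projector figures can be reproduced with a few lines of arithmetic. The sketch below assumes a 30-day month and per-request token counts, which matches the GPT-4o row. The $2.50 input price comes from this page; the $10.00 output price is an assumption chosen because it is consistent with the $225.00 monthly figure, so check current provider pricing before relying on it. The function name `monthly_cost` is hypothetical.

```python
def monthly_cost(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
    days_per_month: int = 30,
) -> float:
    """Projected monthly cost in dollars, given prices per 1M tokens.

    `input_tokens` and `output_tokens` are per-request counts.
    """
    requests = requests_per_day * days_per_month
    total_input = requests * input_tokens
    total_output = requests * output_tokens
    return (total_input * input_price_per_m + total_output * output_price_per_m) / 1_000_000


# GPT-4o: $2.50 per 1M input tokens (from this page),
# $10.00 per 1M output tokens (assumed for illustration).
monthly = monthly_cost(1_000, 1_000, 500, input_price_per_m=2.50, output_price_per_m=10.00)
print(f"Monthly: ${monthly:,.2f}")
print(f"Annual:  ${monthly * 12:,.2f}")
```

Because output tokens are priced separately from input tokens, a workload that generates long responses can cost far more than its prompt size alone would suggest.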
Frequently Asked Questions
What is 1 token in ChatGPT?
In ChatGPT, 1 token is roughly equivalent to 4 characters or 0.75 English words. Tokens are the basic pieces of text that the AI model processes.
How many tokens is 1000 words?
On average, 1,000 words is approximately 1,333 tokens in English when using standard tokenizers like OpenAI's cl100k_base or o200k_base.
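The rule of thumb behind that figure is easy to encode. This is a rough estimator only, assuming the English averages quoted on this page (0.75 words or 4 characters per token). Real counts depend on the tokenizer and the text, and the function names are hypothetical.

```python
def estimate_tokens_from_words(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough English token estimate from a word count."""
    return round(word_count / words_per_token)


def estimate_tokens_from_chars(char_count: int, chars_per_token: float = 4.0) -> int:
    """Rough English token estimate from a character count."""
    return round(char_count / chars_per_token)


print(estimate_tokens_from_words(1_000))  # 1333
print(estimate_tokens_from_chars(4_000))  # 1000
```

For short strings the estimate can be off by a token or more, so use a real tokenizer when the exact count matters.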
How much does 1 million tokens cost?
It depends heavily on the model. For 1M input tokens, GPT-4o costs $2.50, Gemini 2.5 Pro costs $1.25, and GPT-4.1 Nano costs just $0.10. Output tokens are priced separately and typically cost more than input tokens.