GPT-4o Mini Pricing: The Cheapest Way to Use OpenAI (April 2026)
GPT-4o Mini is OpenAI's cheapest model in the GPT-4 family at $0.15/$0.60 per million tokens. Launched July 2024, it handles classification, extraction, routing, and simple generation at 94% less than GPT-4o. Now competes with GPT-4.1 Nano ($0.10/$0.40) for the budget tier.
GPT-4o Mini
- ✓ Cheapest GPT-4 family model from OpenAI
- ✓ 128K context window
- ✓ Function calling and structured outputs
- ✓ Vision support for image inputs
GPT-4o Mini (Cached)
- ✓ 50% discount on input tokens with prompt caching
- ✓ Output tokens remain at standard rate
- ✓ Best for apps with repeated system prompts
- ✓ Cache hits automatic, no code changes needed
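OpenAI's prompt caching matches on the request prefix, so cache hits depend on keeping the stable content (system prompt, few-shot examples) byte-identical at the front of every request and putting per-request content last. A minimal sketch of that ordering (the classifier prompt and few-shot examples are illustrative; caching kicks in automatically once prompts pass roughly 1K tokens):

```python
def build_messages(system_prompt: str, few_shot: list[dict], user_input: str) -> list[dict]:
    """Order messages so the static prefix stays identical across requests.

    Prompt caching matches on the request prefix, so anything that changes
    per request (the user input) must come last.
    """
    return (
        [{"role": "system", "content": system_prompt}]  # static: cacheable
        + few_shot                                      # static: cacheable
        + [{"role": "user", "content": user_input}]     # dynamic: goes last
    )

messages = build_messages(
    system_prompt="You are a support-ticket classifier. Reply with one label.",
    few_shot=[
        {"role": "user", "content": "My card was charged twice."},
        {"role": "assistant", "content": "billing"},
    ],
    user_input="The app crashes on login.",
)
```

If the few-shot examples or system prompt change even slightly between requests, the prefix no longer matches and you pay the full input rate.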
GPT-4o Mini (Batch)
- ✓ 50% off both input and output tokens
- ✓ 24-hour turnaround SLA
- ✓ Same quality as synchronous API
- ✓ Best for bulk classification and labeling
GPT-4o Mini (Fine-tuned)
- ✓ Inference at 2x the base price ($0.30/$1.20) for custom fine-tuned models
- ✓ Training costs $0.30 per 1M tokens on top of inference
- ✓ Best for domain-specific classification tasks
- ✓ Requires at least 10 training examples
GPT-4o Mini Pricing Table (April 2026)
Every GPT-4o Mini pricing variant in one table. All prices are per 1 million tokens.

| Variant | Input | Output | Notes |
|---|---|---|---|
| Standard | $0.15 | $0.60 | Synchronous Chat Completions |
| Cached input | $0.075 | $0.60 | 50% off cached input tokens, automatic |
| Batch | $0.075 | $0.30 | 24-hour turnaround |
| Fine-tuned | $0.30 | $1.20 | Plus $0.30 per 1M training tokens |
GPT-4o Mini vs GPT-4.1 Nano: Which Budget Model Wins?
GPT-4.1 Nano launched as OpenAI's new ultra-cheap option and directly competes with GPT-4o Mini. Nano is 33% cheaper on both input and output tokens. Both models target the same use cases: classification, extraction, routing, and simple generation. The key differences are context window size (Nano supports 1M tokens versus Mini's 128K) and recency (Nano benefits from newer training data and architecture improvements). For most simple tasks, Nano is the better default. Mini still wins if you have fine-tuned models you do not want to retrain, or if your specific task benchmarks higher on Mini.
GPT-4o Mini vs GPT-4o: When to Upgrade
GPT-4o costs $2.50/$10.00 per million tokens, roughly 17x more than Mini on input and output. The quality gap matters most on complex reasoning, nuanced writing, multi-step tool use, and long-form content generation. Mini matches GPT-4o closely on simple classification (90%+ agreement on binary tasks) but falls behind on anything requiring chain-of-thought reasoning or subtle distinctions. A common architecture uses Mini for initial triage and routes only complex requests to GPT-4o or GPT-4.1, cutting costs by 70-80%.
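The triage architecture above can be sketched in a few lines. The `is_complex` heuristic here is hypothetical; in production it might be a rules check or itself a cheap Mini classification call:

```python
MINI = "gpt-4o-mini"
FULL = "gpt-4o"

def is_complex(request: str) -> bool:
    """Hypothetical triage heuristic: escalate long or reasoning-heavy requests.

    Real deployments might use a rules check or a cheap Mini
    classification call instead of keyword matching."""
    markers = ("step by step", "refactor", "analyze", "compare")
    return len(request) > 1000 or any(m in request.lower() for m in markers)

def pick_model(request: str) -> str:
    """Route simple requests to Mini, complex ones to the full model."""
    return FULL if is_complex(request) else MINI

# Short lookups stay on the cheap model; reasoning-heavy work escalates.
model = pick_model("What is your refund policy?")  # -> gpt-4o-mini
```

The savings come from the traffic mix: if 80% of requests stay on Mini, the blended per-request cost drops accordingly.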
Batch API Pricing for GPT-4o Mini
The Batch API processes requests asynchronously at 50% off both input and output tokens. GPT-4o Mini Batch pricing is $0.075 input and $0.30 output per million tokens. Jobs complete within 24 hours. Submit a JSONL file of Chat Completions requests and poll for results. Best for: dataset labeling, content classification, bulk extraction, and evaluation pipelines where latency is not critical.
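Each line of the JSONL file is one Chat Completions request with a unique `custom_id`, which is how you join results (returned in arbitrary order) back to your inputs. A minimal builder, with an illustrative sentiment-labeling prompt:

```python
import json

def build_batch_lines(texts: list[str]) -> list[str]:
    """Build one /v1/chat/completions request per input text.

    The unique custom_id on each line lets you match batch results
    (which come back in arbitrary order) to the original inputs."""
    lines = []
    for i, text in enumerate(texts):
        lines.append(json.dumps({
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",
                "max_tokens": 5,
                "messages": [
                    {"role": "system",
                     "content": "Label the sentiment: positive, negative, or neutral."},
                    {"role": "user", "content": text},
                ],
            },
        }))
    return lines

lines = build_batch_lines(["Great product!", "Arrived broken."])
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

From there, upload the file via the Files API with `purpose="batch"`, create the job with `completion_window="24h"`, and poll until the output file is ready.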
Fine-Tuning Costs for GPT-4o Mini
Fine-tuning GPT-4o Mini costs $0.30 per 1M training tokens. Once trained, inference costs double: $0.30 input and $1.20 output per million tokens. A typical fine-tuning job on 100K training tokens costs about $0.03 for training alone. The real cost is inference: if you run 1M requests per month, the 2x inference premium adds up fast. Consider whether prompt engineering or few-shot examples can achieve the same quality before committing to fine-tuning. Fine-tuned Mini models still cannot match the reasoning ability of base GPT-4.1 or GPT-4o.
Real-World Cost Examples
What GPT-4o Mini actually costs for common workloads at different scales. All estimates assume average token counts per request.
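Estimates like these are straightforward to reproduce with a small helper; the workload numbers below are placeholders, and the rates are the standard ones from the table above:

```python
def mini_cost(requests: int, in_tokens: int, out_tokens: int,
              in_rate: float = 0.15, out_rate: float = 0.60) -> float:
    """Estimated USD cost for a GPT-4o Mini workload.

    Rates are $/1M tokens; pass in_rate=0.075, out_rate=0.30 for Batch
    pricing, or out_rate=1.20 / in_rate=0.30 for a fine-tuned model."""
    return requests * (in_tokens * in_rate + out_tokens * out_rate) / 1e6

# Example workload: 10,000 tickets/day at 500 input + 200 output tokens each
daily = mini_cost(10_000, 500, 200)                             # $1.95/day
batch_daily = mini_cost(10_000, 500, 200, in_rate=0.075,
                        out_rate=0.30)                          # $0.975/day
```

The same function makes the Batch discount concrete: any workload that can tolerate the 24-hour window costs exactly half.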
Rate Limits for GPT-4o Mini
OpenAI sets rate limits based on your cumulative API spending. GPT-4o Mini shares the same tier system as other models but typically has higher token-per-minute allowances due to its lower cost. New accounts start at the free tier.
When GPT-4o Mini Is NOT Enough
GPT-4o Mini fails or underperforms on several task categories. Complex multi-step reasoning consistently produces errors that GPT-4.1 or GPT-4o handle correctly. Nuanced writing tasks (tone matching, persuasive copy, creative fiction) show a clear quality gap. Long-context retrieval degrades past 32K tokens even though the 128K window technically accepts more. Multi-turn agent workflows with tool use produce more hallucinated function calls. Code generation for non-trivial tasks (refactoring, architecture changes, multi-file edits) misses edge cases. If your application falls into any of these categories, budget for a bigger model and use Mini only for the simple subtasks.
Hidden Costs & Gotchas
- ⚠ GPT-4.1 Nano is now cheaper: GPT-4.1 Nano costs $0.10/$0.40 per million tokens versus Mini's $0.15/$0.60, 33% cheaper on both input and output. For simple classification and routing, Nano may be the better budget pick; benchmark your specific task before choosing.
- ⚠ Output tokens cost 4x input tokens: GPT-4o Mini charges $0.15 per 1M input tokens but $0.60 per 1M output tokens. Generation-heavy tasks like summarization, content writing, or code generation cost more than you expect if you budget only on input. Always estimate your output-to-input ratio.
- ⚠ Fine-tuning training costs add up: Training costs $0.30 per 1M tokens, and inference on fine-tuned models is 2x the base rate ($0.30/$1.20). A training run on 10M tokens costs $3.00 before you make a single inference call. Multiple training iterations multiply that cost.
- ⚠ 128K context does not mean 128K quality: GPT-4o Mini accepts 128K tokens of context, but quality degrades noticeably past 32K tokens. Long-context tasks like document Q&A over large files may produce worse results than splitting into smaller chunks. Test with your actual data before relying on full context length.
- ⚠ Vision adds image token costs: Sending images to GPT-4o Mini converts them to tokens based on resolution. A low-res image costs roughly $0.002 and a high-res image costs $0.004-0.006 depending on dimensions. If you process thousands of images, these costs add up fast even at Mini pricing.
- ⚠ Rate limits start low on free tier: Free tier accounts get 500 RPM for GPT-4o Mini. Tier 1 ($5+ spent) stays at 500 RPM with 30K TPM. You need to reach Tier 2 ($50+ spent) for 5K RPM. Production applications often hit rate limits before they hit cost ceilings.
Which Plan Do You Need?
Prototyping and hobby projects
GPT-4o Mini at standard rates ($0.15/$0.60). Cheap enough to iterate without worrying about cost. 1,000 requests with 500 input and 200 output tokens each costs about $0.20.
Classification and routing at scale
GPT-4o Mini Batch at $0.075/$0.30. If latency does not matter, the Batch API cuts costs in half. Ideal for labeling datasets, content moderation, and intent classification on bulk data.
Production with quality requirements
Upgrade to GPT-4.1 ($2/$8) or GPT-4o ($2.50/$10). Mini struggles with complex reasoning, nuanced writing, and multi-step tool use. If accuracy matters more than cost, step up to a full-size model.
Budget production workloads
Compare GPT-4.1 Nano ($0.10/$0.40) versus GPT-4o Mini ($0.15/$0.60). Nano is 33% cheaper. Run both on your actual prompts and pick whichever scores higher; the cost difference is small but compounds at scale.
The Bottom Line
GPT-4o Mini at $0.15/$0.60 is excellent for simple tasks but now faces competition from GPT-4.1 Nano ($0.10/$0.40). For classification and routing, both are fine; pick based on benchmark performance for your specific task. For anything requiring reasoning or nuance, step up to GPT-4.1 or GPT-4o.
Frequently Asked Questions
How much does GPT-4o Mini cost?
GPT-4o Mini costs $0.15 per 1 million input tokens and $0.60 per 1 million output tokens. With prompt caching, input drops to $0.075/1M. Batch API pricing is $0.075 input and $0.30 output per million tokens.
Is GPT-4o Mini free?
No. GPT-4o Mini is not free through the API. New OpenAI accounts get $5 in credits which can be used for Mini. ChatGPT Free includes limited Mini access in the consumer product, but API usage always costs money.
GPT-4o Mini vs GPT-4.1 Nano, which is cheaper?
GPT-4.1 Nano is cheaper at $0.10/$0.40 per million tokens versus Mini's $0.15/$0.60. Nano is 33% less on both input and output. For new projects, benchmark both on your task. Nano is the better default unless Mini specifically outperforms.
Can I fine-tune GPT-4o Mini?
Yes. Fine-tuning GPT-4o Mini costs $0.30 per 1M training tokens. Inference on fine-tuned models costs $0.30 input and $1.20 output per million tokens, double the base rate. You need at least 10 training examples, though 50-100 examples typically produce better results.
What is GPT-4o Mini good for?
GPT-4o Mini excels at classification, entity extraction, content routing, simple Q&A, and any task where the output is short and structured. It matches GPT-4o closely on binary classification tasks while costing 94% less.
What is the GPT-4o Mini context window?
GPT-4o Mini supports 128K tokens of context (roughly 96K words). However, quality degrades past 32K tokens for most tasks. If you need reliable long-context performance, GPT-4.1 Nano offers 1M tokens.
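If you stay on Mini, the safer pattern for large documents is splitting them into overlapping chunks and querying each chunk separately. A minimal sketch, approximating 1 token as 4 characters (for exact counts, use a tokenizer such as tiktoken):

```python
def chunk_text(text: str, max_tokens: int = 8000, overlap_tokens: int = 200) -> list[str]:
    """Split text into overlapping chunks that stay well under the
    ~32K-token range where Mini's quality holds up.

    Uses a rough 1 token ~= 4 characters heuristic; the overlap keeps
    sentences that straddle a boundary visible in both chunks."""
    size, overlap = max_tokens * 4, overlap_tokens * 4
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# A ~25K-token document becomes a handful of 8K-token chunks.
chunks = chunk_text("x" * 100_000, max_tokens=8000, overlap_tokens=200)
```

Each chunk is then sent as its own request, and the per-chunk answers are merged or re-ranked in a final step.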
Is GPT-4o Mini good for coding?
GPT-4o Mini handles simple code tasks like boilerplate generation, syntax fixes, and code explanation. It struggles with complex refactoring, multi-file changes, and architectural decisions. For serious coding, use GPT-4.1 or Claude Sonnet 4.6.
How does GPT-4o Mini compare to Claude Haiku?
Claude Haiku 4.5 costs $0.80/$4.00 per million tokens, significantly more expensive than GPT-4o Mini at $0.15/$0.60. Mini is roughly 5x cheaper on input and 7x cheaper on output. Haiku offers a 200K context window and is generally stronger on nuanced text tasks, but Mini wins on price for simple workloads.