GPT-4o Pricing: What It Costs Per Token (April 2026)
GPT-4o launched in May 2024 at $5/$15 per million tokens and got a 50% price cut in October 2024. It now costs $2.50 input and $10 output per million tokens. OpenAI has since released GPT-4.1 as the recommended production replacement, but GPT-4o remains available and widely used. If you are running GPT-4o in production or evaluating whether to switch, this page covers everything: current per-token pricing, batch and caching discounts, comparison to GPT-4.1, and real-world cost estimates.
GPT-4o
- ✓ Multimodal: text, image, and audio input
- ✓ 128K context window
- ✓ Function calling and structured outputs
- ✓ Vision capabilities included at no extra charge
- ✓ 50% cached input discount ($1.25/1M)
GPT-4o Mini
- ✓ Cheapest GPT-4o variant
- ✓ 128K context window
- ✓ Good for classification, extraction, simple tasks
- ✓ Function calling supported
- ✓ 50% cached input discount ($0.075/1M)
GPT-4o Audio
- ✓ Native audio input and output
- ✓ Text tokens still at standard GPT-4o rates
- ✓ Audio tokens are significantly more expensive
- ✓ Best for voice-first applications
GPT-4o (Batch)
- ✓ 50% discount on standard pricing
- ✓ 24-hour turnaround SLA
- ✓ Same quality as synchronous API
- ✓ Best for bulk processing and evaluation
GPT-4o Pricing Table (April 2026)
Here is every GPT-4o variant with current per-token pricing. All prices are per 1 million tokens unless noted otherwise.

| Model | Input | Cached Input | Output |
|---|---|---|---|
| GPT-4o | $2.50 | $1.25 | $10.00 |
| GPT-4o Mini | $0.15 | $0.075 | $0.60 |
| GPT-4o (Batch) | $1.25 | n/a | $5.00 |
| GPT-4o Audio (audio tokens) | $40.00 | n/a | $80.00 |

GPT-4o Audio bills text tokens at the standard GPT-4o rates; the audio rates above apply to audio tokens only.
GPT-4o Price History
GPT-4o launched in May 2024 at $5.00 input and $15.00 output per million tokens. In October 2024, OpenAI cut prices by 50% to the current $2.50/$10.00. GPT-4o Mini launched in July 2024 at $0.15/$0.60 and has not changed since. The price cuts made GPT-4o competitive with Claude 3.5 Sonnet at the time, though both providers have since released newer models.
GPT-4o vs GPT-4.1: Which Should You Use?
GPT-4.1 replaced GPT-4o as OpenAI's recommended production model in January 2025. It is cheaper ($2.00/$8.00 vs $2.50/$10.00), scores higher on coding benchmarks (SWE-bench), handles instruction following better, and supports a 1 million token context window versus GPT-4o's 128K. The main reason to stay on GPT-4o is vision support. GPT-4.1 does not accept image inputs. For text-only applications, GPT-4.1 is strictly better on price and performance. The API interface is identical, so migration requires only changing the model parameter from gpt-4o to gpt-4.1.
GPT-4o vs GPT-4o Mini: When to Use Each
GPT-4o Mini costs 94% less than GPT-4o on both input and output tokens. For tasks that do not require GPT-4o's full capabilities (classification, entity extraction, simple Q&A, content routing), Mini delivers comparable results at a fraction of the cost. The quality gap shows up on complex reasoning, nuanced writing, and multi-step tool use. A common pattern is to use Mini for initial filtering or classification and route only complex requests to the full GPT-4o or GPT-4.1 model. This hybrid approach can cut API costs by 80% or more depending on your traffic mix.
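The routing pattern above can be sketched in a few lines. The 90/10 traffic split, the per-request token counts, and the complexity threshold are all illustrative assumptions, not measurements:

```python
# Sketch: route requests by a complexity score and compare the blended
# cost to sending everything to GPT-4o. Traffic mix and token counts
# are illustrative assumptions.

# Per-1M-token prices from the pricing table.
PRICES = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def pick_model(complexity: float, threshold: float = 0.7) -> str:
    """Send only complex requests to the full model (threshold is a tunable assumption)."""
    return "gpt-4o" if complexity >= threshold else "gpt-4o-mini"

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Assume 1,000 input / 300 output tokens per request and 90% simple traffic.
full_only = request_cost("gpt-4o", 1000, 300)
blended = 0.9 * request_cost("gpt-4o-mini", 1000, 300) + 0.1 * full_only
savings = 1 - blended / full_only
print(f"savings: {savings:.0%}")  # roughly 85% under these assumptions
```

How you score "complexity" is the hard part in practice; a cheap heuristic or a Mini-based classifier are both common choices.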
Batch API: Half-Price GPT-4o
OpenAI's Batch API processes requests asynchronously at 50% off. GPT-4o Batch pricing is $1.25 input and $5.00 output per million tokens. Jobs complete within 24 hours. The Batch API accepts the same request format as the synchronous Chat Completions API, making integration straightforward. Best for: bulk content generation, large-scale data labeling, evaluation pipelines, and any workload where latency is not critical. You submit a JSONL file of requests and poll for results.
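Building the input file looks like this. The request-line shape follows the Batch API's JSONL format; the `custom_id` values and prompts are placeholders:

```python
import json

# Sketch: each JSONL line wraps one Chat Completions request for the
# Batch API. The custom_id values and prompts are placeholders.
def batch_line(custom_id: str, prompt: str) -> str:
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": prompt}],
        },
    })

lines = [batch_line(f"req-{i}", f"Label this record: {i}") for i in range(3)]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines))
```

You then upload the file and create a batch job against it, polling until results are ready.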
Prompt Caching and How It Reduces GPT-4o Costs
OpenAI automatically caches the prefix of your prompt. When a subsequent request starts with the same tokens, cached input tokens are billed at 50% off ($1.25/1M instead of $2.50/1M for GPT-4o). Caching works best with long, stable system prompts and few-shot examples. The cache has a short TTL, typically a few minutes, so it helps most with high-throughput applications making frequent similar requests. Structure your prompts with static content first and dynamic content last to maximize cache hits.
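The effect on your blended input price is simple arithmetic. A sketch, where the cacheable fraction of the prompt is an assumption you would measure for your own traffic:

```python
# Sketch: effective GPT-4o input price per 1M tokens when a fraction of
# input tokens hit the prompt cache (cached tokens bill at 50% off).
BASE_INPUT = 2.50    # $ per 1M input tokens
CACHED_INPUT = 1.25  # $ per 1M cached input tokens

def effective_input_price(cached_fraction: float) -> float:
    return cached_fraction * CACHED_INPUT + (1 - cached_fraction) * BASE_INPUT

# Assume an 800-token static system prompt in a 1,000-token prompt (80% cacheable).
print(effective_input_price(0.8))  # 1.50
```

This is why putting static content first matters: only the matching prefix is cacheable, so a dynamic field early in the prompt forfeits the discount for everything after it.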
Real-World Cost Examples
Here is what GPT-4o actually costs for common use cases, assuming average token counts per request.
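The per-request arithmetic can be sketched as follows; the token counts per use case are illustrative assumptions, not measurements:

```python
# Per-request cost sketch for a few hypothetical workloads.
# Token counts per use case are illustrative assumptions.
PRICES = {  # ($ per 1M input tokens, $ per 1M output tokens)
    "gpt-4o":      (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

examples = {
    "chat turn (gpt-4o, 500 in / 200 out)":        cost("gpt-4o", 500, 200),
    "summarize a doc (gpt-4o, 4000 in / 500 out)": cost("gpt-4o", 4000, 500),
    "classify (gpt-4o-mini, 300 in / 10 out)":     cost("gpt-4o-mini", 300, 10),
}
for name, c in examples.items():
    print(f"{name}: ${c:.5f}")
```

Note how the output-heavy summarization case costs nearly five times the chat turn, and the Mini classification case is effectively free at scale.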
Rate Limits by Spending Tier
OpenAI sets rate limits based on your total API spending. New accounts start at the free tier with modest limits, and limits increase automatically as you spend more. For GPT-4o specifically, Tier 1 accounts get 500 RPM and 30K TPM, scaling up to 10K RPM and 30M TPM at Tier 5.
How to Estimate Your Monthly GPT-4o Bill
Multiply your average input tokens per request by your request volume, then divide by 1 million and multiply by $2.50. Do the same for output tokens at $10.00 per million. Example: 10,000 requests/day averaging 1,000 input and 300 output tokens each = 10M input tokens + 3M output tokens per day. Monthly: 300M input ($750) + 90M output ($900) = $1,650/month at standard rates, or $825/month with Batch API. Add caching if your prompts have stable prefixes.
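The worked example above as a small calculator (30-day month, standard versus Batch rates):

```python
# Monthly bill estimator reproducing the worked example above:
# 10,000 requests/day at 1,000 input and 300 output tokens each.
def monthly_cost(requests_per_day: int, in_tokens: int, out_tokens: int,
                 in_price: float = 2.50, out_price: float = 10.00,
                 days: int = 30) -> float:
    monthly_in = requests_per_day * in_tokens * days / 1_000_000   # millions of tokens
    monthly_out = requests_per_day * out_tokens * days / 1_000_000
    return monthly_in * in_price + monthly_out * out_price

standard = monthly_cost(10_000, 1000, 300)
batch = monthly_cost(10_000, 1000, 300, in_price=1.25, out_price=5.00)
print(standard, batch)  # 1650.0 825.0
```

Swap in your own request volume and token averages; the price arguments default to standard GPT-4o rates.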
GPT-4o vs Claude Sonnet 4.6: Cross-Provider Comparison
Anthropic's Claude Sonnet 4.6 is the closest competitor to GPT-4o. Claude Sonnet 4.6 costs $3.00 input and $15.00 output per million tokens, more expensive than GPT-4o on both dimensions. Claude offers a 200K context window versus GPT-4o's 128K. Performance is comparable on most benchmarks, with Claude generally stronger on long-form writing and analysis, and GPT-4o stronger on structured output and tool use. For cost-sensitive applications, GPT-4o (or better yet, GPT-4.1) has the edge.
Migration Guide: GPT-4o to GPT-4.1
If you are running GPT-4o for text-only applications, switching to GPT-4.1 saves 20% on every API call with no quality loss. The migration is simple: change the model parameter from "gpt-4o" to "gpt-4.1" in your API calls. Both models use the same Chat Completions API format, support the same tools (function calling, structured outputs, JSON mode), and accept the same parameters. Test with a small sample first to verify output quality meets your requirements, then switch over. The only blocker is if you rely on vision input. GPT-4.1 does not support images.
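The switch really is just the model string. A minimal sketch (the prompt is a placeholder; running the request requires the `openai` SDK and an API key):

```python
# Sketch: for text-only workloads the only change is the model name;
# the Chat Completions payload shape is identical between the two models.
def build_request(model: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": "Summarize this ticket."}],
    }

old = build_request("gpt-4o")
new = {**old, "model": "gpt-4.1"}  # the entire migration

assert old["messages"] == new["messages"]  # everything else is unchanged
print(new["model"])  # gpt-4.1
```

Run a sample of real traffic through both payloads and diff the outputs before cutting over fully.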
Hidden Costs & Gotchas
- ⚠ GPT-4.1 is cheaper for most use cases: GPT-4.1 costs $2.00/$8.00 per 1M tokens versus GPT-4o's $2.50/$10.00, 20% cheaper on both input and output. Unless you need vision or audio specifically, GPT-4.1 is the better deal with stronger benchmark performance.
- ⚠ Output tokens cost 4x input tokens: GPT-4o charges $2.50 per 1M input tokens but $10.00 per 1M output tokens. Applications that generate long responses (summaries, articles, code) will see output dominate the bill. Budget based on your output-to-input ratio.
- ⚠ Vision queries use image tokens: Sending images to GPT-4o converts them to tokens based on resolution. A 1024x1024 image uses roughly 765 tokens, and high-res mode can use 2-3x more. Vision is included in the base price, but the token count adds up fast.
- ⚠ Audio tokens are up to 16x more expensive: GPT-4o audio input costs $40 per 1M tokens and output costs $80 per 1M tokens, 16x and 8x the text token prices respectively. For voice applications, calculate audio token costs separately.
- ⚠ Prompt caching only helps on repeated prefixes: The 50% cached input discount applies only when the beginning of your prompt matches a recent request exactly. System prompts and few-shot examples benefit most. Dynamic user inputs typically do not get cached.
- ⚠ Rate limits depend on your spending tier: Tier 1 accounts ($5 spent) get 500 RPM and 30K TPM on GPT-4o. Tier 5 ($1,000+ spent) gets 10K RPM and 30M TPM. Your effective throughput depends on your account's spending history.
Which Plan Do You Need?
Prototyping or light usage
GPT-4o Mini at $0.15/$0.60 per 1M tokens. 94% cheaper than standard GPT-4o for simple tasks. Use this for classification, extraction, routing, and early-stage development.
Production text applications
Migrate to GPT-4.1 ($2.00/$8.00). It's 20% cheaper than GPT-4o, benchmarks higher on coding and instruction following, and has a 1M token context window. GPT-4o is no longer the recommended production model.
Vision and multimodal workflows
GPT-4o is still the go-to for image understanding. GPT-4.1 does not support vision input. If your application sends images, stay on GPT-4o or evaluate o4-mini for reasoning-heavy vision tasks.
Bulk processing and evaluation
Batch API at $1.25/$5.00 per 1M tokens. Half the cost of synchronous GPT-4o, with 24-hour turnaround. Ideal for data labeling, content generation, and test suite evaluation.
The Bottom Line
GPT-4o at $2.50/$10.00 per million tokens is no longer OpenAI's best value for text-only applications. GPT-4.1 is 20% cheaper and performs better on most benchmarks. But GPT-4o remains the right choice for vision and multimodal workflows where GPT-4.1 has no support. If you're starting fresh, default to GPT-4.1 for text and GPT-4o for images. If you're already running GPT-4o, the migration is straightforward since they share the same API format.
Frequently Asked Questions
How much does GPT-4o cost per token?
GPT-4o costs $2.50 per 1 million input tokens and $10.00 per 1 million output tokens. With prompt caching, input tokens drop to $1.25 per million. Batch API pricing is $1.25 input and $5.00 output per million tokens.
Is GPT-4o cheaper than GPT-4?
Yes. GPT-4o is significantly cheaper than the original GPT-4 ($30/$60 per 1M tokens) and GPT-4 Turbo ($10/$30 per 1M tokens). GPT-4o costs $2.50/$10.00, roughly 75% cheaper than GPT-4 Turbo.
Should I use GPT-4o or GPT-4.1?
Use GPT-4.1 for text-only applications: it is 20% cheaper and benchmarks higher. Use GPT-4o only if you need vision (image input) or audio capabilities, which GPT-4.1 does not support.
What is GPT-4o Mini and when should I use it?
GPT-4o Mini costs $0.15/$0.60 per million tokens, 94% cheaper than GPT-4o. Use it for simple tasks like classification, extraction, routing, and early-stage development. Switch to GPT-4o or GPT-4.1 for complex reasoning and nuanced generation.
Does GPT-4o have a free tier?
The OpenAI API does not have a free tier for GPT-4o. New accounts get $5 in credits. ChatGPT Free includes limited GPT-4o access, but that is the consumer product, not the API.
How does GPT-4o batch pricing work?
Submit a JSONL file of requests to the Batch API endpoint. Processing completes within 24 hours. Pricing is 50% off standard rates: $1.25 input and $5.00 output per million tokens. Same quality as synchronous calls.
Is GPT-4o being deprecated?
OpenAI has not announced a deprecation date for GPT-4o. It is no longer the recommended production model (GPT-4.1 replaced it), but it remains available. OpenAI typically gives at least 6 months notice before deprecating models.
How much does it cost to analyze an image with GPT-4o?
Image analysis uses GPT-4o's standard text token pricing plus additional tokens for the image. A 1024x1024 image uses roughly 765 tokens ($0.002). High-res images use more tokens. There is no separate per-image charge.
What is the context window for GPT-4o?
GPT-4o supports 128K tokens of context (roughly 96K words). Both GPT-4o and GPT-4o Mini share this 128K limit. If you need more context, GPT-4.1 supports 1 million tokens.
How does GPT-4o pricing compare to Claude?
GPT-4o ($2.50/$10.00) is cheaper than Claude Sonnet 4.6 ($3.00/$15.00) on both input and output. Claude Haiku 4.5 ($0.80/$4.00) is cheaper than GPT-4o but more expensive than GPT-4o Mini ($0.15/$0.60).