GPT-4o Pricing: What It Costs Per Token (April 2026)

GPT-4o launched in May 2024 at $5/$15 per million tokens and got a 50% price cut in October 2024. It now costs $2.50 input and $10 output per million tokens. OpenAI has since released GPT-4.1 as the recommended production replacement, but GPT-4o remains available and widely used. If you are running GPT-4o in production or evaluating whether to switch, this page covers everything: current per-token pricing, batch and caching discounts, comparison to GPT-4.1, and real-world cost estimates.

GPT-4o

$2.50 / $10.00 per 1M input / output tokens
  • Multimodal: text, image, and audio input
  • 128K context window
  • Function calling and structured outputs
  • Vision capabilities included at no extra charge
  • 50% cached input discount ($1.25/1M)

GPT-4o Mini

$0.15 / $0.60 per 1M input / output tokens
  • Cheapest GPT-4o variant
  • 128K context window
  • Good for classification, extraction, simple tasks
  • Function calling supported
  • 50% cached input discount ($0.075/1M)

GPT-4o Audio

$40.00 / $80.00 per 1M audio tokens
  • Native audio input and output
  • Text tokens still at standard GPT-4o rates
  • Audio tokens are significantly more expensive
  • Best for voice-first applications

GPT-4o (Batch)

$1.25 / $5.00 per 1M input / output tokens
  • 50% discount on standard pricing
  • 24-hour turnaround SLA
  • Same quality as synchronous API
  • Best for bulk processing and evaluation

GPT-4o Pricing Table (April 2026)

[Figure: GPT-4o pricing comparison showing input and output costs versus GPT-4.1 and mini models]

Here is every GPT-4o variant with current per-token pricing. All prices are per 1 million tokens unless noted otherwise.

| Model | Input | Cached Input | Output | Context |
|---|---|---|---|---|
| GPT-4o | $2.50 | $1.25 | $10.00 | 128K |
| GPT-4o Mini | $0.15 | $0.075 | $0.60 | 128K |
| GPT-4o Audio (text) | $2.50 | $1.25 | $10.00 | 128K |
| GPT-4o Audio (audio) | $40.00 | — | $80.00 | 128K |
| GPT-4o (Batch) | $1.25 | — | $5.00 | 128K |
| GPT-4o Mini (Batch) | $0.075 | — | $0.30 | 128K |

GPT-4o Price History

GPT-4o launched in May 2024 at $5.00 input and $15.00 output per million tokens. In October 2024, OpenAI cut prices by 50% to the current $2.50/$10.00. GPT-4o Mini launched in July 2024 at $0.15/$0.60 and has not changed since. The price cuts made GPT-4o competitive with Claude 3.5 Sonnet at the time, though both providers have since released newer models.

| Date | Model | Input/1M | Output/1M | Change |
|---|---|---|---|---|
| May 2024 | GPT-4o | $5.00 | $15.00 | Launch price |
| Jul 2024 | GPT-4o Mini | $0.15 | $0.60 | New model |
| Oct 2024 | GPT-4o | $2.50 | $10.00 | -50% |
| Jan 2025 | GPT-4.1 | $2.00 | $8.00 | New (replaced 4o) |

GPT-4o vs GPT-4.1: Which Should You Use?

GPT-4.1 replaced GPT-4o as OpenAI's recommended production model in January 2025. It is cheaper ($2.00/$8.00 vs $2.50/$10.00), scores higher on coding benchmarks (SWE-bench), handles instruction following better, and supports a 1 million token context window versus GPT-4o's 128K. The main reason to stay on GPT-4o is vision support. GPT-4.1 does not accept image inputs. For text-only applications, GPT-4.1 is strictly better on price and performance. The API interface is identical, so migration requires only changing the model parameter from gpt-4o to gpt-4.1.

| Feature | GPT-4o | GPT-4.1 |
|---|---|---|
| Input price | $2.50/1M | $2.00/1M |
| Output price | $10.00/1M | $8.00/1M |
| Context window | 128K tokens | 1M tokens |
| Vision support | Yes | No |
| Recommended for | Multimodal | Production text |
| Batch discount | 50% | 50% |
| Cache discount | 50% | 75% |

GPT-4o vs GPT-4o Mini: When to Use Each

GPT-4o Mini costs 94% less than GPT-4o on both input and output tokens. For tasks that do not require GPT-4o's full capabilities (classification, entity extraction, simple Q&A, content routing), Mini delivers comparable results at a fraction of the cost. The quality gap shows up on complex reasoning, nuanced writing, and multi-step tool use. A common pattern is to use Mini for initial filtering or classification and route only complex requests to the full GPT-4o or GPT-4.1 model. This hybrid approach can cut API costs by 80% or more depending on your traffic mix.
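
The routing pattern above can be sketched in a few lines. The length check here is a hypothetical stand-in for a real complexity classifier (which could itself be a Mini call); the model names come from the pricing tables on this page.

```python
def pick_model(prompt: str, needs_vision: bool = False) -> str:
    """Route cheap, simple requests to GPT-4o Mini and everything
    else to a full model. Vision requests must stay on GPT-4o,
    since GPT-4.1 does not accept image input."""
    if needs_vision:
        return "gpt-4o"
    if len(prompt.split()) < 50:  # stand-in heuristic for "simple task"
        return "gpt-4o-mini"
    return "gpt-4.1"

print(pick_model("Classify this ticket: printer offline"))  # gpt-4o-mini
```

In practice the heuristic matters less than the split itself: if 80% of traffic lands on Mini, blended cost per request drops accordingly.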

Batch API: Half-Price GPT-4o

OpenAI's Batch API processes requests asynchronously at 50% off. GPT-4o Batch pricing is $1.25 input and $5.00 output per million tokens. Jobs complete within 24 hours. The Batch API accepts the same request format as the synchronous Chat Completions API, making integration straightforward. Best for: bulk content generation, large-scale data labeling, evaluation pipelines, and any workload where latency is not critical. You submit a JSONL file of requests and poll for results.
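
A minimal sketch of building that JSONL file, assuming the standard Batch request envelope (a `custom_id` plus a wrapped Chat Completions body); `build_batch_request` is our helper name, not an SDK function:

```python
import json

def build_batch_request(custom_id: str, prompt: str, model: str = "gpt-4o") -> str:
    """One JSONL line for the Batch API: a Chat Completions request
    wrapped with a custom_id so results can be matched back."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

# One line per request; upload the file, then create a batch job against it.
lines = [build_batch_request(f"req-{i}", f"Summarize document {i}") for i in range(3)]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines))
```

From there you upload the file with purpose `batch`, create the batch with a 24h completion window, and poll the job until it reports completion.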

| Model | Sync Input | Batch Input | Sync Output | Batch Output |
|---|---|---|---|---|
| GPT-4o | $2.50 | $1.25 | $10.00 | $5.00 |
| GPT-4o Mini | $0.15 | $0.075 | $0.60 | $0.30 |

Prompt Caching and How It Reduces GPT-4o Costs

OpenAI automatically caches the prefix of your prompt. When a subsequent request starts with the same tokens, cached input tokens are billed at 50% off ($1.25/1M instead of $2.50/1M for GPT-4o). Caching works best with long, stable system prompts and few-shot examples. The cache has a short TTL, typically a few minutes, so it helps most with high-throughput applications making frequent similar requests. Structure your prompts with static content first and dynamic content last to maximize cache hits.
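
To see what caching is worth, blend the two input rates by your expected cache hit rate. A rough sketch using the GPT-4o prices from the table above (the function name is ours):

```python
def effective_input_cost_per_m(hit_rate: float,
                               base: float = 2.50,
                               cached: float = 1.25) -> float:
    """Blended GPT-4o input cost per 1M tokens when a fraction of
    tokens hit the 50%-off prompt cache ($1.25 vs $2.50)."""
    return hit_rate * cached + (1 - hit_rate) * base

# A long, stable system prompt that covers 60% of input tokens:
print(round(effective_input_cost_per_m(0.6), 2))  # 1.75
```

So an application whose prompts are mostly static prefix pays closer to the cached rate than the list price.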

Real-World Cost Examples

Here is what GPT-4o actually costs for common use cases, assuming average token counts per request.

| Use Case | Input Tokens | Output Tokens | Cost per Request | Cost per 1K Requests |
|---|---|---|---|---|
| Chatbot reply | ~500 | ~200 | $0.0033 | $3.25 |
| Document summary (5 pages) | ~3,000 | ~500 | $0.0125 | $12.50 |
| Code generation | ~1,000 | ~800 | $0.0105 | $10.50 |
| Image analysis | ~1,500 | ~300 | $0.0068 | $6.75 |
| RAG with 10K context | ~10,000 | ~500 | $0.0300 | $30.00 |
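
The per-request figures are just token counts times the per-million rates. A small sketch that reproduces them (helper name is ours):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float = 2.50, out_price: float = 10.00) -> float:
    """Cost of one GPT-4o request in dollars; prices are per 1M tokens."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

print(round(request_cost(500, 200), 5))     # 0.00325 (chatbot reply)
print(round(request_cost(10_000, 500), 4))  # 0.03    (RAG with 10K context)
```

Swap in $1.25/$5.00 as the price arguments to get the Batch API equivalents.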

Rate Limits by Spending Tier

OpenAI sets rate limits based on your total API spending. New accounts start at the free tier with modest limits. As you spend more, limits increase automatically. For GPT-4o specifically:

| Tier | Requirement | RPM | TPM | RPD |
|---|---|---|---|---|
| Free | Verified account | 500 | 30K | 500 |
| Tier 1 | $5+ spent | 500 | 30K | 10K |
| Tier 2 | $50+ spent | 5K | 450K | — |
| Tier 3 | $100+ spent | 5K | 800K | — |
| Tier 4 | $250+ spent | 10K | 2M | — |
| Tier 5 | $1,000+ spent | 10K | 30M | — |

How to Estimate Your Monthly GPT-4o Bill

Multiply your average input tokens per request by your request volume, then divide by 1 million and multiply by $2.50. Do the same for output tokens at $10.00 per million. Example: 10,000 requests/day averaging 1,000 input and 300 output tokens each = 10M input tokens + 3M output tokens per day. Monthly: 300M input ($750) + 90M output ($900) = $1,650/month at standard rates, or $825/month with Batch API. Add caching if your prompts have stable prefixes.
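
The same worked example as a small helper (function name and 30-day month are our assumptions):

```python
def monthly_bill(requests_per_day: int, avg_in: int, avg_out: int,
                 in_price: float = 2.50, out_price: float = 10.00,
                 days: int = 30) -> float:
    """Estimated monthly GPT-4o spend in dollars; prices per 1M tokens."""
    in_tokens = requests_per_day * avg_in * days
    out_tokens = requests_per_day * avg_out * days
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

print(monthly_bill(10_000, 1_000, 300))              # 1650.0 at standard rates
print(monthly_bill(10_000, 1_000, 300, 1.25, 5.00))  # 825.0 with Batch API
```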

GPT-4o vs Claude Sonnet 4.6: Cross-Provider Comparison

Anthropic's Claude Sonnet 4.6 is the closest competitor to GPT-4o. Claude Sonnet 4.6 costs $3.00 input and $15.00 output per million tokens, more expensive than GPT-4o on both dimensions. Claude offers a 200K context window versus GPT-4o's 128K. Performance is comparable on most benchmarks, with Claude generally stronger on long-form writing and analysis, and GPT-4o stronger on structured output and tool use. For cost-sensitive applications, GPT-4o (or better yet, GPT-4.1) has the edge.

| Model | Input/1M | Output/1M | Context | Vision |
|---|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 128K | Yes |
| GPT-4.1 | $2.00 | $8.00 | 1M | No |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 200K | Yes |
| Claude Haiku 4.5 | $0.80 | $4.00 | 200K | Yes |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Yes |

Migration Guide: GPT-4o to GPT-4.1

If you are running GPT-4o for text-only applications, switching to GPT-4.1 saves 20% on every API call with no quality loss. The migration is simple: change the model parameter from "gpt-4o" to "gpt-4.1" in your API calls. Both models use the same Chat Completions API format, support the same tools (function calling, structured outputs, JSON mode), and accept the same parameters. Test with a small sample first to verify output quality meets your requirements, then switch over. The only blocker is if you rely on vision input. GPT-4.1 does not support images.
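
Since both models share the Chat Completions format, the migration really is a one-string change. A sketch of the request kwargs (the helper is ours, not an SDK function):

```python
def chat_request(model: str, user_message: str) -> dict:
    """Kwargs for a Chat Completions call; identical shape for
    both models, only the model string changes."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

before = chat_request("gpt-4o", "Draft a release note.")
after = chat_request("gpt-4.1", "Draft a release note.")
# Only the model field differs between the two requests:
assert {k for k in before if before[k] != after[k]} == {"model"}
```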

Hidden Costs & Gotchas

  • GPT-4.1 is cheaper for most use cases: GPT-4.1 costs $2.00/$8.00 per 1M tokens versus GPT-4o's $2.50/$10.00, 20% cheaper on both input and output. Unless you need vision or audio specifically, GPT-4.1 is the better deal with stronger benchmark performance.
  • Output tokens cost 4x input tokens: GPT-4o charges $2.50 per 1M input tokens but $10.00 per 1M output tokens. Applications that generate long responses (summaries, articles, code) will see output dominate the bill. Budget based on your output-to-input ratio.
  • Vision queries use image tokens: Sending images to GPT-4o converts them to tokens based on resolution. A 1024x1024 image uses roughly 765 tokens. High-res mode can use 2-3x more. Vision is included in the base price, but the token count adds up fast.
  • Audio tokens are up to 16x more expensive: GPT-4o audio input costs $40 per 1M tokens and output costs $80 per 1M tokens, 16x and 8x the text token prices respectively. For voice applications, calculate audio token costs separately.
  • Prompt caching only helps on repeated prefixes: The 50% cached input discount applies only when the beginning of your prompt matches a recent request exactly. System prompts and few-shot examples benefit most. Dynamic user inputs typically do not get cached.
  • Rate limits depend on your spending tier: Free tier users get 500 RPM. Tier 1 ($5+ spent) gets 500 RPM and 30K TPM. Tier 5 ($1,000+ spent) gets 10K RPM and 30M TPM. Your effective throughput depends on your account's spending history.
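
The image-token gotcha is easy to quantify: image tokens are billed at the standard input rate, so a rough per-image cost is just tokens times price. A sketch using the ~765-token figure cited above (helper name is ours):

```python
def image_input_cost(image_tokens: int = 765, in_price: float = 2.50) -> float:
    """Dollar cost of an image's tokens, billed as GPT-4o input
    (price per 1M tokens)."""
    return image_tokens / 1e6 * in_price

# A ~765-token 1024x1024 image at $2.50/1M input:
print(round(image_input_cost(), 4))  # 0.0019
```

High-res mode at 2-3x the tokens scales this linearly; it is the request volume, not any single image, that moves the bill.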

Which Plan Do You Need?

Prototyping or light usage

GPT-4o Mini at $0.15/$0.60 per 1M tokens. 94% cheaper than standard GPT-4o for simple tasks. Use this for classification, extraction, routing, and early-stage development.

Production text applications

Migrate to GPT-4.1 ($2.00/$8.00). It's 20% cheaper than GPT-4o, benchmarks higher on coding and instruction following, and has a 1M token context window. GPT-4o is no longer the recommended production model.

Vision and multimodal workflows

GPT-4o is still the go-to for image understanding. GPT-4.1 does not support vision input. If your application sends images, stay on GPT-4o or evaluate o4-mini for reasoning-heavy vision tasks.

Bulk processing and evaluation

Batch API at $1.25/$5.00 per 1M tokens. Half the cost of synchronous GPT-4o, with 24-hour turnaround. Ideal for data labeling, content generation, and test suite evaluation.

The Bottom Line

GPT-4o at $2.50/$10.00 per million tokens is no longer OpenAI's best value for text-only applications. GPT-4.1 is 20% cheaper and performs better on most benchmarks. But GPT-4o remains the right choice for vision and multimodal workflows where GPT-4.1 has no support. If you're starting fresh, default to GPT-4.1 for text and GPT-4o for images. If you're already running GPT-4o, the migration is straightforward since they share the same API format.

Disclosure: Pricing information is sourced from official websites and may change. We update this page regularly but always verify current pricing on the vendor's site before purchasing.

Related Resources

  • Full OpenAI API pricing for all models
  • OpenAI API review and features
  • OpenAI vs Anthropic API pricing comparison
  • Best LLM frameworks for building with GPT-4o
  • Anthropic API pricing (Claude models)

Frequently Asked Questions

How much does GPT-4o cost per token?

GPT-4o costs $2.50 per 1 million input tokens and $10.00 per 1 million output tokens. With prompt caching, input tokens drop to $1.25 per million. Batch API pricing is $1.25 input and $5.00 output per million tokens.

Is GPT-4o cheaper than GPT-4?

Yes. GPT-4o is significantly cheaper than the original GPT-4 ($30/$60 per 1M tokens) and GPT-4 Turbo ($10/$30 per 1M tokens). GPT-4o costs $2.50/$10.00, roughly 75% cheaper than GPT-4 Turbo.

Should I use GPT-4o or GPT-4.1?

Use GPT-4.1 for text-only applications; it is 20% cheaper and benchmarks higher. Use GPT-4o only if you need vision (image input) or audio capabilities, which GPT-4.1 does not support.

What is GPT-4o Mini and when should I use it?

GPT-4o Mini costs $0.15/$0.60 per million tokens, 94% cheaper than GPT-4o. Use it for simple tasks like classification, extraction, routing, and early-stage development. Switch to GPT-4o or GPT-4.1 for complex reasoning and nuanced generation.

Does GPT-4o have a free tier?

The OpenAI API does not have a free tier for GPT-4o. New accounts get $5 in credits. ChatGPT Free includes limited GPT-4o access, but that is the consumer product, not the API.

How does GPT-4o batch pricing work?

Submit a JSONL file of requests to the Batch API endpoint. Processing completes within 24 hours. Pricing is 50% off standard rates: $1.25 input and $5.00 output per million tokens. Same quality as synchronous calls.

Is GPT-4o being deprecated?

OpenAI has not announced a deprecation date for GPT-4o. It is no longer the recommended production model (GPT-4.1 replaced it), but it remains available. OpenAI typically gives at least 6 months notice before deprecating models.

How much does it cost to analyze an image with GPT-4o?

Image analysis uses GPT-4o's standard text token pricing plus additional tokens for the image. A 1024x1024 image uses roughly 765 tokens ($0.002). High-res images use more tokens. There is no separate per-image charge.

What is the context window for GPT-4o?

GPT-4o supports 128K tokens of context (roughly 96K words). Both GPT-4o and GPT-4o Mini share this 128K limit. If you need more context, GPT-4.1 supports 1 million tokens.

How does GPT-4o pricing compare to Claude?

GPT-4o ($2.50/$10.00) is cheaper than Claude Sonnet 4.6 ($3.00/$15.00) on both input and output. Claude Haiku 4.5 ($0.80/$4.00) is cheaper than GPT-4o but more expensive than GPT-4o Mini ($0.15/$0.60).
