Cohere API Pricing: Every Model Compared (April 2026)

Cohere offers three product lines: Command (text generation), Embed (vector embeddings), and Rerank (search result reranking). Unlike OpenAI and Anthropic, Cohere has a meaningful free trial tier with 100 API calls per minute and 1,000 per month. The paid models are competitively priced: Command R+ at $2.50/$10 per million input/output tokens undercuts Claude Sonnet 4.6 ($3/$15) and sits slightly above GPT-4.1 ($2/$8). This page covers every model's pricing.

Trial (Free)

$0 Rate-limited

✓ 100 API calls per minute
✓ 1,000 calls per month
✓ Access to all models
✓ No credit card required
✓ Enough for prototyping and evaluation

Command R+

$2.50 / $10 per 1M input / output tokens

✓ Most capable text generation model
✓ Strong at RAG and tool use
✓ 128K context window
✓ Multilingual support (10+ languages)
✓ Cheaper than Sonnet 4.6, slightly pricier than GPT-4.1

Command R

$0.15 / $0.60 per 1M input / output tokens

✓ Budget text generation model
✓ Good for classification and extraction
✓ 128K context window
✓ Comparable to GPT-4.1 Nano on simple tasks
✓ Very cost-effective at scale

Embed v3

$0.10 per 1M tokens

✓ High-quality text embeddings
✓ 1024 dimensions (configurable)
✓ Multilingual support
✓ Comparable to OpenAI text-embedding-3-small
✓ Well suited to search and RAG

Rerank 3.5

$2.00 per 1,000 searches

✓ Reranks up to 100 docs per search
✓ Dramatically improves RAG accuracy
✓ Works with any vector database
✓ Simple API, just send query + documents
✓ Documents >500 tokens split into chunks

Cohere vs OpenAI vs Anthropic: Pricing Comparison

Here's how Cohere's models stack up against the competition on price.

Use Case	Cohere	OpenAI	Anthropic
Budget generation	Command R: $0.15/$0.60	GPT-4.1 Nano: $0.10/$0.40	Haiku 4.5: $1/$5
Production generation	Command R+: $2.50/$10	GPT-4.1: $2/$8	Sonnet 4.6: $3/$15
Embeddings	Embed v3: $0.10/1M	text-embedding-3-small: $0.02/1M	—
Reranking	Rerank 3.5: $2/1K searches	—	—
Free tier	1,000 calls/month	Limited credits	Limited credits
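
To make the table easier to apply to a real workload, here is a minimal sketch that turns the per-million-token prices above into a dollar figure. The prices are copied from the table; the function name and the 50M input / 10M output monthly workload are hypothetical and only there for illustration.

```python
# USD per 1M tokens as (input, output), copied from the comparison table.
PRICES = {
    "Command R": (0.15, 0.60),
    "Command R+": (2.50, 10.00),
    "GPT-4.1 Nano": (0.10, 0.40),
    "GPT-4.1": (2.00, 8.00),
    "Haiku 4.5": (1.00, 5.00),
    "Sonnet 4.6": (3.00, 15.00),
}


def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for the given token volumes."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical monthly workload: 50M input tokens and 10M output tokens.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

# Print the models from cheapest to most expensive for this workload.
for name in sorted(PRICES, key=lambda m: generation_cost(m, INPUT_TOKENS, OUTPUT_TOKENS)):
    print(f"{name:<13} ${generation_cost(name, INPUT_TOKENS, OUTPUT_TOKENS):,.2f}")
```

For this workload Command R+ comes to $225, between GPT-4.1 at $180 and Sonnet 4.6 at $300. Change the two token constants to match your own input/output mix, since output-heavy workloads shift the ranking toward models with cheaper output tokens.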

Hidden Costs & Gotchas

⚠ Rerank charges per search, not per document. One search with 100 documents costs the same as one search with 10 documents (if all docs are under 500 tokens). But documents over 500 tokens get split into chunks, and each chunk counts separately.
⚠ Command R+ output tokens cost 4x input tokens. Long generative responses get expensive. Use Command R ($0.15/$0.60) for tasks that don't need R+ quality.
⚠ The free trial's 1,000 calls/month is generous for prototyping but not for production. There's no discounted middle tier: you go straight from the free trial to pay-as-you-go production pricing.
⚠ Embed v3 at $0.10/1M tokens is cheap, but embedding large document collections adds up. A million 500-word documents is roughly 650M tokens, costing $65 to embed.
⚠ Cohere's pricing is competitive with OpenAI and Anthropic, but the model quality gap matters. Command R+ is strong at RAG but may lag behind Sonnet 4.6 or GPT-4.1 on general reasoning and coding tasks.
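
The embedding and Rerank gotchas above are easier to check with a few lines of arithmetic. This is a rough sketch, not Cohere's billing logic. The 1.3 tokens-per-word ratio is an assumption implied by the "500 words is roughly 650 tokens" figure, and treating each chunk as one document toward the 100-documents-per-search limit is one reading of "each chunk counts separately". All function and constant names are hypothetical.

```python
import math

EMBED_PRICE_PER_M = 0.10    # USD per 1M tokens, Embed v3 (from this page)
RERANK_PRICE_PER_K = 2.00   # USD per 1,000 searches, Rerank 3.5 (from this page)
TOKENS_PER_WORD = 1.3       # assumption: 500 words is roughly 650 tokens
CHUNK_TOKENS = 500          # documents longer than this are split into chunks
DOCS_PER_SEARCH = 100       # one search covers up to 100 documents


def embed_cost(num_docs: int, words_per_doc: int) -> float:
    """Estimated USD cost to embed a corpus once."""
    tokens = num_docs * words_per_doc * TOKENS_PER_WORD
    return tokens / 1_000_000 * EMBED_PRICE_PER_M


def rerank_searches(doc_token_counts: list[int]) -> int:
    """Estimated billable searches for one query.

    Assumption: every chunk counts as one document, and each group of
    up to 100 documents counts as one search.
    """
    chunks = sum(max(1, math.ceil(t / CHUNK_TOKENS)) for t in doc_token_counts)
    return math.ceil(chunks / DOCS_PER_SEARCH)


def rerank_cost(total_searches: int) -> float:
    """USD cost for a number of billable searches."""
    return total_searches / 1_000 * RERANK_PRICE_PER_K


print(f"Embedding 1M docs of 500 words: ${embed_cost(1_000_000, 500):.2f}")
print("Searches, 100 docs of 400 tokens:", rerank_searches([400] * 100))
print("Searches, 100 docs of 1,200 tokens:", rerank_searches([1200] * 100))
```

Under these assumptions, 100 short documents bill as one search, while 100 documents of 1,200 tokens each become 300 chunks and bill as three. Truncating or pre-chunking long documents before reranking keeps the per-query cost predictable.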

Which Plan Do You Need?

RAG pipeline builder

Embed v3 ($0.10/1M tokens) for embeddings + Rerank 3.5 ($2/1K searches) for result quality. This combo is Cohere's strongest use case and competitive with any alternative.

Text generation at scale

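Command R ($0.15/$0.60) for high-volume simple tasks. Command R+ ($2.50/$10) when you need stronger reasoning. Both are cheaper than the Anthropic equivalents (Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15) but slightly more expensive than the OpenAI ones (GPT-4.1 Nano at $0.10/$0.40, GPT-4.1 at $2/$8).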
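
If you split traffic between the two models, the blended bill depends mostly on how much volume reaches Command R+. The sketch below assumes the share is measured by token volume and that both models see the same input/output mix. The function names and the example workload are hypothetical.

```python
# USD per 1M tokens as (input, output), from this page.
COMMAND_R = (0.15, 0.60)
COMMAND_R_PLUS = (2.50, 10.00)


def _cost(prices: tuple[float, float], input_tokens: int, output_tokens: int) -> float:
    """List-price USD cost for one model."""
    return (input_tokens * prices[0] + output_tokens * prices[1]) / 1_000_000


def blended_cost(input_tokens: int, output_tokens: int, r_plus_share: float) -> float:
    """USD cost when r_plus_share of the token volume goes to Command R+
    and the remainder goes to Command R."""
    if not 0.0 <= r_plus_share <= 1.0:
        raise ValueError("r_plus_share must be between 0 and 1")
    cheap = _cost(COMMAND_R, input_tokens, output_tokens)
    strong = _cost(COMMAND_R_PLUS, input_tokens, output_tokens)
    return (1 - r_plus_share) * cheap + r_plus_share * strong


# Hypothetical workload: 100M input tokens and 20M output tokens per month.
for share in (0.0, 0.1, 0.5, 1.0):
    print(f"{share:>4.0%} to Command R+: ${blended_cost(100_000_000, 20_000_000, share):,.2f}")
```

On this workload, sending everything to Command R costs about $27 and everything to Command R+ about $450, so routing even 10% of the volume to R+ more than doubles the bill, to roughly $69.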

Enterprise with compliance needs

Cohere offers deployment on your own cloud (VPC) and on-premises options. Contact sales for enterprise pricing. This is a differentiator vs OpenAI and Anthropic, whose models are primarily available as hosted APIs.

The Bottom Line

Cohere's sweet spot is RAG pipelines. Embed v3 for creating embeddings plus Rerank 3.5 for improving search results is a strong combination at competitive prices. For text generation, Command R+ at $2.50/$10 is cheaper than Sonnet 4.6 ($3/$15) but slightly more expensive than GPT-4.1 ($2/$8), and model quality varies by task. The free trial with 1,000 calls/month is generous compared with the limited credits OpenAI and Anthropic offer.

Disclosure: Pricing information is sourced from official websites and may change. We update this page regularly but always verify current pricing on the vendor's site before purchasing.

Frequently Asked Questions

How much does Cohere cost?

It depends on the model. Command R+ (generation): $2.50/$10 per 1M tokens. Command R (cheaper generation): $0.15/$0.60 per 1M tokens. Embed v3: $0.10 per 1M tokens. Rerank 3.5: $2 per 1,000 searches. There's also a free trial tier with 1,000 calls/month.

Is Cohere cheaper than OpenAI?

Not on list price. For embeddings, Cohere Embed v3 at $0.10/1M tokens costs five times as much as OpenAI's text-embedding-3-small at $0.02/1M tokens, though Cohere's multilingual quality is generally higher. For generation, Command R ($0.15/$0.60) matches GPT-4o-mini ($0.15/$0.60) and costs slightly more than GPT-4.1 Nano ($0.10/$0.40). Command R+ ($2.50/$10) matches GPT-4o ($2.50/$10) and costs slightly more than GPT-4.1 ($2/$8).

What is Cohere Rerank?

Rerank is a model that takes a query and a list of documents and re-orders them by relevance. It dramatically improves RAG accuracy by filtering out irrelevant retrieved documents before they reach your LLM. At $2 per 1,000 searches, it's one of the cheapest ways to improve retrieval quality.

Should I use Cohere or OpenAI for embeddings?

Cohere Embed v3 generally produces higher quality embeddings for multilingual and retrieval use cases. OpenAI's text-embedding-3 is simpler to integrate if you're already using the OpenAI API. For English-only, the quality difference is small. For multilingual, Cohere wins.

Does Cohere have a free tier?

Cohere has a trial tier that's free with 1,000 API calls per month and 100 per minute. It's limited to non-production use. There's no permanent free tier for production applications. You move to pay-as-you-go pricing once you're past evaluation.
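
While prototyping on the trial tier, it can help to track both limits on the client side so that a test script stops before it exhausts the monthly allowance. The class below is a minimal sketch with hypothetical names. It only approximates the 100 calls/minute and 1,000 calls/month limits described above and does not reflect how Cohere measures them server-side.

```python
import time
from collections import deque


class TrialQuota:
    """Client-side guard for the trial limits: 100 calls/minute, 1,000 calls/month."""

    def __init__(self, per_minute=100, per_month=1000, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_month = per_month
        self.clock = clock          # injectable so the guard can be tested
        self.recent = deque()       # timestamps of calls in the last 60 seconds
        self.month_used = 0         # caller resets this when a new month starts

    def try_acquire(self) -> bool:
        """Record a call and return True if both limits allow it."""
        now = self.clock()
        # Drop timestamps that have left the sliding 60-second window.
        while self.recent and now - self.recent[0] >= 60:
            self.recent.popleft()
        if self.month_used >= self.per_month or len(self.recent) >= self.per_minute:
            return False
        self.recent.append(now)
        self.month_used += 1
        return True

    def reset_month(self) -> None:
        """Clear the monthly counter, for example on the first of the month."""
        self.month_used = 0


quota = TrialQuota()
if quota.try_acquire():
    print("OK to send a request")
else:
    print("Trial limit reached; wait or upgrade")
```

Call `try_acquire()` before each API request and skip or delay the request when it returns False. The monthly counter lives in memory here, so a longer-running project would need to persist it between runs.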