Gemini API Free Tier (April 2026): Every Limit, Quota, and Gotcha
Google gives away more free LLM compute than any other major provider. No credit card. No expiration. Access to their best models. But the limits are specific and the differences between Google AI Studio and Vertex AI matter more than most guides explain. Here is everything you need to know about what the Gemini free tier includes, where the walls are, and when it makes sense to start paying.
Two Ways to Access Gemini for Free
Google offers Gemini through two platforms, each with different free tier mechanics. Understanding the distinction is important because they serve different use cases, and mixing them up leads to unexpected bills or unnecessary limitations.
Google AI Studio (aistudio.google.com)
This is the primary free tier path. You get an API key with zero configuration. No Google Cloud project. No billing account. It works immediately after signing in with a Google account.
AI Studio gives you access to the current Gemini models: 2.5 Flash, 2.5 Pro (experimental), 2.0 Flash, 1.5 Pro, and 1.5 Flash. Each model has different rate limits on the free tier, and we cover those in detail below. The key advantage here is simplicity. You get a key, you make API calls, and you pay nothing until you choose to upgrade.
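To make "you get a key, you make API calls" concrete, here is a minimal sketch using only the Python standard library. The endpoint path, the `x-goog-api-key` header, the request body shape, and the `gemini-2.0-flash` model ID reflect our understanding of the public REST API and should be treated as assumptions. Check them against the current API reference before relying on them.

```python
import json
import urllib.request

# Assumed base URL for the Gemini REST API; verify against the official docs.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(api_key, prompt, model="gemini-2.0-flash"):
    """Build (but do not send) a text-only generateContent request."""
    url = f"{BASE_URL}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key created in AI Studio
        },
        method="POST",
    )


def generate(api_key, prompt, model="gemini-2.0-flash"):
    """Send the request and return the parsed JSON response."""
    request = build_generate_request(api_key, prompt, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `generate(your_key, "Say hello")` should return a JSON object. In our experience the generated text sits under `candidates[0].content.parts[0].text`, but treat that path as something to confirm rather than a guarantee.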
The tradeoff: Google may use your free-tier inputs and outputs to improve their models. If you are building something with sensitive data, this matters.
Vertex AI ($300 Credit for New Accounts)
Vertex AI is Google Cloud's full ML platform. New GCP accounts get $300 in free credits valid for 90 days. That covers Gemini API calls, but also any other Google Cloud service you use during that window.
After the credits expire, you pay standard rates. There is no ongoing free tier for Vertex AI. The advantages over AI Studio are higher rate limits, no data-sharing clause, enterprise features (grounding, custom fine-tuning, data residency), and SLAs. For most developers starting out, AI Studio is the right choice. Vertex AI makes sense when you need production guarantees or are already in the Google Cloud ecosystem.
Free Tier Rate Limits by Model
These are the limits that actually matter day-to-day. Every model has three constraints: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD). You will hit the RPM limit long before TPM in most use cases.
| Model | RPM | TPM | RPD | Context |
|---|---|---|---|---|
| Gemini 2.5 Flash | 10 | 250,000 | 1,500 | 1M tokens |
| Gemini 2.5 Pro | 5 | 150,000 | 50 | 1M tokens |
| Gemini 2.0 Flash | 15 | 1,000,000 | 1,500 | 1M tokens |
| Gemini 1.5 Pro | 2 | 32,000 | 50 | 2M tokens |
| Gemini 1.5 Flash | 15 | 1,000,000 | 1,500 | 1M tokens |
| Gemma 2 (27B) | 15 | 1,000,000 | 1,500 | 8K tokens |
A few things stand out. Gemini 2.5 Pro is capped at 50 requests per day on the free tier. That is enough to test and prototype but not enough for any real workload. Gemini 2.0 Flash is the most generous: 15 RPM and 1,500 RPD means you could run a small production app entirely on the free tier if your traffic stays under about 1 request per minute on average.
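The arithmetic behind those two claims is easy to check. The sketch below copies the limits from the table and derives two numbers: the average request rate the daily cap can sustain, and how many tokens each request can carry before TPM becomes the binding limit instead of RPM. The dictionary keys are just labels for this example, not official model IDs.

```python
# Free-tier limits copied from the table above (keys are illustrative labels).
FREE_LIMITS = {
    "gemini-2.5-flash": {"rpm": 10, "tpm": 250_000, "rpd": 1_500},
    "gemini-2.5-pro": {"rpm": 5, "tpm": 150_000, "rpd": 50},
    "gemini-2.0-flash": {"rpm": 15, "tpm": 1_000_000, "rpd": 1_500},
    "gemini-1.5-pro": {"rpm": 2, "tpm": 32_000, "rpd": 50},
}

MINUTES_PER_DAY = 24 * 60


def sustained_rpm(model):
    """Average requests per minute that the daily cap allows over 24 hours."""
    return FREE_LIMITS[model]["rpd"] / MINUTES_PER_DAY


def tokens_per_request_at_full_rpm(model):
    """Tokens each request may use before TPM, not RPM, is the limit."""
    limits = FREE_LIMITS[model]
    return limits["tpm"] // limits["rpm"]


if __name__ == "__main__":
    for name in FREE_LIMITS:
        print(name, round(sustained_rpm(name), 2), tokens_per_request_at_full_rpm(name))
```

For Gemini 2.0 Flash this gives about 1.04 requests per minute around the clock, which is where the "about 1 request per minute" figure comes from. For Gemini 2.5 Flash, a request has to average more than 25,000 tokens before TPM matters, which is why RPM is usually the first wall you hit.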
What the Free Tier Includes Beyond Text
Gemini is multimodal, and the free tier includes all modalities. This is a significant differentiator from OpenAI and Anthropic, where vision capabilities are often limited or cost extra.
- Image understanding: Send images alongside text prompts. Useful for OCR, chart reading, visual Q&A, and image classification. No separate image API needed.
- Video understanding: Upload video clips up to 2 hours in length (with Gemini 1.5 Pro or newer). The model can answer questions about video content, extract information from frames, and summarize visual sequences.
- Audio processing: Send audio files for transcription, translation, and content understanding. Supports common formats including MP3, WAV, and FLAC.
- Document analysis: Upload PDFs directly. The model processes text and images within the document without needing a separate parsing step.
- Code execution: Gemini 2.0 Flash and 2.5 models can execute Python code in a sandboxed environment and return results. This runs inside Google's infrastructure at no additional cost on the free tier.
The rate limits above apply across all modalities. A multimodal request (text + image) counts as one request toward your RPM and RPD limits. Token counts for images and video are calculated based on resolution and duration, which can consume your TPM faster than text-only requests.
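A text-plus-image request is still a single request. As a rough sketch, the payload simply gains a second part alongside the text. The `inline_data`, `mime_type`, and `data` field names below follow the REST request format as we understand it and are assumptions to verify. `build_image_prompt` is a hypothetical helper name, not part of any SDK.

```python
import base64


def build_image_prompt(prompt, image_bytes, mime_type="image/png"):
    """Build a generateContent body with one text part and one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    # Assumed field names for inline media; check the API reference.
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }
```

Inlining like this suits small images. Long video or audio is typically uploaded separately and referenced from the request rather than base64-encoded into it.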
Gemini App vs. Gemini API: Different Products
This is where people get confused. The Gemini app (gemini.google.com) and the Gemini API are separate products with separate free tiers.
Gemini app (free): Unlimited text conversations with Gemini. Includes image generation (via Imagen 3). No API access. Think of this as Google's equivalent to free ChatGPT. It uses Gemini 2.0 Flash by default.
Gemini Advanced ($19.99/month via Google One AI Premium): Access to Gemini 2.5 Pro in the app. 2TB Google storage. Gemini in Gmail, Docs, Sheets, and other Workspace apps. NotebookLM Plus. Still no API access included.
Gemini API (Google AI Studio free tier): What this guide covers. Programmatic access to all Gemini models. Completely separate from the Gemini app subscription. You can use both the free app and the free API simultaneously.
A common mistake: people pay for Gemini Advanced thinking it gives them API access. It does not. The API free tier through Google AI Studio is separate and does not require any subscription.
When the Free Tier Breaks Down
The 10 RPM limit on Gemini 2.5 Flash means you can process at most 600 requests per hour. For a chatbot serving a handful of users, that is fine. For batch processing (analyzing a thousand documents, for example), you will need the paid tier or creative rate-limit management. The daily cap is the tighter constraint: at a sustained 10 RPM, the 1,500 requests per day are gone in 150 minutes, and even moderate steady traffic can exhaust them well before the day ends. The inflection point for most developers: once you are building something other people use, you will need to upgrade.
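A quick way to see whether a batch job fits is to compute both constraints. The helper names below are hypothetical, and the sketch assumes one request per document and no retries.

```python
import math


def minutes_to_send(num_requests, rpm):
    """Minimum sending time in minutes when running at the full RPM limit."""
    return num_requests / rpm


def days_needed(num_requests, rpd):
    """Calendar days required once the requests-per-day cap is applied."""
    return math.ceil(num_requests / rpd)


if __name__ == "__main__":
    # Gemini 2.5 Flash free tier: 10 RPM, 1,500 RPD.
    for docs in (1_000, 5_000):
        print(docs, minutes_to_send(docs, 10), days_needed(docs, 1_500))
```

With the 2.5 Flash limits, 1,000 single-request documents take at least 100 minutes of paced sending and fit within one day's quota, while 5,000 documents spread over four days. Anything that needs several requests per document multiplies those numbers accordingly.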
Free Tier vs. Pay-as-You-Go Pricing
When you do outgrow the free tier, here is what you will pay. These rates apply through Google AI Studio with a billing account enabled.
| Model | Input / 1M Tokens | Output / 1M Tokens | Paid RPM |
|---|---|---|---|
| Gemini 2.5 Flash | $0.15 | $0.60 | 2,000 |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1,000 |
| Gemini 2.0 Flash | $0.075 | $0.30 | 2,000 |
| Gemini 1.5 Pro | $1.25 | $5.00 | 1,000 |
Gemini 2.0 Flash at $0.075 per million input tokens is the cheapest major LLM API on the market. For comparison, GPT-4o mini costs $0.15/$0.60 (twice the input price) and Claude Haiku 4.5 costs $1/$5 (over 13x the input price). If cost is your primary concern and you need a capable model, Gemini 2.0 Flash is difficult to beat even after you leave the free tier.
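To turn the per-million-token rates into a monthly estimate, multiply them by your expected token volume. This sketch uses the prices from the table above. The dictionary keys are illustrative labels, and the token volumes in the example are made up.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    "gemini-2.5-flash": (0.15, 0.60),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.0-flash": (0.075, 0.30),
    "gemini-1.5-pro": (1.25, 5.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated pay-as-you-go cost in USD for the given token volumes."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical month: 10M input tokens and 2M output tokens.
    for name in PRICES:
        print(name, round(estimate_cost(name, 10_000_000, 2_000_000), 2))
```

For that hypothetical month, Gemini 2.0 Flash comes to about $1.35, while the same volume on Gemini 2.5 Pro comes to about $32.50. Output tokens dominate the bill on the Pro models.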
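The jump from free to paid also raises your RPM ceiling by roughly 130x to 500x depending on the model (for example, from 15 to 2,000 RPM on Gemini 2.0 Flash, or from 2 to 1,000 RPM on Gemini 1.5 Pro). That alone justifies upgrading for any production workload.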
Practical Tips for Maximizing the Free Tier
If you want to stay on the free tier as long as possible, here are strategies that work.
- Use Gemini 2.0 Flash for most tasks. It has the highest rate limits (15 RPM, 1M TPM) and is fast. Reserve 2.5 Flash or 2.5 Pro for tasks that genuinely need better reasoning.
- Cache results aggressively. If your app asks the same types of questions repeatedly, cache the API responses locally. A simple key-value cache (Redis, SQLite, even a dictionary) can eliminate 30-60% of redundant calls.
- Batch your requests within rate windows. Instead of making calls as they come in, queue requests and send them in controlled bursts that stay under 15 RPM.
- Use the Gemini app for manual testing. The web app has no rate limits for conversational use. Use it for prompt development and iteration, then switch to the API for production calls.
- Implement exponential backoff. When you hit rate limits, the API returns 429 errors. A simple retry with exponential backoff (1s, 2s, 4s, 8s) handles transient limit hits gracefully without losing requests.
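Three of the tips above (caching, pacing requests, and exponential backoff) combine naturally into one small wrapper. This is a sketch under stated assumptions, not a production client: `RateLimitError` is a hypothetical stand-in for however your HTTP layer surfaces a 429, `send` is whatever function actually calls the API, and the cache matches on the exact prompt string only.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


class Pacer:
    """Spaces calls at least 60/rpm seconds apart to stay under the RPM limit."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no call has been made yet

    def wait(self):
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def call_with_backoff(send, prompt, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Retry on RateLimitError, waiting 1s, 2s, 4s, 8s with the defaults."""
    for attempt in range(max_retries + 1):
        try:
            return send(prompt)
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)


class FrugalClient:
    """Cache first, then pace, then call with retries."""

    def __init__(self, send, rpm=15, clock=time.monotonic, sleep=time.sleep):
        self.send = send
        self.sleep = sleep
        self.pacer = Pacer(rpm, clock, sleep)
        self.cache = {}  # swap for Redis or SQLite if it must survive restarts

    def ask(self, prompt):
        if prompt in self.cache:
            return self.cache[prompt]  # no API call, no quota used
        self.pacer.wait()
        answer = call_with_backoff(self.send, prompt, sleep=self.sleep)
        self.cache[prompt] = answer
        return answer
```

The clock and sleep functions are injectable so the pacing logic can be tested without real waiting. In real use, adding a little random jitter to the backoff delays is a common refinement when many clients retry at once.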
How Gemini Free Tier Compares to Other Providers
Here is an honest comparison of what you get for free across the major AI API providers as of April 2026.
| Provider | Free Access | Expiration | Best Free Model |
|---|---|---|---|
| Google (Gemini) | Ongoing free tier, no card needed | Never | Gemini 2.0 Flash |
| OpenAI | $5-$18 credit on signup | 3 months | GPT-4o mini |
| Anthropic | $5 console credit | Limited | Claude Haiku 4.5 |
| Mistral | Free tier, API key only | Never | Mistral Small |
| Groq | Free tier with rate limits | Never | Llama 3.3 70B |
| Cohere | Trial key, 1,000 calls/month | Never | Command R+ |
Google's position is clear: they want developers building on Gemini, and they are willing to subsidize the onboarding. The combination of no expiration, no credit card, multimodal support, and access to competitive models makes the Gemini free tier the strongest starting point for developers in 2026.
For a detailed breakdown of all providers, see our complete AI API free tier comparison.
Common Mistakes to Avoid
After tracking developer experience with Gemini's free tier, these are the mistakes we see most often.
- Confusing AI Studio with Vertex AI. They have different URLs, different authentication, different rate limits, and different billing. Pick one and stick with it.
- Not handling 429 errors. The free tier will rate-limit you. If your code has no retry logic, rate-limited requests fail and are lost during traffic spikes.
- Sending unnecessarily large prompts. Long system prompts eat into your TPM budget. Keep prompts concise. Use few-shot examples only when they measurably improve output quality.
- Ignoring the data-sharing clause. If you send customer data, PII, or proprietary information through the free tier, Google can use it for training. For production apps with real user data, upgrade to paid or use Vertex AI.
- Using 2.5 Pro when 2.0 Flash would work. The Pro model has a 50 RPD limit on free tier. Most tasks that developers throw at Pro can be handled by Flash with a better prompt. Test with Flash first.
Frequently Asked Questions
Is the Gemini API free to use in 2026?
Yes. Google AI Studio offers a free tier for Gemini models with no credit card required. You get access to Gemini 2.5 Flash, 2.0 Flash, and 1.5 Pro with rate limits of 2 to 15 requests per minute depending on the model. The free tier is generous enough for prototyping, personal projects, and low-volume production apps.
What are the rate limits on Gemini's free tier?
Gemini 2.5 Flash gets 10 RPM and 250,000 tokens per minute. Gemini 2.0 Flash gets 15 RPM and 1,000,000 tokens per minute. Gemini 2.5 Pro gets only 5 RPM and 50 requests per day. Daily request caps of 1,500 RPD apply for Flash models.
What is the difference between Google AI Studio and Vertex AI?
Google AI Studio is the free, developer-friendly way to access Gemini with an API key. No GCP project needed. Vertex AI is Google Cloud's enterprise ML platform, which requires a GCP project and billing account. Vertex AI offers $300 in free credits for new accounts, higher rate limits, SLAs, and features like grounding with Google Search. For individual developers, AI Studio's free tier is the better starting point.
Does the Gemini free tier include image and video understanding?
Yes. All Gemini models on the free tier support multimodal inputs including images, audio, video, and documents. Multimodal requests count toward the same rate limits as text requests. Token counts for images and video are calculated based on resolution and duration.
When should I upgrade from Gemini's free tier?
Upgrade when you consistently hit rate limits (more than 10-15 RPM), need guaranteed uptime for production apps, want higher throughput for batch processing, or need enterprise features like data residency. Pay-as-you-go starts at $0.075 per 1M input tokens for Gemini 2.0 Flash.
How does Gemini's free tier compare to OpenAI and Anthropic?
Google is the most generous of the three by far. OpenAI gives new accounts $5-$18 in credits that expire after 3 months. Anthropic offers $5 in console credits with no ongoing free tier. Google AI Studio has no expiration and no credit card requirement, making it the only one of the three with a true indefinite free API tier.
Can I use the Gemini free tier for commercial applications?
Yes, with a caveat. Google AI Studio's free tier allows commercial use, but Google may use free-tier inputs and outputs to improve its models. If data privacy matters for your application, upgrade to paid or use Vertex AI, which does not use your data for model training.