Gemini API April 2026 Update: Flex/Priority Tiers, Deep Research with MCP, embedding-2 GA, gemini-3.1-flash-tts-preview
April 2026 in the Gemini API: new Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), Deep Research updates with MCP server integration and File Search (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30).
Editorial Team
The AI Coding Tools Directory editorial team researches and reviews AI-powered development tools to help developers find the best solutions for their workflows.
April 2026 was a substantive month for the Gemini API, with five changes worth pulling forward: Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), a major Deep Research agent update with MCP server integration (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30) in favor of 1.6-preview. Three of these — tiers, Deep Research + MCP, and embedding-2 — directly touch developer-tooling workflows.
TL;DR
- Apr 1 — Flex and Priority inference tiers introduced; pick per request to trade off cost vs latency.
- Apr 15 — `gemini-3.1-flash-tts-preview` released: cost-efficient, expressive, steerable TTS.
- Apr 21 — Deep Research agent gains collaborative planning, visualization, MCP server integration, and File Search; new model variants `deep-research-preview-04-2026` (speed/streaming) and `deep-research-max-preview-04-2026` (max research).
- Apr 22 — `gemini-embedding-2` reaches GA.
- Apr 30 — `gemini-robotics-er-1.5-preview` shut down; migrate to `gemini-robotics-er-1.6-preview`.
Quick Answer
If you ship a Gemini-backed app, three of these are immediately actionable. (1) Flex / Priority lets you cost-tune per request without code surgery — useful if you have mixed batch and interactive traffic on the same key. (2) Deep Research + MCP integration means an agent can pull from your MCP-described tools natively inside the Gemini Deep Research workflow — a real lift for builders already on the MCP ecosystem. (3) gemini-embedding-2 GA means you can move embedding workloads off embedding-1 without preview-status caveats.
Flex and Priority Inference Tiers (Apr 1, 2026)
Google introduced Flex and Priority inference tiers across the Gemini API on April 1. The model is the same; what changes is the serving SLA and the price point.
| Tier | When to use |
|---|---|
| Flex | Background batch work, async pipelines, evals, cost-sensitive workloads where some queueing is acceptable. |
| Priority | Interactive UX, real-time agents, latency-bound chat or coding sessions where p99 matters. |
| Standard | Existing default behavior. |
Per-tier pricing and exact latency targets are surfaced in the official changelog. The shape of this is similar to what AWS Bedrock has been doing with X-Amzn-Bedrock-Service-Tier — a sign the whole industry is converging on per-request tiering rather than only per-model price.
gemini-3.1-flash-tts-preview (Apr 15, 2026)
A new TTS preview rooted in the Gemini 3.1 Flash family. Google's framing: cost-efficient, expressive, and steerable. As a preview model:
- Expect the API surface to evolve before GA.
- Good fit for prototyping voice features, narrating agent output, or building accessibility flows on top of Gemini-driven coding tools.
- Avoid for compliance-bound production until it ships GA — preview models can change input/output shape without notice.
Deep Research Agent: Collaborative Planning, Visualization, MCP, File Search (Apr 21, 2026)
The headline coding-agent change of the month. The Deep Research agent picked up four capabilities:
- Collaborative planning — multi-step plans that incorporate user feedback mid-flight rather than running to completion blind.
- Visualization support — research outputs include rendered visualizations, not just text.
- MCP server integration — Deep Research can call tools exposed via Model Context Protocol servers, the same standard surfaced in Claude Code and Cursor.
- File Search — first-party document grounding inside the agent loop.
Two new model variants ship alongside:
- `deep-research-preview-04-2026` — optimized for speed and client-side streaming. Use for interactive UX where you want the research to feel live.
- `deep-research-max-preview-04-2026` — optimized for comprehensive automated research. Use when wall-clock time matters less than depth.
The MCP move is the structurally important one. Three of the major coding-agent stacks — Claude Code, Cursor, and now Gemini Deep Research — speak MCP, which means an MCP server you build (for your internal docs, your CI, your ticketing) becomes a portable plugin across vendors. We covered the MCP surface in detail in the complete guide to MCP servers.
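The portability claim rests on MCP being a vendor-neutral JSON-RPC 2.0 protocol: a server exposes tools via `tools/list` and executes them via `tools/call`, and any MCP-speaking agent can consume them. The sketch below shows that tool-facing surface as a bare dispatcher, without the official SDK or a transport; `search_docs` is a placeholder name for whatever internal docs/CI/ticketing integration you would actually expose, and a real server would also handle `initialize` and speak over stdio or HTTP.

```python
# Minimal sketch of the MCP tool surface: a JSON-RPC 2.0 dispatcher that
# answers tools/list and tools/call. "search_docs" is a hypothetical tool;
# a real server would also implement initialize and a transport layer.

TOOLS = {
    "search_docs": {
        "description": "Search internal documentation",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def handle(msg: dict) -> dict:
    """Dispatch one JSON-RPC request and return the response object."""
    if msg["method"] == "tools/list":
        result = {"tools": [{"name": n, **meta} for n, meta in TOOLS.items()]}
    elif msg["method"] == "tools/call":
        args = msg["params"]["arguments"]
        # Stubbed tool body; a real server would query your doc index here.
        result = {"content": [{"type": "text",
                               "text": f"results for {args['query']!r}"}]}
    else:
        return {"jsonrpc": "2.0", "id": msg["id"],
                "error": {"code": -32601, "message": "method not found"}}
    return {"jsonrpc": "2.0", "id": msg["id"], "result": result}

resp = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
print(resp["result"]["tools"][0]["name"])  # search_docs
```

Because the wire format is the same everywhere, this one server definition is what Claude Code, Cursor, and Deep Research would each see when they enumerate your tools.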
gemini-embedding-2 Reaches GA (Apr 22, 2026)
gemini-embedding-2 graduated from preview to general availability on April 22, 2026. For developer workflows that depend on embeddings — RAG over codebases, semantic code search, doc-grounded chat — this lifts the production caveat that comes with preview models. If you've been holding embedding-1 traffic in production while testing -2 in staging, you can now promote it. One migration caveat: vectors from different embedding models occupy different spaces and are not comparable, so re-embed your stored corpus with -2 rather than mixing model generations in one index.
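It helps to see where a model swap actually lands in a retrieval path. In the sketch below the vectors are toy placeholders standing in for gemini-embedding-2 output (the embed call itself is omitted), and ranking is plain cosine similarity; swapping embedding models changes only the vectors, not this code.

```python
import math

# Sketch: a retrieval path over precomputed embeddings. The 3-dim vectors
# are toy stand-ins for real gemini-embedding-2 output; only the vectors
# change when you migrate models, not the ranking logic.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

corpus = {
    "auth.md":  [0.9, 0.1, 0.0],   # toy doc vectors
    "build.md": [0.1, 0.8, 0.3],
}
query = [0.85, 0.2, 0.05]          # toy query vector

best = max(corpus, key=lambda doc: cosine(query, corpus[doc]))
print(best)  # auth.md
```

The key operational detail is that `corpus` must be embedded by the same model as `query` — which is exactly why a GA migration means a one-time re-embed of the index.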
gemini-robotics-er-1.5-preview Shutdown (Apr 30, 2026)
Less relevant for AI coding tools, but worth noting for completeness: gemini-robotics-er-1.5-preview was retired on April 30, 2026. Migrate to gemini-robotics-er-1.6-preview, which is the supported successor.
How This Changes the Vendor Picture
| Capability | Gemini API (April 2026) | OpenAI Codex / GPT-5.5 | Anthropic / Claude Opus 4.7 |
|---|---|---|---|
| Per-request inference tiers | Flex / Priority (Apr 1) | Service-tier headers (existing) | Service-tier on Bedrock (ANTHROPIC_BEDROCK_SERVICE_TIER) |
| Native MCP integration in agent stack | Deep Research (Apr 21) | Codex CLI (existing) | Claude Code (native) |
| GA-grade embeddings model | gemini-embedding-2 (Apr 22) | text-embedding-3-large | n/a |
| Latest research/agent-research mode | deep-research-preview-04-2026 and -max-preview-04-2026 | Deep Research GPT-5.5 | n/a |
For frontier-coding model context, see our OpenAI Codex GPT-5.5 update and Claude Opus 4.7 release post.
Sources
- Google AI for Developers — Gemini API changelog: ai.google.dev/gemini-api/docs/changelog
- Gemini 3.1 Flash coding context: /blog/gemini-3-flash-coding-guide
- MCP background: /blog/complete-guide-mcp-servers
For broader landscape context, see our Opus 4.5 vs GPT-5.1 Codex Max vs Gemini 3 Pro comparison and the competitive landscape of AI software development.