Gemini API April 2026 Update: Flex/Priority Tiers, Deep Research with MCP, embedding-2 GA, gemini-3.1-flash-tts-preview
April 2026 in the Gemini API: new Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), Deep Research updates with MCP server integration and File Search (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30).
Editorial Team
The AI Coding Tools Directory editorial team researches and reviews AI-powered development tools to help developers find the best solutions for their workflows.
April 2026 was a substantive month for the Gemini API, with five changes worth pulling forward: Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), a major Deep Research agent update with MCP server integration (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30) in favor of 1.6-preview. Three of these — tiers, Deep Research + MCP, and embedding-2 — directly touch developer-tooling workflows.
TL;DR
- Apr 1 — Flex and Priority inference tiers introduced; pick per request to trade off cost vs latency.
- Apr 15 — `gemini-3.1-flash-tts-preview` released: cost-efficient, expressive, steerable TTS.
- Apr 21 — Deep Research agent gains collaborative planning, visualization, MCP server integration, and File Search; new model variants `deep-research-preview-04-2026` (speed/streaming) and `deep-research-max-preview-04-2026` (max research).
- Apr 22 — `gemini-embedding-2` reaches GA.
- Apr 30 — `gemini-robotics-er-1.5-preview` shut down; migrate to `gemini-robotics-er-1.6-preview`.
Quick Answer
If you ship a Gemini-backed app, three of these are immediately actionable. (1) Flex / Priority lets you cost-tune per request without code surgery — useful if you have mixed batch and interactive traffic on the same key. (2) Deep Research + MCP integration means an agent can pull from your MCP-described tools natively inside the Gemini Deep Research workflow — a real lift for builders already on the MCP ecosystem. (3) gemini-embedding-2 GA means you can move embedding workloads off embedding-1 without preview-status caveats.
Flex and Priority Inference Tiers (Apr 1, 2026)
Google introduced Flex and Priority inference tiers across the Gemini API on April 1. The model is the same; what changes is the serving SLA and the price point.
| Tier | When to use |
|---|---|
| Flex | Background batch work, async pipelines, evals, cost-sensitive workloads where some queueing is acceptable. |
| Priority | Interactive UX, real-time agents, latency-bound chat or coding sessions where p99 matters. |
| Standard | Existing default behavior. |
Per-tier pricing and exact latency targets are surfaced in the official changelog. The shape of this is similar to what AWS Bedrock has been doing with X-Amzn-Bedrock-Service-Tier — a sign the whole industry is converging on per-request tiering rather than only per-model price.
gemini-3.1-flash-tts-preview (Apr 15, 2026)
A new TTS preview rooted in the Gemini 3.1 Flash family. Google's framing: cost-efficient, expressive, and steerable. As a preview model:
- Expect the API surface to evolve before GA.
- Good fit for prototyping voice features, narrating agent output, or building accessibility flows on top of Gemini-driven coding tools.
- Avoid for compliance-bound production until it ships GA — preview models can change input/output shape without notice.
Deep Research Agent: Collaborative Planning, Visualization, MCP, File Search (Apr 21, 2026)
The headline coding-agent change of the month. The Deep Research agent picked up four capabilities:
- Collaborative planning — multi-step plans that incorporate user feedback mid-flight rather than running to completion blind.
- Visualization support — research outputs include rendered visualizations, not just text.
- MCP server integration — Deep Research can call tools exposed via Model Context Protocol servers, the same standard surfaced in Claude Code and Cursor.
- File Search — first-party document grounding inside the agent loop.
Two new model variants ship alongside:
- `deep-research-preview-04-2026` — optimized for speed and client-side streaming. Use for interactive UX where you want the research to feel live.
- `deep-research-max-preview-04-2026` — optimized for comprehensive automated research. Use when wall-clock time matters less than depth.
The MCP move is the structurally important one. Three of the major coding-agent stacks — Claude Code, Cursor, and now Gemini Deep Research — speak MCP, which means an MCP server you build (for your internal docs, your CI, your ticketing) becomes a portable plugin across vendors. We covered the MCP surface in detail in the complete guide to MCP servers.
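The portability claim rests on MCP being a vendor-neutral JSON-RPC 2.0 protocol: a server exposes tools via `tools/list` and executes them via `tools/call`, and any MCP-speaking agent can consume them. The sketch below shows that tool-facing surface as a bare dispatcher, without the official SDK or a transport; `search_docs` is a placeholder name for whatever internal docs/CI/ticketing integration you would actually expose, and a real server would also handle `initialize` and speak over stdio or HTTP.

```python
# Minimal sketch of the MCP tool surface: a JSON-RPC 2.0 dispatcher that
# answers tools/list and tools/call. "search_docs" is a hypothetical tool;
# a real server would also implement initialize and a transport layer.

TOOLS = {
    "search_docs": {
        "description": "Search internal documentation",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def handle(msg: dict) -> dict:
    """Dispatch one JSON-RPC request and return the response object."""
    if msg["method"] == "tools/list":
        result = {"tools": [{"name": n, **meta} for n, meta in TOOLS.items()]}
    elif msg["method"] == "tools/call":
        args = msg["params"]["arguments"]
        # Stubbed tool body; a real server would query your doc index here.
        result = {"content": [{"type": "text",
                               "text": f"results for {args['query']!r}"}]}
    else:
        return {"jsonrpc": "2.0", "id": msg["id"],
                "error": {"code": -32601, "message": "method not found"}}
    return {"jsonrpc": "2.0", "id": msg["id"], "result": result}

resp = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
print(resp["result"]["tools"][0]["name"])  # search_docs
```

Because the wire format is the same everywhere, this one server definition is what Claude Code, Cursor, and Deep Research would each see when they enumerate your tools.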
gemini-embedding-2 Reaches GA (Apr 22, 2026)
gemini-embedding-2 graduated from preview to general availability on April 22, 2026. For developer workflows that depend on embeddings — RAG over codebases, semantic code search, doc-grounded chat — this lifts the production caveat that comes with preview models. If you've been holding embedding-1 traffic in production while testing -2 in staging, you can now promote it. One migration caveat: vectors from different embedding models occupy different spaces and are not comparable, so re-embed your stored corpus with -2 rather than mixing model generations in one index.
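It helps to see where a model swap actually lands in a retrieval path. In the sketch below the vectors are toy placeholders standing in for gemini-embedding-2 output (the embed call itself is omitted), and ranking is plain cosine similarity; swapping embedding models changes only the vectors, not this code.

```python
import math

# Sketch: a retrieval path over precomputed embeddings. The 3-dim vectors
# are toy stand-ins for real gemini-embedding-2 output; only the vectors
# change when you migrate models, not the ranking logic.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

corpus = {
    "auth.md":  [0.9, 0.1, 0.0],   # toy doc vectors
    "build.md": [0.1, 0.8, 0.3],
}
query = [0.85, 0.2, 0.05]          # toy query vector

best = max(corpus, key=lambda doc: cosine(query, corpus[doc]))
print(best)  # auth.md
```

The key operational detail is that `corpus` must be embedded by the same model as `query` — which is exactly why a GA migration means a one-time re-embed of the index.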
gemini-robotics-er-1.5-preview Shutdown (Apr 30, 2026)
Less relevant for AI coding tools, but worth noting for completeness: gemini-robotics-er-1.5-preview was retired on April 30, 2026. Migrate to gemini-robotics-er-1.6-preview, which is the supported successor.
How This Changes the Vendor Picture
| Capability | Gemini API (April 2026) | OpenAI Codex / GPT-5.5 | Anthropic / Claude Opus 4.7 |
|---|---|---|---|
| Per-request inference tiers | Flex / Priority (Apr 1) | Service-tier headers (existing) | Service-tier on Bedrock (ANTHROPIC_BEDROCK_SERVICE_TIER) |
| Native MCP integration in agent stack | Deep Research (Apr 21) | Codex CLI (existing) | Claude Code (native) |
| GA-grade embeddings model | gemini-embedding-2 (Apr 22) | text-embedding-3-large | n/a |
| Latest research/agent-research mode | deep-research-preview-04-2026 and -max-preview-04-2026 | Deep Research GPT-5.5 | n/a |
For frontier-coding model context, see our OpenAI Codex GPT-5.5 update and Claude Opus 4.7 release post.
Sources
- Google AI for Developers — Gemini API changelog: ai.google.dev/gemini-api/docs/changelog
- Gemini 3.1 Flash coding context: /blog/gemini-3-flash-coding-guide
- MCP background: /blog/complete-guide-mcp-servers
For broader landscape context, see our Opus 4.5 vs GPT-5.1 Codex Max vs Gemini 3 Pro comparison and the competitive landscape of AI software development.