News

Gemini API April 2026 Update: Flex/Priority Tiers, Deep Research with MCP, embedding-2 GA, gemini-3.1-flash-tts-preview

April 2026 in the Gemini API: new Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), Deep Research updates with MCP server integration and File Search (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30).

By AI Coding Tools Directory2026-04-305 min read
Last reviewed: 2026-04-30
ACTD
AI Coding Tools Directory

Editorial Team

The AI Coding Tools Directory editorial team researches and reviews AI-powered development tools to help developers find the best solutions for their workflows.

April 2026 was a substantive month for the Gemini API, with five changes worth pulling forward: Flex and Priority inference tiers (Apr 1), gemini-3.1-flash-tts-preview (Apr 15), a major Deep Research agent update with MCP server integration (Apr 21), gemini-embedding-2 GA (Apr 22), and the gemini-robotics-er-1.5-preview shutdown (Apr 30) in favor of 1.6-preview. Three of these — tiers, Deep Research + MCP, and embedding-2 — directly touch developer-tooling workflows.

TL;DR

  • Apr 1Flex and Priority inference tiers introduced; pick per request to trade off cost vs latency.
  • Apr 15gemini-3.1-flash-tts-preview released: cost-efficient, expressive, steerable TTS.
  • Apr 21Deep Research agent gains collaborative planning, visualization, MCP server integration, and File Search; new model variants deep-research-preview-04-2026 (speed/streaming) and deep-research-max-preview-04-2026 (max research).
  • Apr 22gemini-embedding-2 reaches GA.
  • Apr 30gemini-robotics-er-1.5-preview shut down; migrate to gemini-robotics-er-1.6-preview.

Quick Answer

If you ship a Gemini-backed app, three of these are immediately actionable. (1) Flex / Priority lets you cost-tune per request without code surgery — useful if you have mixed batch and interactive traffic on the same key. (2) Deep Research + MCP integration means an agent can pull from your MCP-described tools natively inside the Gemini Deep Research workflow — a real lift for builders already on the MCP ecosystem. (3) gemini-embedding-2 GA means you can move embedding workloads off embedding-1 without preview-status caveats.

Flex and Priority Inference Tiers (Apr 1, 2026)

Google introduced Flex and Priority inference tiers across Gemini API on April 1. The model is the same; what changes is the serving SLA and the price point.

Tier When to use
Flex Background batch work, async pipelines, evals, cost-sensitive workloads where some queueing is acceptable.
Priority Interactive UX, real-time agents, latency-bound chat or coding sessions where p99 matters.
Standard Existing default behavior.

Per-tier pricing and exact latency targets are surfaced in the official changelog. The shape of this is similar to what AWS Bedrock has been doing with X-Amzn-Bedrock-Service-Tier — a sign the whole industry is converging on per-request tiering rather than only per-model price.

gemini-3.1-flash-tts-preview (Apr 15, 2026)

A new TTS preview rooted in the Gemini 3.1 Flash family. Google's framing: cost-efficient, expressive, and steerable. As a preview model:

  • Expect the API surface to evolve before GA.
  • Good fit for prototyping voice features, narrating agent output, or building accessibility flows on top of Gemini-driven coding tools.
  • Avoid for compliance-bound production until it ships GA — preview models can change input/output shape without notice.

Deep Research Agent: Collaborative Planning, Visualization, MCP, File Search (Apr 21, 2026)

The headline coding-agent change of the month. The Deep Research agent picked up four capabilities:

  1. Collaborative planning — multi-step plans that incorporate user feedback mid-flight rather than running to completion blind.
  2. Visualization support — research outputs include rendered visualizations, not just text.
  3. MCP server integration — Deep Research can call tools exposed via Model Context Protocol servers, the same standard surfaced in Claude Code and Cursor.
  4. File Search — first-party document grounding inside the agent loop.

Two new model variants ship alongside:

Cursor logo
CursorFreemium

The AI-native code editor with $1B+ ARR, 25+ models, and background agents on dedicated VMs

Claude Code logo
Claude CodeSubscription

Anthropic's terminal-based AI coding agent with Claude Opus 4.7, /ultrareview, Routines, /ultraplan, and 80.9% SWE-bench

  • deep-research-preview-04-2026 — optimized for speed and client-side streaming. Use for interactive UX where you want the research to feel live.
  • deep-research-max-preview-04-2026 — optimized for comprehensive automated research. Use when wall-clock time matters less than depth.

The MCP move is the structurally important one. Three of the major coding-agent stacks — Claude Code, Cursor, and now Gemini Deep Research — speak MCP, which means an MCP server you build (for your internal docs, your CI, your ticketing) becomes a portable plugin across vendors. We covered the MCP surface in detail in the complete guide to MCP servers.

gemini-embedding-2 Reaches GA (Apr 22, 2026)

gemini-embedding-2 graduated from preview to general availability on April 22, 2026. For developer workflows that depend on embeddings — RAG over codebases, semantic code search, doc-grounded chat — this lifts the production caveat that comes with preview models. If you've been holding embedding-1 traffic in production while testing -2 in staging, you can now flip the switch.

gemini-robotics-er-1.5-preview Shutdown (Apr 30, 2026)

Less relevant for AI coding tools, but worth noting for completeness: gemini-robotics-er-1.5-preview was retired on April 30, 2026. Migrate to gemini-robotics-er-1.6-preview, which is the supported successor.

How This Changes the Vendor Picture

Capability Gemini API (April 2026) OpenAI Codex / GPT-5.5 Anthropic / Claude Opus 4.7
Per-request inference tiers Flex / Priority (Apr 1) Service-tier headers (existing) Service-tier on Bedrock (ANTHROPIC_BEDROCK_SERVICE_TIER)
Native MCP integration in agent stack Deep Research (Apr 21) Codex CLI (existing) Claude Code (native)
GA-grade embeddings model gemini-embedding-2 (Apr 22) text-embedding-3-large n/a
Latest research/agent-research mode deep-research-preview-04-2026 and -max-preview-04-2026 Deep Research GPT-5.5 n/a

For frontier-coding model context, see our OpenAI Codex GPT-5.5 update and Claude Opus 4.7 release post.

OpenAI Codex logo
OpenAI CodexFreemium

Cloud coding agent with GPT-5.5 frontier model, 1M+ developers, Desktop App, in-app browser use, and parallel sandboxed environments

Sources


For broader landscape context see our gpt-5-1-codex-max-vs-gemini-3-pro">Opus 4.5 vs GPT-5.1 Codex Max vs Gemini 3 Pro comparison and the competitive landscape of AI software development.

Free Resource

2026 AI Coding Tools Comparison Chart

Side-by-side comparison of features, pricing, and capabilities for every major AI coding tool.

No spam, unsubscribe anytime.

Frequently Asked Questions

What are the new Flex and Priority inference tiers?
Flex and Priority are new inference tiers Google launched April 1, 2026 for the Gemini API. They expand deployment options along the cost-versus-latency axis — Flex for cheaper, more elastic workloads; Priority for guaranteed low-latency serving. Pick a tier per request to match your workload's tolerance.
What changed for the Deep Research agent in April 2026?
On April 21, 2026 Deep Research received collaborative planning, visualization support, MCP server integration, and File Search. Two new model versions launched alongside: deep-research-preview-04-2026 (optimized for speed and client-side streaming) and deep-research-max-preview-04-2026 (designed for comprehensive automated research).
Is gemini-embedding-2 generally available now?
Yes. On April 22, 2026, gemini-embedding-2 graduated from preview to general availability. You can use it for production workloads with the standard GA support and stability commitments.
What is gemini-3.1-flash-tts-preview?
Released April 15, 2026, gemini-3.1-flash-tts-preview is a cost-efficient, expressive, and steerable text-to-speech model. As a preview, expect API surface evolution before GA — fine for prototypes, treat carefully for production.
What happened to gemini-robotics-er-1.5-preview?
It was shut down on April 30, 2026. Migrate to gemini-robotics-er-1.6-preview, which is the supported successor.