Intermediate · 4-8 hours

Mastering OpenAI Codex CLI — Skills, MCPs & Workflows

Master OpenAI Codex CLI — agents.md skills, MCP integrations, and advanced workflows.

Last reviewed Feb 27, 2026

Welcome to the OpenAI Codex CLI cookbook — a complete, practical guide to getting the most out of OpenAI's terminal-based agentic coding assistant. Whether you're generating features, debugging complex bugs, or automating entire development workflows, this cookbook covers the skills, configurations, MCP integrations, and battle-tested patterns used by top developers.

Codex CLI is OpenAI's open-source agentic coding assistant — a Rust-powered terminal tool that connects frontier reasoning models (GPT-5.3-Codex, o3, o4-mini) directly to your codebase, file system, and terminal. Launched April 2025, it now powers everything from solo debugging sessions to multi-agent enterprise workflows.


What is Codex CLI

Codex CLI is an open-source, terminal-based AI coding agent from OpenAI. Unlike cloud-based chat interfaces, it runs locally and operates directly on your files and terminal. It uses an agent loop: the model reasons about your request, executes shell commands and file patches, reads the results, and iterates until the task is complete.

Key Facts

  • Package: @openai/codex (npm) or brew install codex
  • GitHub: github.com/openai/codex (Apache 2.0)
  • Written in Rust (rewritten from Node.js mid-2025)
  • Default model: GPT-5.3-Codex (272K input / 128K output tokens)
  • Authentication: ChatGPT plan sign-in (OAuth) via codex login, or an API key

Installation

# npm (global)
npm install -g @openai/codex

# Homebrew (macOS)
brew install codex

# First run
codex

# Or with API key
export OPENAI_API_KEY="sk-..."
codex
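
Before scripting around the CLI, it can help to confirm that the binary and credentials are in place. The function below is a minimal sketch, not part of Codex itself: codex_preflight is a hypothetical helper name, and it relies only on what this section states (the codex executable and the optional OPENAI_API_KEY variable).

```shell
# Hypothetical helper: check that codex is installed before scripting around it.
codex_preflight() {
  if ! command -v codex >/dev/null 2>&1; then
    echo "codex not found on PATH; try: npm install -g @openai/codex" >&2
    return 1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    # Not an error: an API key is only one of the sign-in options.
    echo "note: OPENAI_API_KEY is not set" >&2
  fi
  return 0
}
```

Usage: codex_preflight && codex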

The Codex Ecosystem (2026)

Surface | Description
Codex CLI | Terminal-based agent, local execution
Codex Web | Cloud-based async agent in ChatGPT sidebar
Codex Desktop App | macOS native app with multi-agent worktrees
Codex IDE Extension | VS Code, Cursor, Windsurf extensions
Codex as MCP Server | Expose Codex as a tool for other agents

Core Skills and Capabilities

Code Generation

Write entire functions, modules, or applications from natural language. Codex adapts to your existing project structure and conventions.

codex "create a REST API endpoint for user authentication using Express"
codex -i design.png "implement this UI design as React components"

Codebase Exploration

Navigate and understand large or legacy codebases using semantic search.

codex "explain how authentication works in this codebase"
codex "find where the auth token is set in this project"

Refactoring

Rename files, update imports, and convert patterns across multiple files atomically.

codex --full-auto "convert all .js files in src/ to TypeScript"
codex --full-auto "rename all instances of UserModel to User across the codebase"

Test Writing

Generate unit tests, integration tests, and edge case coverage. Run test suites iteratively and fix failures in a loop.

codex "write comprehensive pytest tests for the payment module including edge cases"
codex --full-auto "run all tests and fix any failing ones"

Debugging

Read error logs and stack traces, trace failures across modules, apply targeted fixes and validate them.

codex "fix the TypeError in src/utils.js line 42"
codex "debug this test failure for the calculateTotal function"

Multi-File Editing

Codex uses apply_patch to modify multiple files atomically, keeping changes consistent across related files.

Code Review

Analyze code for bugs, logic errors, and security issues. Run headless in CI/CD.

codex exec --sandbox read-only "review the changes in src/ for security issues and code quality"

Documentation Generation

Generate docstrings, README updates, changelogs, and API docs from code.

Long-Running Sessions

Sustain tasks for up to 7 hours continuously — useful for large refactors and full feature implementations.


AGENTS.md — Project Instructions

AGENTS.md is the primary mechanism for giving Codex consistent, project-specific instructions. It's read before every task and acts like a README written for AI agents.

Discovery order (highest to lowest priority):

  1. ~/.codex/AGENTS.override.md — Global temporary override
  2. ~/.codex/AGENTS.md — Global persistent defaults
  3. <git-root>/AGENTS.md → subdirectory files (closest to cwd wins)

Creating an AGENTS.md:

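# Option 1: Let Codex generate it (/init is typed inside an interactive session)
codex
/init

# Option 2: Write manually (global defaults live in ~/.codex/AGENTS.md)
mkdir -p ~/.codex
"${EDITOR:-vi}" ~/.codex/AGENTS.md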
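
To see which instruction files a session is likely to pick up, the discovery order listed above can be approximated in a few lines of shell. This is a sketch of the order as this section describes it, not Codex's actual loader. The name list_agents_files is hypothetical, and treating a .git entry as the repository root is an assumption.

```shell
# Hypothetical helper: print AGENTS.md candidates in the order listed above.
# Assumption: the repository root is the nearest ancestor containing .git.
list_agents_files() {
  local codex_home="${CODEX_HOME:-$HOME/.codex}"
  local dir f
  # Global files first: temporary override, then persistent defaults.
  for f in "$codex_home/AGENTS.override.md" "$codex_home/AGENTS.md"; do
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  # Project files next, starting closest to the current directory.
  dir="$(pwd -P)"
  while :; do
    [ -f "$dir/AGENTS.md" ] && printf '%s\n' "$dir/AGENTS.md"
    # Stop at the repository root or at the filesystem root.
    if [ -e "$dir/.git" ] || [ "$dir" = "/" ]; then
      break
    fi
    dir="$(dirname "$dir")"
  done
  return 0
}
```

Run list_agents_files from any subdirectory of a project to print the candidate files, one per line.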

Example AGENTS.md:

# Project: MyApp API

## Architecture
- Backend: Node.js + Express + TypeScript
- Database: PostgreSQL via Prisma ORM
- Tests: Jest with supertest

## Build & Test Commands
- Build: `npm run build`
- Tests: `npm test -- --bail=1 --no-coverage`
- Lint: `npm run lint`

## Code Style
- Use arrow functions, not `function` keyword
- Prefer `const` over `let`
- All async functions must have explicit error handling
- Add JSDoc comments to all exported functions

## Security Rules
- Never log sensitive data (passwords, tokens, PII)
- Sanitize all external inputs before parsing
- Use parameterized queries, never string interpolation in SQL

Approval and Sandbox Modes

Codex provides escalating trust levels that control agent autonomy:

Mode | File Edits | Commands | Best For
read-only | Blocked | Blocked | Audits, code review
suggest (default) | Requires approval | Requires approval | Production code, learning
auto-edit | Auto-approved | Requires approval | Active development
full-auto | Auto-approved | Auto-approved | Daily dev (no network)

# Read-only (audits)
codex -s read-only "analyze the auth module for security issues"

# Full-auto (safe daily dev)
codex --full-auto "run tests and fix all failures"

# Full-auto with network
codex -a never -s workspace-write \
  -c 'sandbox_workspace_write.network_access=true' \
  "update all npm packages"

# Switch mid-session
/mode full-auto
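
For wrapper scripts, it can be convenient to keep the mode-to-flag mapping in one place. The sketch below uses only the flag combinations shown above. The function name codex_flags_for and the label full-auto-net are hypothetical, and modes without a flag example in this section are deliberately left out.

```shell
# Hypothetical helper: translate a trust level into the CLI flags shown above.
codex_flags_for() {
  case "$1" in
    read-only)     echo "-s read-only" ;;
    full-auto)     echo "--full-auto" ;;
    full-auto-net) echo "-a never -s workspace-write -c sandbox_workspace_write.network_access=true" ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}
```

Usage (left unquoted on purpose so the flags split into separate words): codex $(codex_flags_for read-only) "analyze the auth module for security issues"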

Popular MCP Integrations

MCP (Model Context Protocol) connects Codex to external tools and services through a standardized protocol.

Adding MCP Servers:

# CLI command
codex mcp add context7 -- npx -y @upstash/context7-mcp

# With env vars
codex mcp add my-server --env API_KEY=myvalue -- my-server-command

# List / manage
codex mcp list
codex mcp get context7
codex mcp remove context7

Config file (~/.codex/config.toml):

[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
startup_timeout_sec = 20

[mcp_servers.figma]
url = "https://mcp.figma.com/mcp"
bearer_token_env_var = "FIGMA_OAUTH_TOKEN"

Most Popular MCP Servers for Codex:

MCP Server | Purpose | Install
Context7 | Live library documentation | codex mcp add context7 -- npx -y @upstash/context7-mcp
Figma | Design-to-code translation | Remote: url = "https://mcp.figma.com/mcp"
GitHub | PR, issue, and repo management | codex mcp add github -- npx -y @modelcontextprotocol/server-github
Playwright | Browser automation and testing | Via npx command
Sentry | Error logs and debugging | Configure with Sentry token
Linear | Project tracking and bug triage | url = "https://mcp.linear.app/mcp"
Chrome DevTools | Browser inspection and control | HTTP-based server

Running Codex as an MCP Server:

Codex can expose itself as a tool for other agents, enabling multi-agent orchestration:

codex mcp-server

Workflow Patterns

Pattern 1: Safe Refactor Loop

# Step 1: Preview-only
codex -s read-only "refactor the auth module to use JWT"
# Review with /diff
# Step 2: Apply
codex --full-auto "apply the JWT refactoring"

Pattern 2: Test-Fix Loop

codex --full-auto "run all tests, fix failing ones, repeat until green"
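
An open-ended "repeat until green" prompt can burn tokens if the fixes never converge. One option is to move the loop into the shell and cap the number of attempts. This is a sketch only: fix_until_green is a hypothetical name, and both the test command and the fix command are passed in as strings, so nothing here depends on a particular test runner.

```shell
# Hypothetical helper: rerun tests, invoking a fix command between failures.
# $1 = max fix attempts, $2 = test command, $3 = fix command.
fix_until_green() {
  local max="$1" test_cmd="$2" fix_cmd="$3" attempt=0
  until sh -c "$test_cmd"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "tests still failing after $max fix attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sh -c "$fix_cmd"
  done
  return 0
}
```

Usage: fix_until_green 2 "npm test" 'codex exec --full-auto "fix the failing tests"'. With a cap of 2, the tests run at most three times and the fix command at most twice.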

Pattern 3: Code Review Before Commit

codex exec --sandbox read-only "review my uncommitted changes for bugs and security issues"

Pattern 4: CI/CD Integration (Headless)

codex exec --full-auto "run tests and report results" 2>&1 | tee output.log
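
One caveat with piping into tee: by default a shell pipeline reports the exit status of its last command, so a CI step can pass even when the command before the pipe failed. The sketch below assumes bash and scopes pipefail to a subshell. The name run_logged is hypothetical.

```shell
# Hypothetical helper (bash): log a command's output and keep its exit status.
# $1 = log file, remaining arguments = the command to run.
run_logged() {
  local log="$1"
  shift
  # pipefail makes the pipeline fail when the command fails, not only tee.
  # The subshell keeps the option from leaking into the calling script.
  ( set -o pipefail; "$@" 2>&1 | tee "$log" )
}
```

Usage: run_logged output.log codex exec --full-auto "run tests and report results"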

# JSON output for parsing
codex exec --json --output-last-message summary.txt \
  "analyze changed files and generate a changelog entry"

Pattern 5: Long-Running Feature Development

codex --full-auto "implement the complete user auth system:
1. Register/login/logout endpoints
2. JWT token generation and validation
3. Password hashing with bcrypt
4. Tests for all endpoints
5. Update API documentation"

Debugging Workflows

Error Explanation

codex "explain this error: TypeError: Cannot read properties of undefined (reading 'map') at src/components/List.jsx line 23"

Multi-File Bug Tracing

codex "trace the data flow from the API request to the database insert in the orders module"

Iterative Fix-Verify Loop

codex --full-auto "the calculateTotal function returns NaN when discount is 0. Find and fix the bug."

Log File Analysis

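# "codex exec -" reads the whole prompt from stdin, so send the instruction and the log together
{ echo "Analyze these logs and identify the root cause of the 500 errors:"; cat server.log; } | codex exec -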
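
Large logs can consume much of the context window, so it may be worth filtering before handing them over. The function below is a rough sketch. The name recent_5xx is hypothetical, and it assumes HTTP status codes appear as whitespace-delimited fields, which will not hold for every log format.

```shell
# Hypothetical helper: print the last N log lines that contain a 5xx status.
# Assumption: the status code is a standalone field, e.g. "GET /orders 500 3ms".
recent_5xx() {
  local file="$1" n="${2:-200}"
  grep -E '(^|[[:space:]])5[0-9]{2}([[:space:]]|$)' "$file" | tail -n "$n"
}
```

Usage: { echo "Identify the root cause of these 500 errors:"; recent_5xx server.log 100; } | codex exec -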

Codex Diagnostic Logs

Codex writes to ~/.codex/log/codex-tui.log. Use /feedback inside a session to capture diagnostics, including the Request ID to quote in bug reports.

# Debug logging
RUST_LOG=debug codex

# Tail logs
tail -F ~/.codex/log/codex-tui.log

Configuration Reference

Full config.toml

# ~/.codex/config.toml
# Top-level keys must come before the first [table] header:
# in TOML, a key written after a header belongs to that table.

# Model
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"

# Approval & Sandbox
approval_policy = "on-request"
sandbox_mode = "workspace-write"

# Project Instructions
project_doc_max_bytes = 65536

# UI
file_opener = "vscode"

[sandbox_workspace_write]
network_access = false

# History
[history]
persistence = "save-all"

# Profiles
[profiles.auto]
sandbox_mode = "danger-full-access"
approval_policy = "never"

[profiles.safe]
sandbox_mode = "read-only"

Key Environment Variables

Variable | Purpose
OPENAI_API_KEY | API authentication
CODEX_HOME | Override ~/.codex directory
OPENAI_BASE_URL | Override API base URL
RUST_LOG | Log verbosity (set RUST_LOG=debug for verbose logging)
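
CODEX_HOME makes it possible to try settings without touching the real ~/.codex directory. The sketch below creates a throwaway home with a conservative config. The name make_scratch_codex_home is hypothetical, and the two keys are taken from the config reference above.

```shell
# Hypothetical helper: create a throwaway Codex home and print its path.
make_scratch_codex_home() {
  local dir
  dir="$(mktemp -d)" || return 1
  cat > "$dir/config.toml" <<'EOF'
approval_policy = "on-request"
sandbox_mode = "read-only"
EOF
  printf '%s\n' "$dir"
}
```

Usage: CODEX_HOME="$(make_scratch_codex_home)" codex. Sign-in state kept in the regular directory is presumably not visible from the scratch one, so an API key or a fresh sign-in may be needed.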

Prompt Engineering Tips

  1. State intent, not syntax — "Add proper error handling to the payment flow" beats "wrap everything in try-catch"
  2. Provide full context — Include error messages verbatim, specify versions
  3. Break complex tasks — "Locate the bug" → "Propose fix" → "Write a test"
  4. Use @file mentions — Reference specific files to focus context
  5. Leverage AGENTS.md — Pre-load project conventions
  6. Specify acceptance criteria — "Implement X such that npm test passes"
  7. Use /compact during long sessions — Reclaim token budget
  8. Start new sessions for unrelated tasks — Avoid context contamination
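
Several of these tips (full context, verbatim errors, explicit acceptance criteria) can be folded into a small prompt template. The function below is one possible sketch. build_prompt and its three-part layout are hypothetical conventions, not a format Codex requires.

```shell
# Hypothetical helper: assemble a prompt from a task, acceptance criteria,
# and an optional file holding the verbatim error output.
build_prompt() {
  local task="$1" criteria="$2" error_file="${3:-}"
  printf 'Task: %s\n' "$task"
  printf 'Acceptance criteria: %s\n' "$criteria"
  if [ -n "$error_file" ] && [ -f "$error_file" ]; then
    printf 'Error output (verbatim):\n'
    cat "$error_file"
  fi
}
```

Usage: build_prompt "fix the checkout crash" "npm test passes" error.txt | codex exec -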

Limitations and Workarounds

Limitation | Workaround
Context window exhaustion (most common issue) | Start fresh sessions; chunk large tasks; use /compact
Opaque usage limits on ChatGPT plans | Switch to API key for predictable billing
No persistent memory across sessions | Use Memory MCP server or detailed AGENTS.md
Large codebase token burn | Use @file mentions; start from specific subdirectories
Token burn on retry loops | Add to AGENTS.md: "Stop after 2 failed attempts"
.git/ read-only in workspace-write | Run git commit yourself, or approve the command when Codex asks to run it outside the sandbox

Command Cheatsheet

# Interactive session
codex

# With prompt
codex "explain what this codebase does"

# With image
codex -i screenshot.png "implement this design"

# Non-interactive (CI)
codex exec "run tests and fix failures"

# Switch model
codex -m o3 "solve this hard problem"

# Full-auto mode
codex --full-auto "..."

# MCP management
codex mcp list    # other subcommands: add, get, remove

# TUI commands
/init     — Generate AGENTS.md
/compact  — Compress conversation
/diff     — Show pending changes
/status   — Show session info
/model    — Switch model
/mcp      — Show active MCP servers

Last updated: February 27, 2026 Built for authorityaitools.com — AI Coding Tools Directory