Intermediate · 4-8 hours

Mastering OpenAI Codex CLI — Skills, MCPs & Workflows

Master OpenAI Codex CLI — agents.md skills, MCP integrations, and advanced workflows.

Last reviewed Feb 27, 2026

Welcome to the OpenAI Codex CLI cookbook — a complete, practical guide to getting the most out of OpenAI's terminal-based agentic coding assistant. Whether you're generating features, debugging complex bugs, or automating entire development workflows, this cookbook covers the skills, configurations, MCP integrations, and battle-tested patterns used by top developers.

Codex CLI is OpenAI's open-source agentic coding assistant — a Rust-powered terminal tool that connects frontier reasoning models (GPT-5.3-Codex, o3, o4-mini) directly to your codebase, file system, and terminal. Launched April 2025, it now powers everything from solo debugging sessions to multi-agent enterprise workflows.

What is Codex CLI

Codex CLI is an open-source, terminal-based AI coding agent from OpenAI. Unlike cloud-based chat interfaces, it runs locally and operates directly on your files and terminal. It uses an agent loop: the model reasons about your request, executes shell commands and file patches, reads the results, and iterates until the task is complete.

Key Facts

Package: @openai/codex (npm) or brew install codex
GitHub: github.com/openai/codex (Apache 2.0)
Written in Rust (rewritten from Node.js mid-2025)
Default model: GPT-5.3-Codex (272K input / 128K output tokens)
Authentication: ChatGPT plan login (OAuth, started with codex login) or an API key

Installation

# npm (global)
npm install -g @openai/codex

# Homebrew (macOS)
brew install codex

# First run
codex

# Or with API key
export OPENAI_API_KEY="sk-..."
codex
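Before the first run it can help to confirm that the binary is on PATH and which credential will be picked up. The following is a minimal sketch under stated assumptions: `codex_preflight` is a hypothetical helper name, not part of Codex CLI, and it only inspects the local environment without contacting the API.

```shell
#!/bin/sh
# codex_preflight: hypothetical helper, not shipped with Codex CLI.
# Reports whether the codex binary and an API key are visible.
# Returns non-zero only when the binary is missing.
codex_preflight() {
    if ! command -v codex >/dev/null 2>&1; then
        echo "codex: not found on PATH (install with npm or Homebrew)"
        return 1
    fi
    echo "codex: found"
    if [ -n "${OPENAI_API_KEY:-}" ]; then
        echo "auth: OPENAI_API_KEY is set"
    else
        echo "auth: no API key in environment; Codex will likely prompt for login"
    fi
}
```

Calling `codex_preflight && codex` starts a session only when the check passes.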

The Codex Ecosystem (2026)

Surface	Description
Codex CLI	Terminal-based agent, local execution
Codex Web	Cloud-based async agent in ChatGPT sidebar
Codex Desktop App	macOS native app with multi-agent worktrees
Codex IDE Extension	VS Code, Cursor, Windsurf extensions
Codex as MCP Server	Expose Codex as a tool for other agents

Core Skills and Capabilities

Code Generation

Write entire functions, modules, or applications from natural language. Codex adapts to your existing project structure and conventions.

codex "create a REST API endpoint for user authentication using Express"
codex -i design.png "implement this UI design as React components"

Codebase Exploration

Navigate and understand large or legacy codebases using semantic search.

codex "explain how authentication works in this codebase"
codex "find where the auth token is set in this project"

Refactoring

Rename files, update imports, convert patterns across multiple files atomically.

codex --full-auto "convert all .js files in src/ to TypeScript"
codex --full-auto "rename all instances of UserModel to User across the codebase"

Test Writing

Generate unit tests, integration tests, and edge case coverage. Run test suites iteratively and fix failures in a loop.

codex "write comprehensive pytest tests for the payment module including edge cases"
codex --full-auto "run all tests and fix any failing ones"

Debugging

Read error logs and stack traces, trace failures across modules, apply targeted fixes and validate them.

codex "fix the TypeError in src/utils.js line 42"
codex "debug this test failure for the calculateTotal function"

Multi-File Editing

Codex uses its apply_patch tool to modify multiple files in a single patch, keeping changes consistent across related files.

Code Review

Analyze code for bugs, logic errors, and security issues. Run headless in CI/CD.

codex exec --sandbox read-only "review the changes in src/ for security issues and code quality"

Documentation Generation

Generate docstrings, README updates, changelogs, and API docs from code.

Long-Running Sessions

OpenAI reports Codex models working independently for more than 7 hours on a single task, which makes long sessions practical for large refactors and full feature implementations.

AGENTS.md — Project Instructions

AGENTS.md is the primary mechanism for giving Codex consistent, project-specific instructions. It's read before every task and acts like a README written for AI agents.

Discovery order (files are read top to bottom and merged, so guidance from files read later takes precedence when instructions conflict):

~/.codex/AGENTS.override.md: Global temporary override
~/.codex/AGENTS.md: Global persistent defaults
<git-root>/AGENTS.md → subdirectory files (closest to cwd wins)

Creating an AGENTS.md:

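# Option 1: Let Codex generate it (/init is typed inside an interactive session)
codex
/init

# Option 2: Write manually (this location holds the global defaults)
mkdir -p ~/.codex
"${EDITOR:-vi}" ~/.codex/AGENTS.md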

Example AGENTS.md:

# Project: MyApp API

## Architecture
- Backend: Node.js + Express + TypeScript
- Database: PostgreSQL via Prisma ORM
- Tests: Jest with supertest

## Build & Test Commands
- Build: `npm run build`
- Tests: `npm test -- --bail=1 --no-coverage`
- Lint: `npm run lint`

## Code Style
- Use arrow functions, not `function` keyword
- Prefer `const` over `let`
- All async functions must have explicit error handling
- Add JSDoc comments to all exported functions

## Security Rules
- Never log sensitive data (passwords, tokens, PII)
- Sanitize all external inputs before parsing
- Use parameterized queries, never string interpolation in SQL
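The discovery order described earlier can be made concrete with a small shell sketch that prints the instruction files it finds, from the global directory down to the current directory. This is a simplified illustration of the listed order, not Codex's actual loader: the function name `list_agents_files` and its arguments are hypothetical, and it assumes the working directory is the git root or somewhere below it.

```shell
#!/bin/sh
# list_agents_files GIT_ROOT CWD [CODEX_HOME]
# Prints existing instruction files in the order listed in this guide:
# global override, global defaults, then the git root down to CWD.
# Illustration only; assumes CWD is GIT_ROOT or a directory beneath it.
list_agents_files() {
    laf_root=${1%/}
    laf_cwd=${2%/}
    laf_home=${3:-${CODEX_HOME:-$HOME/.codex}}

    for laf_f in "$laf_home/AGENTS.override.md" "$laf_home/AGENTS.md"; do
        if [ -f "$laf_f" ]; then echo "$laf_f"; fi
    done

    laf_dir=$laf_root
    if [ -f "$laf_dir/AGENTS.md" ]; then echo "$laf_dir/AGENTS.md"; fi

    # Path of CWD relative to the git root ("" when they are the same).
    laf_rel=${laf_cwd#"$laf_root"}
    laf_rel=${laf_rel#/}

    # Descend one path component at a time toward CWD.
    while [ -n "$laf_rel" ]; do
        laf_dir="$laf_dir/${laf_rel%%/*}"
        if [ -f "$laf_dir/AGENTS.md" ]; then echo "$laf_dir/AGENTS.md"; fi
        case $laf_rel in
            */*) laf_rel=${laf_rel#*/} ;;
            *)   laf_rel= ;;
        esac
    done
    return 0
}
```

Running `list_agents_files "$(git rev-parse --show-toplevel)" "$PWD"` from inside a repository shows which files would be candidates for the current directory.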

Approval and Sandbox Modes

Codex provides escalating trust levels that control agent autonomy:

Mode	File Edits	Commands	Best For
read-only	Blocked	Blocked	Audits, code review
suggest (default)	Requires approval	Requires approval	Production code, learning
auto-edit	Auto-approved	Requires approval	Active development
full-auto	Auto-approved	Auto-approved	Daily dev (no network)

# Read-only (audits)
codex -s read-only "analyze the auth module for security issues"

# Full-auto (safe daily dev)
codex --full-auto "run tests and fix all failures"

# Full-auto with network
codex -a never -s workspace-write \
  -c 'sandbox_workspace_write.network_access=true' \
  "update all npm packages"

# Switch mid-session
/mode full-auto
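If the same few flag combinations come up repeatedly, a thin wrapper can keep them in one place. The sketch below maps a trust level to the flags used in the examples above; the level names (`audit`, `dev`, `dev-net`) and the function name are made up for illustration and are not Codex options.

```shell
#!/bin/sh
# codex_mode_args LEVEL
# Prints the flags from this guide's examples for a given trust level.
# The level names are hypothetical labels chosen for this sketch.
codex_mode_args() {
    case $1 in
        audit)   echo "-s read-only" ;;
        dev)     echo "--full-auto" ;;
        dev-net) echo "-a never -s workspace-write -c sandbox_workspace_write.network_access=true" ;;
        *)       echo "unknown level: $1" >&2; return 1 ;;
    esac
}
```

The output is meant to be expanded unquoted so it splits into separate arguments, for example `codex $(codex_mode_args audit) "analyze the auth module for security issues"`.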

Popular MCP Integrations

MCP (Model Context Protocol) connects Codex to external tools and services through a standardized protocol.

Adding MCP Servers:

# CLI command
codex mcp add context7 -- npx -y @upstash/context7-mcp

# With env vars
codex mcp add my-server --env API_KEY=myvalue -- my-server-command

# List / manage
codex mcp list
codex mcp get context7
codex mcp remove context7

Config file (~/.codex/config.toml):

[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
startup_timeout_sec = 20

[mcp_servers.figma]
url = "https://mcp.figma.com/mcp"
bearer_token_env_var = "FIGMA_OAUTH_TOKEN"

Most Popular MCP Servers for Codex:

MCP Server	Purpose	Install
Context7	Live library documentation	`codex mcp add context7 -- npx -y @upstash/context7-mcp`
Figma	Design-to-code translation	Remote: `url = "https://mcp.figma.com/mcp"`
GitHub	PR, issue, and repo management	`codex mcp add github -- npx -y @modelcontextprotocol/server-github`
Playwright	Browser automation and testing	Via npx command
Sentry	Error logs and debugging	Configure with Sentry token
Linear	Project tracking and bug triage	`url = "https://mcp.linear.app/mcp"`
Chrome DevTools	Browser inspection and control	HTTP-based server

Running Codex as an MCP Server

Codex can expose itself as a tool for other agents, enabling multi-agent orchestration:

codex mcp-server

Workflow Patterns

Pattern 1: Safe Refactor Loop

# Step 1: Preview-only
codex -s read-only "refactor the auth module to use JWT"
# Review with /diff
# Step 2: Apply
codex --full-auto "apply the JWT refactoring"

Pattern 2: Test-Fix Loop

codex --full-auto "run all tests, fix failing ones, repeat until green"
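Pattern 2 leaves the stopping condition to the model. One possible variant drives the loop from the shell so the number of attempts is capped. This is a sketch, not an official workflow: `fix_until_green` is a hypothetical helper, and it assumes the test command exits non-zero on failure.

```shell
#!/bin/sh
# fix_until_green MAX_ATTEMPTS TEST_COMMAND...
# Runs the test command; on failure asks Codex to fix it, then re-tests.
# Gives up after MAX_ATTEMPTS fix attempts. Hypothetical helper.
fix_until_green() {
    fug_max=$1
    shift
    fug_try=0
    while ! "$@"; do
        if [ "$fug_try" -ge "$fug_max" ]; then
            echo "still failing after $fug_max attempts" >&2
            return 1
        fi
        fug_try=$((fug_try + 1))
        # Same non-interactive invocation style used elsewhere in this guide.
        codex exec --full-auto "run all tests and fix any failing ones" || return 2
    done
    echo "green after $fug_try fix attempt(s)"
}
```

For example, `fix_until_green 2 npm test` stops after two unsuccessful fix attempts instead of retrying indefinitely.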

Pattern 3: Code Review Before Commit

codex exec --sandbox read-only "review my uncommitted changes for bugs and security issues"
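To make the review block a commit or a CI job, the reply has to be turned into an exit code. The sketch below asks for a machine-readable verdict line and checks for it. The `VERDICT:` convention and the `review_gate` name are assumptions of this example, not Codex features, and the model is not guaranteed to follow the requested format.

```shell
#!/bin/sh
# review_gate [LOGFILE]
# Runs the read-only review from Pattern 3 and converts the reply into an
# exit code: 0 = pass, 1 = fail or no verdict, 2 = codex itself failed.
# The "VERDICT:" line is a convention invented for this sketch.
review_gate() {
    rg_log=${1:-review.log}
    codex exec --sandbox read-only \
        "review my uncommitted changes for bugs and security issues. End your reply with a line reading exactly 'VERDICT: PASS' or 'VERDICT: FAIL'." \
        >"$rg_log" 2>&1 || return 2
    if grep -q '^VERDICT: PASS$' "$rg_log"; then
        return 0
    fi
    return 1
}
```

Treating a missing verdict as a failure keeps the gate conservative: an unparseable reply blocks the commit rather than letting it through.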

Pattern 4: CI/CD Integration (Headless)

codex exec --full-auto "run tests and report results" 2>&1 | tee output.log

# JSON output for parsing
codex exec --json --output-last-message summary.txt \
  "analyze changed files and generate a changelog entry"

Pattern 5: Long-Running Feature Development

codex --full-auto "implement the complete user auth system:
1. Register/login/logout endpoints
2. JWT token generation and validation
3. Password hashing with bcrypt
4. Tests for all endpoints
5. Update API documentation"

Debugging Workflows

Error Explanation

codex "explain this error: TypeError: Cannot read properties of undefined (reading 'map') at src/components/List.jsx line 23"

Multi-File Bug Tracing

codex "trace the data flow from the API request to the database insert in the orders module"

Iterative Fix-Verify Loop

codex --full-auto "the calculateTotal function returns NaN when discount is 0. Find and fix the bug."

Log File Analysis

{ echo "analyze these logs and identify the root cause of the 500 errors:"; cat server.log; } | codex exec -
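Whole log files can exhaust the context window quickly, so it may be worth trimming them first. A minimal sketch, assuming an access-log style format where the status code appears as a space-delimited field; `recent_500s` is a hypothetical helper name.

```shell
#!/bin/sh
# recent_500s LOGFILE [MAX_LINES]
# Keeps only lines that look like HTTP 500 responses and caps the count
# (default 200) so the prompt stays small. The " 500 " pattern is an
# assumption about the log format; adjust it for your own logs.
recent_500s() {
    grep ' 500 ' "$1" | tail -n "${2:-200}"
}
```

The filtered lines can then be embedded in the prompt, for example `codex exec "identify the root cause of these 500 errors: $(recent_500s server.log 50)"`.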

Codex Diagnostic Logs

Codex writes to ~/.codex/log/codex-tui.log. Use /feedback inside a session for instant diagnostics, including the request ID to quote in bug reports.

# Debug logging
RUST_LOG=debug codex

# Tail logs
tail -F ~/.codex/log/codex-tui.log

Configuration Reference

Full config.toml

# ~/.codex/config.toml
# Top-level keys must appear before the first [table] header.

# Model
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"

# Approval & Sandbox
approval_policy = "on-request"
sandbox_mode = "workspace-write"

# Project Instructions
project_doc_max_bytes = 65536

# UI
file_opener = "vscode"

[sandbox_workspace_write]
network_access = false

# History
[history]
persistence = "save-all"

# Profiles
[profiles.auto]
sandbox_mode = "danger-full-access"
approval_policy = "never"

[profiles.safe]
sandbox_mode = "read-only"
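Profiles such as `safe` above are only useful if something selects them. The sketch below picks the read-only profile for unattended runs. It assumes profiles are chosen with `--profile NAME` and that the `CI` environment variable, which many CI systems set, marks an unattended run; `codex_profile_args` is a hypothetical helper.

```shell
#!/bin/sh
# codex_profile_args
# Prints "--profile safe" when running under CI, nothing otherwise.
# Assumes the [profiles.safe] table from the config above exists.
codex_profile_args() {
    if [ -n "${CI:-}" ]; then
        echo "--profile safe"
    fi
}
```

Expanded unquoted, it slots into an existing command: `codex exec $(codex_profile_args) "review the changes in src/ for security issues and code quality"`.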

Key Environment Variables

Variable	Purpose
OPENAI_API_KEY	API authentication
CODEX_HOME	Override ~/.codex directory
OPENAI_BASE_URL	Override API base URL
RUST_LOG=debug	Enable verbose logging

Prompt Engineering Tips

State intent, not syntax — "Add proper error handling to the payment flow" beats "wrap everything in try-catch"
Provide full context — Include error messages verbatim, specify versions
Break complex tasks — "Locate the bug" → "Propose fix" → "Write a test"
Use @file mentions — Reference specific files to focus context
Leverage AGENTS.md — Pre-load project conventions
Specify acceptance criteria — "Implement X such that npm test passes"
Use /compact during long sessions — Reclaim token budget
Start new sessions for unrelated tasks — Avoid context contamination
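Several of these tips (state intent, include the error verbatim, specify acceptance criteria) can be combined into a small prompt template. The following is one possible shape, not a prescribed format; `build_prompt` and its section labels are invented for this example.

```shell
#!/bin/sh
# build_prompt TASK ERROR_TEXT ACCEPTANCE
# Assembles a prompt that states the intent, quotes the error verbatim,
# and ends with an explicit acceptance criterion.
build_prompt() {
    printf '%s\n\nError (verbatim):\n%s\n\nDone when: %s\n' "$1" "$2" "$3"
}
```

A call might look like `codex "$(build_prompt "Fix the crash in the order list" "$(cat error.txt)" "npm test passes")"`, where `error.txt` is a hypothetical file holding the captured error output.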

Limitations and Workarounds

Limitation	Workaround
Context window exhaustion (most common issue)	Start fresh sessions; chunk large tasks; use `/compact`
Opaque usage limits on ChatGPT plans	Switch to API key for predictable billing
No persistent memory across sessions	Use Memory MCP server or detailed AGENTS.md
Large codebase token burn	Use @file mentions; start from specific subdirectories
Token burn on retry loops	Add to AGENTS.md: "Stop after 2 failed attempts"
`.git/` read-only in workspace-write	Approve the git command when Codex asks to run it outside the sandbox, or run git commits yourself
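Workarounds that rely on AGENTS.md, such as the retry limit in the table, are easy to apply from a script. The sketch below appends a rule only when it is not already present, so it can be run repeatedly; `ensure_agents_rule` is a hypothetical helper, and the target file path is whatever AGENTS.md you choose to maintain.

```shell
#!/bin/sh
# ensure_agents_rule FILE RULE
# Appends RULE as a markdown bullet to FILE unless that exact line exists.
# Creates FILE if it is missing.
ensure_agents_rule() {
    ear_line="- $2"
    if [ -f "$1" ] && grep -qxF -e "$ear_line" "$1"; then
        return 0
    fi
    printf '%s\n' "$ear_line" >>"$1"
}
```

For example, `ensure_agents_rule AGENTS.md "Stop after 2 failed attempts"` adds the rule once and leaves the file unchanged on later runs.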

Command Cheatsheet

# Interactive session
codex

# With prompt
codex "explain what this codebase does"

# With image
codex -i screenshot.png "implement this design"

# Non-interactive (CI)
codex exec "run tests and fix failures"

# Switch model
codex -m o3 "solve this hard problem"

# Full-auto mode
codex --full-auto "..."

# MCP management
codex mcp add / list / get / remove

# TUI commands
/init     — Generate AGENTS.md
/compact  — Compress conversation
/diff     — Show pending changes
/status   — Show session info
/model    — Switch model
/mcp      — Show active MCP servers

Last updated: February 27, 2026 Built for authorityaitools.com — AI Coding Tools Directory