Intermediate · 4-8 hours
Mastering OpenAI Codex CLI — Skills, MCPs & Workflows
Master OpenAI Codex CLI — agents.md skills, MCP integrations, and advanced workflows.
Last reviewed Feb 27, 2026
Welcome to the OpenAI Codex CLI cookbook — a complete, practical guide to getting the most out of OpenAI's terminal-based agentic coding assistant. Whether you're generating features, debugging complex bugs, or automating entire development workflows, this cookbook covers the skills, configurations, MCP integrations, and battle-tested patterns used by top developers.
Codex CLI is OpenAI's open-source agentic coding assistant — a Rust-powered terminal tool that connects frontier reasoning models (GPT-5.3-Codex, o3, o4-mini) directly to your codebase, file system, and terminal. Launched April 2025, it now powers everything from solo debugging sessions to multi-agent enterprise workflows.
What is Codex CLI
Codex CLI is an open-source, terminal-based AI coding agent from OpenAI. Unlike cloud-based chat interfaces, it runs locally and operates directly on your files and terminal. It uses an agent loop: the model reasons about your request, executes shell commands and file patches, reads the results, and iterates until the task is complete.
Key Facts
- Package: @openai/codex (npm) or brew install codex
- GitHub: github.com/openai/codex (Apache 2.0)
- Written in Rust (rewritten from Node.js mid-2025)
- Default model: GPT-5.3-Codex (272K input / 128K output tokens)
- Authentication: OAuth via codex auth, API key, or ChatGPT plan login
Installation
# npm (global)
npm install -g @openai/codex
# Homebrew (macOS)
brew install codex
# First run
codex
# Or with API key
export OPENAI_API_KEY="sk-..."
codex
The Codex Ecosystem (2026)
| Surface | Description |
|---|---|
| Codex CLI | Terminal-based agent, local execution |
| Codex Web | Cloud-based async agent in ChatGPT sidebar |
| Codex Desktop App | macOS native app with multi-agent worktrees |
| Codex IDE Extension | VS Code, Cursor, Windsurf extensions |
| Codex as MCP Server | Expose Codex as a tool for other agents |
Core Skills and Capabilities
Code Generation
Write entire functions, modules, or applications from natural language. Codex adapts to your existing project structure and conventions.
codex "create a REST API endpoint for user authentication using Express"
codex -i design.png "implement this UI design as React components"
Codebase Exploration
Navigate and understand large or legacy codebases using semantic search.
codex "explain how authentication works in this codebase"
codex "find where the auth token is set in this project"
Refactoring
Rename files, update imports, convert patterns across multiple files atomically.
codex --full-auto "convert all .js files in src/ to TypeScript"
codex --full-auto "rename all instances of UserModel to User across the codebase"
Test Writing
Generate unit tests, integration tests, and edge case coverage. Run test suites iteratively and fix failures in a loop.
codex "write comprehensive pytest tests for the payment module including edge cases"
codex --full-auto "run all tests and fix any failing ones"
Debugging
Read error logs and stack traces, trace failures across modules, apply targeted fixes and validate them.
codex "fix the TypeError in src/utils.js line 42"
codex "debug this test failure for the calculateTotal function"
Multi-File Editing
Uses apply_patch to modify multiple files atomically — keeping changes consistent across related files.
Code Review
Analyze code for bugs, logic errors, and security issues. Run headless in CI/CD.
codex exec --sandbox read-only "review the changes in src/ for security issues and code quality"
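When a headless run like this gates a CI job, the job should pass or fail on the command's own exit status. If the output is also piped through a tool such as tee, the shell reports the last command's status by default, so a failed run can look green. The sketch below is one way around that, written as POSIX shell: it writes the combined output to a log file first and prints it afterwards instead of streaming it. The function name run_logged is hypothetical, and the Codex invocation is left commented out so the snippet runs without Codex installed.

```shell
# Run a command, capture stdout and stderr in a log file, print the log,
# and return the command's own exit status.
run_logged() {
  local log_file="$1" status=0
  shift
  # "|| status=$?" records a failure without aborting under "set -e".
  "$@" >"$log_file" 2>&1 || status=$?
  cat "$log_file"
  return "$status"
}

# Example CI step, reusing the review command from this section:
# run_logged review.log codex exec --sandbox read-only \
#   "review the changes in src/ for security issues and code quality" || exit 1
```

In bash, `set -o pipefail` is an alternative that keeps the streaming `| tee` form while still surfacing the failure.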
Documentation Generation
Generate docstrings, README updates, changelogs, and API docs from code.
Long-Running Sessions
Sustain tasks for up to 7 hours continuously — useful for large refactors and full feature implementations.
AGENTS.md — Project Instructions
AGENTS.md is the primary mechanism for giving Codex consistent, project-specific instructions. It's read before every task and acts like a README written for AI agents.
Discovery order (highest to lowest priority):
1. ~/.codex/AGENTS.override.md: global temporary override
2. ~/.codex/AGENTS.md: global persistent defaults
3. <git-root>/AGENTS.md, then subdirectory files (closest to cwd wins)
Creating an AGENTS.md:
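To see which instruction files would apply from a given directory, the discovery order can be sketched as a small shell function. This illustrates the order as this guide lists it and is not Codex's own lookup code. The function name list_agents_files is hypothetical, the repository root is assumed to be the nearest parent directory containing .git, and CODEX_HOME is assumed to relocate ~/.codex as described in the environment variable table.

```shell
# Print the AGENTS.md files that exist for a directory, in the order this
# guide describes: global override, global defaults, then project files
# from the working directory up to the repository root.
list_agents_files() {
  local start_dir="${1:-$PWD}"
  local codex_home="${CODEX_HOME:-$HOME/.codex}"
  local dir candidate

  for candidate in "$codex_home/AGENTS.override.md" "$codex_home/AGENTS.md"; do
    if [ -f "$candidate" ]; then printf '%s\n' "$candidate"; fi
  done

  # Resolve the start directory, then walk upward one level at a time.
  dir="$(cd "$start_dir" && pwd -P)" || return 1
  while :; do
    if [ -f "$dir/AGENTS.md" ]; then printf '%s\n' "$dir/AGENTS.md"; fi
    # Stop at the repository root (assumed: holds .git) or at /.
    if [ -e "$dir/.git" ] || [ "$dir" = "/" ]; then break; fi
    dir="$(dirname "$dir")"
  done
}

# Usage: list_agents_files src/api
```

Running it from a deep subdirectory is a quick way to spot a forgotten AGENTS.md higher up the tree that may be shaping Codex's behavior.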
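# Option 1: Let Codex generate it (start a session, then run the /init TUI command)
codex
/init
# Option 2: Write manually (global defaults shown; use <git-root>/AGENTS.md for a project)
mkdir -p ~/.codex
touch ~/.codex/AGENTS.md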
Example AGENTS.md:
# Project: MyApp API
## Architecture
- Backend: Node.js + Express + TypeScript
- Database: PostgreSQL via Prisma ORM
- Tests: Jest with supertest
## Build & Test Commands
- Build: `npm run build`
- Tests: `npm test -- --bail=1 --no-coverage`
- Lint: `npm run lint`
## Code Style
- Use arrow functions, not `function` keyword
- Prefer `const` over `let`
- All async functions must have explicit error handling
- Add JSDoc comments to all exported functions
## Security Rules
- Never log sensitive data (passwords, tokens, PII)
- Sanitize all external inputs before parsing
- Use parameterized queries, never string interpolation in SQL
Approval and Sandbox Modes
Codex provides escalating trust levels that control agent autonomy:
| Mode | File Edits | Commands | Best For |
|---|---|---|---|
| read-only | Blocked | Blocked | Audits, code review |
| suggest (default) | Requires approval | Requires approval | Production code, learning |
| auto-edit | Auto-approved | Requires approval | Active development |
| full-auto | Auto-approved | Auto-approved | Daily dev (no network) |
# Read-only (audits)
codex -s read-only "analyze the auth module for security issues"
# Full-auto (safe daily dev)
codex --full-auto "run tests and fix all failures"
# Full-auto with network
codex -a never -s workspace-write \
-c 'sandbox_workspace_write.network_access=true' \
"update all npm packages"
# Switch mid-session
/mode full-auto
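For scripts that call Codex at different trust levels, a small lookup keeps the flag combinations in one place. The sketch below only reuses the three flag sets shown in this section. The function name codex_flags and the level names audit, dev, and dev-net are hypothetical labels, not Codex terminology.

```shell
# Print the Codex flags for a named trust level.
codex_flags() {
  case "$1" in
    audit)   printf '%s\n' "-s read-only" ;;
    dev)     printf '%s\n' "--full-auto" ;;
    dev-net) printf '%s\n' "-a never -s workspace-write -c sandbox_workspace_write.network_access=true" ;;
    *)       echo "unknown level: $1" >&2; return 2 ;;
  esac
}

# Usage (the expansion is left unquoted on purpose so the flags split into words):
# codex $(codex_flags audit) "analyze the auth module for security issues"
```

An unknown level returns a non-zero status, so a typo in a script fails early instead of falling back to a more permissive mode.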
Popular MCP Integrations
MCP (Model Context Protocol) connects Codex to external tools and services through a standardized protocol. Adding MCP Servers:
# CLI command
codex mcp add context7 -- npx -y @upstash/context7-mcp
# With env vars
codex mcp add my-server --env API_KEY=myvalue -- my-server-command
# List / manage
codex mcp list
codex mcp get context7
codex mcp remove context7
Config file (~/.codex/config.toml):
[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
startup_timeout_sec = 20
[mcp_servers.figma]
url = "https://mcp.figma.com/mcp"
bearer_token_env_var = "FIGMA_OAUTH_TOKEN"
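The CLI form and the config-file form describe the same server: the first word after `--` becomes command, and the remaining words become args. A minimal sketch of that mapping follows, assuming a command-based server with no environment variables. The helper name mcp_stanza is hypothetical, and it does not escape quotes or backslashes inside arguments.

```shell
# Print a [mcp_servers.NAME] stanza for a command-based MCP server.
# Arguments: server name, command, then the command's arguments.
mcp_stanza() {
  local name="$1" cmd="$2" sep="" arg
  shift 2
  printf '[mcp_servers.%s]\n' "$name"
  printf 'command = "%s"\n' "$cmd"
  printf 'args = ['
  for arg in "$@"; do
    printf '%s"%s"' "$sep" "$arg"
    sep=", "
  done
  printf ']\n'
}

# Usage: review the output before adding it to ~/.codex/config.toml
# mcp_stanza context7 npx -y @upstash/context7-mcp
```

For the Context7 example, the output matches the first two keys of the stanza shown above; startup_timeout_sec would still be added by hand.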
Most Popular MCP Servers for Codex:
| MCP Server | Purpose | Install |
|---|---|---|
| Context7 | Live library documentation | codex mcp add context7 -- npx -y @upstash/context7-mcp |
| Figma | Design-to-code translation | Remote: url = "https://mcp.figma.com/mcp" |
| GitHub | PR, issue, and repo management | codex mcp add github -- npx -y @modelcontextprotocol/server-github |
| Playwright | Browser automation and testing | Via npx command |
| Sentry | Error logs and debugging | Configure with Sentry token |
| Linear | Project tracking and bug triage | url = "https://mcp.linear.app/mcp" |
| Chrome DevTools | Browser inspection and control | HTTP-based server |
Running Codex as an MCP Server:
Codex can expose itself as a tool for other agents, enabling multi-agent orchestration:
codex mcp-server
Workflow Patterns
Pattern 1: Safe Refactor Loop
# Step 1: Preview-only
codex -s read-only "refactor the auth module to use JWT"
# Review with /diff
# Step 2: Apply
codex --full-auto "apply the JWT refactoring"
Pattern 2: Test-Fix Loop
codex --full-auto "run all tests, fix failing ones, repeat until green"
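Leaving the loop entirely to the agent can burn tokens if the tests never go green. One alternative is to drive the loop from the shell with a hard cap on fix attempts. The sketch below is a generic bounded retry, not a Codex feature: the function name fix_until_green is hypothetical, and the test and fix commands are passed in as strings so any runner can be used.

```shell
# Run the test command; while it fails, run the fix command and retry,
# giving up once max_fixes fix attempts have been made.
fix_until_green() {
  local max_fixes="$1" test_cmd="$2" fix_cmd="$3"
  local fixes=0
  while :; do
    if sh -c "$test_cmd"; then
      echo "tests green after $fixes fix attempt(s)"
      return 0
    fi
    if [ "$fixes" -ge "$max_fixes" ]; then
      echo "giving up after $fixes fix attempt(s)" >&2
      return 1
    fi
    sh -c "$fix_cmd" || echo "fix command exited non-zero" >&2
    fixes=$((fixes + 1))
  done
}

# Usage, with a cap of two fix attempts:
# fix_until_green 2 "npm test" 'codex exec --full-auto "fix the failing tests"'
```

Each fix attempt starts a fresh non-interactive run, which also keeps one long session from filling its context window.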
Pattern 3: Code Review Before Commit
codex exec --sandbox read-only "review my uncommitted changes for bugs and security issues"
Pattern 4: CI/CD Integration (Headless)
codex exec --full-auto "run tests and report results" 2>&1 | tee output.log
# JSON output for parsing
codex exec --json --output-last-message summary.txt \
"analyze changed files and generate a changelog entry"
Pattern 5: Long-Running Feature Development
codex --full-auto "implement the complete user auth system:
1. Register/login/logout endpoints
2. JWT token generation and validation
3. Password hashing with bcrypt
4. Tests for all endpoints
5. Update API documentation"
Debugging Workflows
Error Explanation
codex "explain this error: TypeError: Cannot read properties of undefined (reading 'map') at src/components/List.jsx line 23"
Multi-File Bug Tracing
codex "trace the data flow from the API request to the database insert in the orders module"
Iterative Fix-Verify Loop
codex --full-auto "the calculateTotal function returns NaN when discount is 0. Find and fix the bug."
Log File Analysis
cat server.log | codex exec - "analyze these logs and identify the root cause of the 500 errors"
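Piping a whole log file can use up a large share of the context window. A pre-filter that keeps only the most recent matching lines is one way to shrink the input first. The sketch below assumes a hypothetical log format in which the HTTP status appears as a whitespace-delimited field. The function name recent_500s and the default of 200 lines are illustrative choices.

```shell
# Print the last max_lines lines of a log that contain a standalone "500".
recent_500s() {
  local log_file="$1" max_lines="${2:-200}"
  grep -E '(^|[[:space:]])500([[:space:]]|$)' "$log_file" | tail -n "$max_lines"
}

# Usage, feeding the filtered lines to the command shown above:
# recent_500s server.log | codex exec - "analyze these logs and identify the root cause of the 500 errors"
```

The pattern requires whitespace on both sides of 500, so values such as 1500ms are not matched. Lines just before each error are dropped too, so widen the filter if the cause is logged earlier than the failure.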
Codex Diagnostic Logs
Codex writes to ~/.codex/log/codex-tui.log. Use /feedback inside a session for instant diagnostics including Request ID for bug reports.
# Debug logging
RUST_LOG=debug codex
# Tail logs
tail -F ~/.codex/log/codex-tui.log
Configuration Reference
Full config.toml
# ~/.codex/config.toml
# Top-level keys must appear before the first [table] header;
# a key placed after a header belongs to that table.
# Model
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"
# Approval & Sandbox
approval_policy = "on-request"
sandbox_mode = "workspace-write"
# Project Instructions
project_doc_max_bytes = 65536
# UI
file_opener = "vscode"
# Sandbox settings for workspace-write
[sandbox_workspace_write]
network_access = false
# History
[history]
persistence = "save-all"
# Profiles
[profiles.auto]
sandbox_mode = "danger-full-access"
approval_policy = "never"
[profiles.safe]
sandbox_mode = "read-only"
Key Environment Variables
| Variable | Purpose |
|---|---|
| OPENAI_API_KEY | API authentication |
| CODEX_HOME | Override ~/.codex directory |
| OPENAI_BASE_URL | Override API base URL |
| RUST_LOG=debug | Enable verbose logging |
Prompt Engineering Tips
- State intent, not syntax: "Add proper error handling to the payment flow" beats "wrap everything in try-catch"
- Provide full context: include error messages verbatim, specify versions
- Break complex tasks: "Locate the bug" → "Propose fix" → "Write a test"
- Use @file mentions: reference specific files to focus context
- Leverage AGENTS.md: pre-load project conventions
- Specify acceptance criteria: "Implement X such that npm test passes"
- Use /compact during long sessions: reclaim token budget
- Start new sessions for unrelated tasks: avoid context contamination
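Several of these tips can be combined into one reusable prompt shape: an intent, the verbatim context, and an acceptance criterion. The sketch below is one possible template. The function name build_prompt and the section labels Goal, Context, and Done when are hypothetical choices, not a format Codex requires.

```shell
# Assemble a prompt from an intent, supporting context, and an
# acceptance criterion, separated by blank lines.
build_prompt() {
  local intent="$1" context="$2" acceptance="$3"
  printf 'Goal: %s\n\nContext:\n%s\n\nDone when: %s\n' \
    "$intent" "$context" "$acceptance"
}

# Usage:
# codex "$(build_prompt \
#   "Add proper error handling to the payment flow" \
#   "TypeError: Cannot read properties of undefined (reading 'map')" \
#   "npm test passes")"
```

Keeping the acceptance criterion as its own argument makes it harder to forget, and that line is what tells the agent when to stop.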
Limitations and Workarounds
| Limitation | Workaround |
|---|---|
| Context window exhaustion (most common issue) | Start fresh sessions; chunk large tasks; use /compact |
| Opaque usage limits on ChatGPT plans | Switch to API key for predictable billing |
| No persistent memory across sessions | Use Memory MCP server or detailed AGENTS.md |
| Large codebase token burn | Use @file mentions; start from specific subdirectories |
| Token burn on retry loops | Add to AGENTS.md: "Stop after 2 failed attempts" |
| .git/ read-only in workspace-write | Grant network: -c 'sandbox_workspace_write.network_access=true' |
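Before pointing Codex at a set of files, a rough size check can show whether they are likely to strain the context window. The sketch below uses the common rule of thumb of about four bytes per token. That ratio is an assumption and varies with language and tokenizer, so treat the result as an order-of-magnitude estimate only. The function name estimate_tokens is hypothetical.

```shell
# Estimate the token count of the given files as total bytes divided by 4.
estimate_tokens() {
  local total=0 size file
  for file in "$@"; do
    size=$(wc -c < "$file") || return 1
    total=$((total + size))
  done
  echo $((total / 4))
}

# Usage: compare the estimate against the model's input limit
# estimate_tokens src/*.ts
```

If the estimate is a large fraction of the input limit, chunking the task or starting from a subdirectory, as the table suggests, is probably worthwhile.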
Command Cheatsheet
# Interactive session
codex
# With prompt
codex "explain what this codebase does"
# With image
codex -i screenshot.png "implement this design"
# Non-interactive (CI)
codex exec "run tests and fix failures"
# Switch model
codex -m o3 "solve this hard problem"
# Full-auto mode
codex --full-auto "..."
# MCP management
codex mcp add / list / get / remove
# TUI commands
/init — Generate AGENTS.md
/compact — Compress conversation
/diff — Show pending changes
/status — Show session info
/model — Switch model
/mcp — Show active MCP servers
Last updated: February 27, 2026 Built for authorityaitools.com — AI Coding Tools Directory