June 5, 2025 · 2 min read

The Complete Guide to Codex Goal Mode

Goals are persistent objectives in Codex that keep a thread working toward a defined outcome across turns.

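Core Concepts

A Goal is a persistent objective in Codex that keeps a thread working toward a defined outcome. It gives Codex a completion condition: what should be true, how to check success, and which constraints must stay unchanged.

Prompt vs. Goal: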

Prompt: ask → work → result → wait
Goal: work → check → continue or complete

A regular prompt says "do this." A Goal says "keep working until this outcome is true."
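
The difference between the two loops can be sketched in a few lines of Python. This is only an illustration of the "work, check, continue or complete" pattern, not how Codex implements it; `run_goal`, `work`, `check`, and `max_iterations` are hypothetical names invented for this sketch.

```python
from typing import Callable


def run_goal(work: Callable[[], None], check: Callable[[], bool], max_iterations: int) -> str:
    """Repeat work -> check until the check passes or the iteration budget runs out."""
    for _ in range(max_iterations):
        work()
        if check():
            return "complete"
    return "budget_exhausted"


# Toy usage: the "work" step increments a counter; the goal is met at 3.
state = {"value": 0}


def increment() -> None:
    state["value"] += 1


result = run_goal(increment, lambda: state["value"] >= 3, max_iterations=10)
print(result, state["value"])
```

A plain prompt corresponds to calling `work()` once and returning; the goal loop keeps going because the completion check, not the end of a turn, decides when to stop.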

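When to Use a Goal

Good fit: tasks with a clear endpoint but an uncertain path to reach it.

Performance optimization
Flaky test investigation
Dependency migration
Bug hunts (that require reproduction)
Multi-step refactors
Benchmark-driven tuning
Research tasks (that need a final artifact)

Not a good fit:

One-line edits
Simple explanations
Short code reviews
Questions that stop after a single answer
Tasks with a vague endpoint ("make it better")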

Installation and Usage

# npm
npm install -g @openai/codex@latest
codex --version

# Homebrew
brew update
brew upgrade --cask codex
codex --version

Requires Codex 0.128.0 or later.

/goal <goal description>   # Set a Goal
/goal                      # Show the current Goal
/goal pause                # Pause
/goal resume               # Resume
/goal clear                # Clear

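Goal Lifecycle

Once a Goal is active, Codex can inspect code, run commands, make changes, and test results, continuing until a stop condition is met. A stop condition can be success, pause, clear, interruption, budget exhaustion, or a blocker that needs user input.

Active → (complete / pause / clear / budget exhausted / blocked) → Stopped
                    ↘ resume → Active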
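
The lifecycle can be read as a small state machine. The sketch below is one possible model, not Codex's internal representation: the state and event names are hypothetical, and it assumes that only a paused goal can be resumed and that an interruption pauses the goal.

```python
from enum import Enum


class GoalState(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    COMPLETE = "complete"
    CLEARED = "cleared"
    BUDGET_EXHAUSTED = "budget_exhausted"
    BLOCKED = "blocked"


# Events that stop an ACTIVE goal, mapped to the state they lead to.
# Assumption: an interruption is modeled as a pause.
STOP_EVENTS = {
    "complete": GoalState.COMPLETE,
    "pause": GoalState.PAUSED,
    "interrupt": GoalState.PAUSED,
    "clear": GoalState.CLEARED,
    "budget_exhausted": GoalState.BUDGET_EXHAUSTED,
    "blocked": GoalState.BLOCKED,
}


def transition(state: GoalState, event: str) -> GoalState:
    """Return the next state, or raise ValueError for a transition this model does not allow."""
    if state is GoalState.ACTIVE and event in STOP_EVENTS:
        return STOP_EVENTS[event]
    if state is GoalState.PAUSED and event == "resume":
        return GoalState.ACTIVE
    if state is GoalState.PAUSED and event == "clear":
        return GoalState.CLEARED
    raise ValueError(f"cannot apply {event!r} in state {state.value!r}")


paused = transition(GoalState.ACTIVE, "pause")
resumed = transition(paused, "resume")
print(paused.value, resumed.value)
```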

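Six Elements of a Strong Goal

Element	Description
Outcome	What should be true when the work is done
Verification surface	The tests, benchmarks, reports, or artifacts that prove completion
Constraints	What must not regress while the work is in progress
Boundaries	The files, tools, data, and repositories that may be used
Iteration policy	How to choose the next step after each attempt
Blocked stop condition	When to stop and report that no viable path remains

Template: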

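/goal <desired end state> verified by <specific evidence> while preserving <constraints>. Use <allowed inputs/tools/boundaries>. Between iterations, <how to choose the next step>. If blocked or no valid paths remain, <what to report and what input is needed to continue>.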
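
If goals are written often, the template can be filled in programmatically so that none of the six elements is forgotten. The following is a minimal sketch; `GoalSpec` and its field names are hypothetical helpers made up for this example and are not part of Codex.

```python
from dataclasses import dataclass, fields


@dataclass
class GoalSpec:
    outcome: str
    verification: str
    constraints: str
    boundaries: str
    iteration_policy: str
    blocked_report: str

    def __post_init__(self) -> None:
        # Reject a spec in which any of the six elements is left blank.
        for f in fields(self):
            if not getattr(self, f.name).strip():
                raise ValueError(f"missing goal element: {f.name}")

    def to_command(self) -> str:
        """Render the spec using the template from this article."""
        return (
            f"/goal {self.outcome} verified by {self.verification} "
            f"while preserving {self.constraints}. Use {self.boundaries}. "
            f"Between iterations, {self.iteration_policy}. "
            f"If blocked or no valid paths remain, {self.blocked_report}."
        )


# Illustrative values only.
spec = GoalSpec(
    outcome="Make the flaky login test pass 50 times in a row",
    verification="a repeated test run",
    constraints="the rest of the test suite",
    boundaries="only the auth module and its tests",
    iteration_policy="record the hypothesis tested and the result",
    blocked_report="stop and list the attempted paths and the input needed",
)
command = spec.to_command()
print(command)
```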

Example Comparison

Weak Goal:

/goal Improve performance

Strong Goal:

/goal Reduce p95 checkout latency below 120 ms, verified by the checkout benchmark, while keeping the correctness suite green. Use only the checkout service, benchmark fixtures, and related tests. Between iterations, record what changed, what the benchmark showed, and the next best experiment to try. If the benchmark cannot run or no valid paths remain, stop with the attempted paths, the evidence gathered, the blocker, and the next input needed.
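
What makes this goal strong is that its verification surface can be checked mechanically. As a rough sketch of such a check, assuming the benchmark emits a list of latency samples in milliseconds: the function names, the nearest-rank percentile method, and the sample data below are all assumptions for illustration, not details of any real checkout benchmark.

```python
def p95(samples_ms: list[float]) -> float:
    """95th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed with integers to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def goal_met(samples_ms: list[float], correctness_suite_green: bool, threshold_ms: float = 120.0) -> bool:
    """The outcome holds only if p95 is under the threshold AND the constraint is preserved."""
    return correctness_suite_green and p95(samples_ms) < threshold_ms


# Hypothetical benchmark output: 19 fast requests and one slow outlier.
samples = [100.0] * 19 + [500.0]
print(p95(samples), goal_met(samples, correctness_suite_green=True))
```

Note that a fast benchmark with a failing correctness suite does not count as done, which is the role the "while keeping" clause plays in the goal text.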

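Architecture

A Goal is durable, thread-scoped state. It is neither global memory nor a project-level instruction. The goal belongs to the thread and lives alongside the relevant context: files, commands, diffs, logs, and the reasoning trail.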

┌─────────────────────────────────┐
│         Thread Context          │
│  ┌───────────────────────────┐  │
│  │     Goal (durable state)  │  │
│  │  - Objective              │  │
│  │  - Lifecycle              │  │
│  │  - Budget accounting      │  │
│  │  - Progress tracking      │  │
│  └───────────────────────────┘  │
│  + files, commands, diffs,      │
│    logs, reasoning trail        │
└─────────────────────────────────┘

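Continuation is event-driven and is checked only at safe boundaries: after a turn completes, when no other work is pending, when no user input is queued, and when the thread is idle.

Conservative by design:

Plan-only work does not trigger continuation
An interruption pauses the goal
If a continuation turn makes no tool calls, the next automatic continuation is suppressed (to prevent idle spinning)

Research Goal Example: Reproducing the Deep Hedging Paper

Weak research Goal: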
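
The rules above amount to a gate that must pass before another turn starts automatically. The sketch below models that gate under stated assumptions: `ThreadSnapshot` and all of its fields are hypothetical names chosen for this example, and the real scheduler may track this state differently.

```python
from dataclasses import dataclass


@dataclass
class ThreadSnapshot:
    goal_active: bool
    turn_finished: bool
    pending_work: bool
    queued_user_input: bool
    last_turn_plan_only: bool
    last_continuation_without_tool_calls: bool


def should_auto_continue(s: ThreadSnapshot) -> bool:
    """Allow automatic continuation only when the thread sits at a safe, idle boundary."""
    if not s.goal_active or not s.turn_finished:
        return False
    if s.pending_work or s.queued_user_input:
        return False
    if s.last_turn_plan_only:
        # Plan-only work does not trigger continuation.
        return False
    if s.last_continuation_without_tool_calls:
        # Suppress the next continuation to avoid spinning without progress.
        return False
    return True


idle = ThreadSnapshot(
    goal_active=True,
    turn_finished=True,
    pending_work=False,
    queued_user_input=False,
    last_turn_plan_only=False,
    last_continuation_without_tool_calls=False,
)
print(should_auto_continue(idle))
```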

/goal Reproduce Buehler et al., "Deep Hedging"

Strong research Goal:

/goal Produce the strongest evidence-backed reproduction of Buehler et al., "Deep Hedging," using the available paper materials and local resources. Attempt every headline result, verify the outputs, and end with a report that separates reproduced mechanics, approximate trained results, blocked exact replay, and remaining uncertainty.

The final report should classify each claim by its level of evidence:

Claim: Deep hedging approximates complete-market Heston hedge without transaction costs.
Route: Rebuilt model mechanics, reference hedge comparison, and trained neural policy.
Evidence surface: Price checks, histograms, and hedge surfaces.
Status: Close approximate reproduction.
Remaining uncertainty: Original training paths, seeds, and checkpoints are unavailable.
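
One way to keep such report entries consistent is to give each claim a fixed record shape. This is a small sketch only; `ClaimRecord` is a hypothetical structure whose fields simply mirror the five labels in the entry above.

```python
from dataclasses import dataclass


@dataclass
class ClaimRecord:
    claim: str
    route: str
    evidence_surface: str
    status: str
    remaining_uncertainty: str

    def render(self) -> str:
        """Format the record as the five-line entry used in the report."""
        return "\n".join([
            f"Claim: {self.claim}",
            f"Route: {self.route}",
            f"Evidence surface: {self.evidence_surface}",
            f"Status: {self.status}",
            f"Remaining uncertainty: {self.remaining_uncertainty}",
        ])


record = ClaimRecord(
    claim="Deep hedging approximates complete-market Heston hedge without transaction costs.",
    route="Rebuilt model mechanics, reference hedge comparison, and trained neural policy.",
    evidence_surface="Price checks, histograms, and hedge surfaces.",
    status="Close approximate reproduction.",
    remaining_uncertainty="Original training paths, seeds, and checkpoints are unavailable.",
)
rendered = record.render()
print(rendered)
```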

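Real-World Cases

Use cases from Claire Vo (ChatPRD):

Eliminating Sentry errors: a 5-hour autonomous run that cleared thousands of error log entries
Cleaning up email: 3,900 emails reduced to 68 in under 4 hours
Triaging Linear tasks: hundreds of project management tasks categorized automatically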

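Key Insight

A Goal is not unbounded background autonomy. It is a scoped, user-controllable completion contract. You define the outcome, Codex works from the evidence in the thread, and the Goal can be paused, resumed, cleared, completed, or stopped by its budget.

The architecture is not the point. The point is that a Goal does not just ask Codex to finish; it defines what "finished" means.

Source: @dkundel on X | OpenAI Cookbook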