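Layer 1: Agent Loop (Core Engine)

Core Question

The agent loop is the heart of the whole system. It is not a simple while (true) { callAPI(); runTools(); } but a complex state machine with multiple recovery strategies, explicit state transitions, and streaming.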

1. Architecture Overview

                    ┌─────────────────────────┐
                    │     query() entry        │
                    │  AsyncGenerator<Event>   │
                    └──────────┬──────────────┘
                               │
                    ┌──────────▼──────────────┐
                    │     queryLoop()          │
                    │     while (true) {       │
                    │       ...                │
                    │     }                    │
                    └──────────┬──────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                     ▼
   ┌─────────────┐   ┌───────────────┐    ┌───────────────┐
   │ STREAM phase│   │  TOOL exec    │    │  State change │
   │ callModel() │   │  StreamExec   │    │  transitions  │
   └─────────────┘   └───────────────┘    └───────────────┘

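Key Files

File	Responsibility	Size
`query.ts`	Main agent loop logic	~69KB
`QueryEngine.ts`	Query engine coordinator	~47KB
`query/config.ts`	Query configuration assembly
`query/transitions.ts`	State transition definitions
`query/tokenBudget.ts`	Token budget control
`query/stopHooks.ts`	Stop hooks
`services/api/claude.ts`	API communication layer
`utils/messages.ts`	Streaming message reconstruction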

2. query(): Entry Point

// query.ts:219-239
export async function* query(params: QueryParams): AsyncGenerator<
  StreamEvent | RequestStartEvent | Message | TombstoneMessage | ToolUseSummaryMessage,
  Terminal  // the generator's return value is the terminal state
> {
  const consumedCommandUuids: string[] = []
  const terminal = yield* queryLoop(params, consumedCommandUuids)
  // Notify that the consumed commands have completed
  for (const uuid of consumedCommandUuids) {
    notifyCommandLifecycle(uuid, 'completed')
  }
  return terminal
}

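Key design: query() is an AsyncGenerator. Instead of handing back only a final result, it yields intermediate events as a stream (streamed text, tool progress, messages, and so on) and returns the terminal state when the loop ends. This lets the caller (REPL/SDK) render progress in real time.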
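The yield/return split can be illustrated with a minimal sketch. Everything here (DemoEvent, DemoTerminal, demoLoop, demoQuery, drain) is a hypothetical stand-in, not the real query.ts types; the point is only how yield* forwards events and how a caller reads the return value.

```typescript
// Hypothetical event and terminal types, for illustration only.
type DemoEvent =
  | { type: "text"; text: string }
  | { type: "tool_progress"; tool: string };
type DemoTerminal = { reason: "completed"; turns: number };

// Inner loop: yields intermediate events, then returns a terminal state.
async function* demoLoop(): AsyncGenerator<DemoEvent, DemoTerminal> {
  yield { type: "text", text: "Reading file" };
  yield { type: "tool_progress", tool: "Read" };
  return { reason: "completed", turns: 1 };
}

// Entry point: yield* forwards every inner event to the caller and
// evaluates to the inner generator's return value.
async function* demoQuery(): AsyncGenerator<DemoEvent, DemoTerminal> {
  const terminal = yield* demoLoop();
  return terminal;
}

// A caller that needs the return value drives the generator with next();
// a for await...of loop would discard it.
async function drain(gen: AsyncGenerator<DemoEvent, DemoTerminal>) {
  const events: DemoEvent[] = [];
  while (true) {
    const step = await gen.next();
    if (step.done) return { events, terminal: step.value };
    events.push(step.value);
  }
}
```

A UI that only renders progress can use for await...of; a caller that also needs the terminal state has to use next() as drain does.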

3. Loop State (Mutable State)

// query.ts:204-217
type State = {
  messages: Message[]                    // current message history
  toolUseContext: ToolUseContext          // tool execution context
  autoCompactTracking: AutoCompactTrackingState | undefined
  maxOutputTokensRecoveryCount: number   // number of max_tokens recovery attempts
  hasAttemptedReactiveCompact: boolean   // whether reactive compaction was already tried
  maxOutputTokensOverride: number | undefined
  pendingToolUseSummary: Promise<...>    // tool-use summary (async)
  stopHookActive: boolean | undefined    // whether a Stop hook is active
  turnCount: number                      // current turn count
  transition: Continue | undefined       // why the previous iteration continued
}

Transition Reason

Every time the loop continues, it must record why:

type ContinueReason =
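  | 'collapse_drain_retry'       // retry after a context-collapse drain
  | 'reactive_compact_retry'     // retry after reactive compaction
  | 'max_output_tokens_escalate' // raise max_output_tokens (8K→64K)
  | 'max_output_tokens_recovery' // multi-turn truncation recovery
  | 'token_budget_continuation'  // budget not used up, continue writing automatically
  | 'next_turn'                  // normal case: tool results returned, conversation continues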

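★ Design insight: an explicit transition reason is a better design than an implicit continue. It makes the code more readable (every continue has clear semantics) and the telemetry more precise (the frequency of each recovery path can be counted).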
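Both benefits can be sketched in a few lines. The reason names are taken from the union above; countTransitions, isRecovery, and transitionLog are hypothetical helpers introduced only for this illustration.

```typescript
type ContinueReason =
  | "collapse_drain_retry"
  | "reactive_compact_retry"
  | "max_output_tokens_escalate"
  | "max_output_tokens_recovery"
  | "token_budget_continuation"
  | "next_turn";

type Continue = { reason: ContinueReason };

// Hypothetical telemetry helper: count how often each reason occurred.
function countTransitions(log: Continue[]): Map<ContinueReason, number> {
  const counts = new Map<ContinueReason, number>();
  for (const t of log) {
    counts.set(t.reason, (counts.get(t.reason) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical classifier. The exhaustive switch stops compiling if a new
// reason is added to the union without being handled here.
function isRecovery(reason: ContinueReason): boolean {
  switch (reason) {
    case "next_turn":
    case "token_budget_continuation":
      return false;
    case "collapse_drain_retry":
    case "reactive_compact_retry":
    case "max_output_tokens_escalate":
    case "max_output_tokens_recovery":
      return true;
    default: {
      const unreachable: never = reason;
      return unreachable;
    }
  }
}

const transitionLog: Continue[] = [
  { reason: "next_turn" },
  { reason: "reactive_compact_retry" },
  { reason: "next_turn" },
];
const transitionCounts = countTransitions(transitionLog);
```

The same log doubles as a debugging aid: reading it in order shows the chain of transitions that led to the current state.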

4. The 10 Stages of the Loop

while (true) {
  // ═══════════ Stage 1: API call & streaming ═══════════
  for await (const message of deps.callModel({
    messages, systemPrompt, tools, ...
  })) {
    // Handle streaming events: text, tool_use, thinking
    // StreamingToolExecutor runs tools as their blocks arrive
  }

  // ═══════════ Stage 2: Tool summary (async) ═══════════
  // Use Haiku to generate a readable summary of the tool calls
  pendingToolUseSummary = generateSummary(toolUseBlocks)

  // ═══════════ Stage 3: Abort check ═══════════
  if (aborted) {
    // Generate synthetic tool_result blocks for unfinished tools
    // so the API does not reject the history (missing tool_result)
  }

  // ═══════════ Stage 4: Error recovery ═══════════
  if (error413) {
    // Collapse drain → Reactive compact → Surface error
    // Each step may continue (retry)
  }

  // ═══════════ Stage 5: Max output tokens recovery ═══════════
  if (stopReason === 'max_tokens') {
    // Escalate the limit → multi-turn recovery → continue
  }

  // ═══════════ Stage 6: Stop hooks ═══════════
  for await (const hookResult of runStopHooks(...)) {
    if (hookResult.blockingError) {
      // The hook says "do not stop, keep the conversation going"
      // Inject the hook output as a new user message
    }
  }

  // ═══════════ Stage 7: Completion check ═══════════
  if (stopReason === 'end_turn' && noToolCalls) {
    // Token budget check (pseudocode: under 90% of the budget and no diminishing returns)
    if (budget < 90% && !diminishing) continue
    // Really done
    return Terminal
  }

  // ═══════════ Stage 8: Tool execution ═══════════
  // StreamingToolExecutor already started executing in stage 1
  // Collect the remaining results here
  for await (const update of executor.getRemainingResults()) {
    messages.push(update.message)
  }

  // ═══════════ Stage 9: Attachment generation ═══════════
  // Check whether tools triggered file changes
  // Inject CLAUDE.md memory, Skill changes, command changes
  // Nested memory attachments (file path matches a paths: rule in CLAUDE.md)

  // ═══════════ Stage 10: State update & loop ═══════════
  state = {
    messages: [...oldMessages, ...assistantMessages, ...toolResults],
    turnCount: turnCount + 1,
    transition: { reason: 'next_turn' },
  }
  continue  // → back to stage 1
}
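Stage 3 exists because the Messages API expects every tool_use block in the history to be answered by a tool_result with the same id. A minimal sketch of the idea follows; synthesizeMissingResults, the block shapes, and the placeholder text are assumptions for illustration, not the actual query.ts code.

```typescript
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Hypothetical helper: for every tool_use without a finished result,
// fabricate an error result so the history stays well-formed.
function synthesizeMissingResults(
  toolUses: ToolUseBlock[],
  completed: ToolResultBlock[],
): ToolResultBlock[] {
  const finished = new Set(completed.map((r) => r.tool_use_id));
  return toolUses
    .filter((t) => !finished.has(t.id))
    .map((t): ToolResultBlock => ({
      type: "tool_result",
      tool_use_id: t.id,
      content: "Interrupted before the tool finished", // placeholder wording
      is_error: true,
    }));
}

const pendingToolUses: ToolUseBlock[] = [
  { type: "tool_use", id: "tu_1", name: "Read", input: { path: "a.ts" } },
  { type: "tool_use", id: "tu_2", name: "Bash", input: { command: "npm test" } },
];
const finishedResults: ToolResultBlock[] = [
  { type: "tool_result", tool_use_id: "tu_1", content: "file contents" },
];
const syntheticResults = synthesizeMissingResults(pendingToolUses, finishedResults);
```

Here only tu_2 was interrupted, so exactly one synthetic result is produced for it.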

5. Streaming Pipeline

API Call Layer

// query.ts:659-707: the callModel call
for await (const message of deps.callModel({
  messages: prependUserContext(messagesForQuery, userContext),
  systemPrompt: fullSystemPrompt,
  tools: toolUseContext.options.tools,
  signal: toolUseContext.abortController.signal,
  options: {
    model: currentModel,
    fastMode: appState.fastMode,
    maxOutputTokensOverride,
    effortValue: appState.effortValue,
    // ...
  },
})) {
  // Handle each streaming event
}

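Message Reconstruction

// utils/messages.ts: handleMessageFromStream()
// Rebuilds a complete AssistantMessage from the API's incremental deltas

ContentBlockStartEvent  → create a new content block (text/tool_use/thinking)
ContentBlockDeltaEvent  → append incremental content
  ├── TextDelta         → accumulate text
  ├── InputJsonDelta    → accumulate tool input JSON
  └── ThinkingDelta     → accumulate thinking content
MessageStopEvent        → finalize the message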
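The accumulation can be sketched as a small reducer. The event and delta type strings follow the Anthropic streaming format (content_block_start, content_block_delta, text_delta, input_json_delta, thinking_delta, message_stop); rebuildMessage and the RebuiltBlock shape are assumptions, since the body of handleMessageFromStream() is not shown here.

```typescript
type StreamDelta =
  | { type: "text_delta"; text: string }
  | { type: "input_json_delta"; partial_json: string }
  | { type: "thinking_delta"; thinking: string };

type StreamEvt =
  | {
      type: "content_block_start";
      index: number;
      content_block:
        | { type: "text" }
        | { type: "thinking" }
        | { type: "tool_use"; id: string; name: string };
    }
  | { type: "content_block_delta"; index: number; delta: StreamDelta }
  | { type: "message_stop" };

type RebuiltBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "tool_use"; id: string; name: string; inputJson: string; input?: unknown };

function rebuildMessage(events: StreamEvt[]): RebuiltBlock[] {
  const blocks: RebuiltBlock[] = [];
  for (const ev of events) {
    if (ev.type === "content_block_start") {
      const cb = ev.content_block;
      blocks[ev.index] =
        cb.type === "tool_use"
          ? { type: "tool_use", id: cb.id, name: cb.name, inputJson: "" }
          : cb.type === "text"
            ? { type: "text", text: "" }
            : { type: "thinking", thinking: "" };
    } else if (ev.type === "content_block_delta") {
      const block = blocks[ev.index];
      const d = ev.delta;
      if (!block) continue; // delta for a block that never started: ignore
      if (block.type === "text" && d.type === "text_delta") block.text += d.text;
      else if (block.type === "tool_use" && d.type === "input_json_delta") block.inputJson += d.partial_json;
      else if (block.type === "thinking" && d.type === "thinking_delta") block.thinking += d.thinking;
    } else {
      // message_stop: the tool input JSON is complete only now, so parse it here.
      for (const b of blocks) {
        if (b.type === "tool_use") b.input = JSON.parse(b.inputJson || "{}");
      }
    }
  }
  return blocks;
}

const sampleEvents: StreamEvt[] = [
  { type: "content_block_start", index: 0, content_block: { type: "text" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "lo" } },
  { type: "content_block_start", index: 1, content_block: { type: "tool_use", id: "tu_1", name: "Read" } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: '{"path":' } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: '"a.ts"}' } },
  { type: "message_stop" },
];
const rebuiltBlocks = rebuildMessage(sampleEvents);
```

The tool input arrives as JSON fragments that are not valid on their own, which is why the sketch keeps the raw string and parses it only once the fragments are complete.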

Streaming Tool Execution

API stream:  [text] ─── [tool_use:A] ─── [tool_use:B] ─── [end]
                              │                │
                              ▼                ▼
Executor:    ─────── [A starts] ───── [B starts] ─── [A done] ─── [B done]
                                    (in parallel, if both are concurrencySafe)

Key point: tool execution starts before the API returns [end]. Tool execution and API streaming run in parallel.
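A toy executor makes the scheduling rule concrete. MiniStreamingExecutor below is a hypothetical sketch, not the real StreamingToolExecutor: it assumes concurrency-safe tools may overlap with each other, while any other tool waits for everything submitted before it.

```typescript
type ToolCall = { id: string; concurrencySafe: boolean; run: () => Promise<string> };
type ToolOutcome = { id: string; output: string };

class MiniStreamingExecutor {
  private outcomes: Promise<ToolOutcome>[] = [];
  // Settles when every tool added so far has finished.
  private allSoFar: Promise<void> = Promise.resolve();
  // Settles when the most recent non-safe tool has finished.
  private lastExclusive: Promise<void> = Promise.resolve();

  // Called as soon as a tool_use block is complete, while the stream is still open.
  add(call: ToolCall): void {
    const gate = call.concurrencySafe ? this.lastExclusive : this.allSoFar;
    const outcome = gate
      .then(() => call.run())
      .then((output) => ({ id: call.id, output }));
    this.outcomes.push(outcome);
    const settled = outcome.then(() => undefined, () => undefined);
    this.allSoFar = Promise.all([this.allSoFar, settled]).then(() => undefined);
    if (!call.concurrencySafe) this.lastExclusive = settled;
  }

  // Yields results in submission order once the stream has ended.
  async *getRemainingResults(): AsyncGenerator<ToolOutcome> {
    for (const outcome of this.outcomes) yield await outcome;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demoStreamingExecution(): Promise<{ log: string[]; order: string[] }> {
  const log: string[] = [];
  const makeTool = (id: string, ms: number): ToolCall => ({
    id,
    concurrencySafe: true,
    run: async () => {
      log.push(`${id}:start`);
      await sleep(ms);
      log.push(`${id}:end`);
      return `${id} ok`;
    },
  });
  const executor = new MiniStreamingExecutor();
  executor.add(makeTool("A", 20)); // tool_use:A arrives mid-stream
  await sleep(1);                  // the stream keeps going
  executor.add(makeTool("B", 5));  // tool_use:B arrives while A is still running
  const order: string[] = [];
  for await (const result of executor.getRemainingResults()) order.push(result.id);
  return { log, order };
}
```

In the demo both tools are marked safe, so B starts before A has finished; results are still handed back in submission order so the message history stays deterministic.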

6. Fallback Model Mechanism

When streaming from the primary model fails:

// query.ts:894-950
if (error instanceof FallbackTriggeredError) {
  // 1. Yield tombstones (mark the thinking blocks that were lost)
  // 2. Discard in-flight tool results
  // 3. Strip signature blocks (thinking signatures are bound to a model)
  // 4. Retry with the fallback model
  // 5. Record telemetry

  state = {
    messages: stripSignatureBlocks(messagesForQuery),
    transition: { reason: 'fallback_retry', fallbackModel },
  }
  continue
}

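★ Design insight: the signature on a thinking block is model-specific. When falling back from model A to model B, model B cannot verify model A's thinking signature, so it must be stripped. This detail is easy to overlook.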
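What "stripping" means in practice is not spelled out here. One plausible reading, sketched below under that assumption, is to drop signed thinking blocks from assistant messages before replaying the history to the fallback model. stripSignedThinking and the block shapes are hypothetical and are not the real stripSignatureBlocks.

```typescript
type HistoryBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type HistoryMessage = { role: "user" | "assistant"; content: HistoryBlock[] };

// Assumption: removing the whole thinking block is acceptable, because a
// thinking block without a verifiable signature cannot be replayed as-is.
function stripSignedThinking(messages: HistoryMessage[]): HistoryMessage[] {
  return messages.map((m) =>
    m.role === "assistant"
      ? { ...m, content: m.content.filter((b) => b.type !== "thinking") }
      : m,
  );
}

const historyBeforeFallback: HistoryMessage[] = [
  { role: "user", content: [{ type: "text", text: "Fix the failing test" }] },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "The test expects...", signature: "sig-from-model-A" },
      { type: "text", text: "I will look at the test file." },
    ],
  },
];
const historyForFallback = stripSignedThinking(historyBeforeFallback);
```

The function returns new message objects rather than mutating the input, so the original history remains available for tombstones and telemetry.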

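7. Stop Hook Mechanism

When the agent is about to finish its reply, Stop hooks can keep the conversation going:

Agent produces a complete reply (stop_reason: 'end_turn')
  │
  ├── Run Stop hooks
  │     ├── exit 0 → finish normally
  │     └── exit 2 → "Don't stop!" + stderr becomes a new user message
  │
  └── A hook blocked the stop → conversation continues
        └── Inject the hook's stderr as context
            → back to the top of the loop, call the API again

Use cases:

A CI/CD hook checks whether all tests pass; if not, the agent keeps fixing them
A code review hook checks whether the change follows the conventions; if not, the agent keeps revising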
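The exit-code contract above can be expressed as a pure decision function. This is a sketch under assumptions: HookRun, StopDecision, and decideAfterStopHooks are hypothetical names, exit codes other than 0 and 2 are treated as non-blocking, and the stderr of several blocking hooks is simply joined.

```typescript
type HookRun = { exitCode: number; stderr: string };
type StopDecision =
  | { action: "stop" }
  | { action: "continue"; userMessage: string };

// Exit 2 blocks the stop and feeds stderr back as a new user message.
// Assumption: any other exit code does not block.
function decideAfterStopHooks(runs: HookRun[]): StopDecision {
  const blocking = runs.filter((r) => r.exitCode === 2);
  if (blocking.length === 0) return { action: "stop" };
  return {
    action: "continue",
    userMessage: blocking.map((r) => r.stderr.trim()).join("\n"),
  };
}

const allPassed = decideAfterStopHooks([{ exitCode: 0, stderr: "" }]);
const testsFailed = decideAfterStopHooks([
  { exitCode: 0, stderr: "" },
  { exitCode: 2, stderr: "3 tests failed in auth.test.ts\n" },
]);
```

In the CI/CD use case, testsFailed.userMessage is what the agent sees on the next turn, so a hook should write an actionable failure summary to stderr.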

8. Complete Data Flow Diagram

User Input
  │
  ▼
┌──────────────────────────────────────────────────────────┐
│                    query() AsyncGenerator                 │
│                                                          │
│  ┌─────────────────────────────────────────────────────┐ │
│  │                while (true) {                        │ │
│  │                                                     │ │
│  │  ┌───────────┐    ┌──────────────┐                  │ │
│  │  │ callModel │───▶│ Stream Events│──▶ yield to UI   │ │
│  │  │ (API)     │    │ (text,tools) │                  │ │
│  │  └───────────┘    └──────┬───────┘                  │ │
│  │                          │                          │ │
│  │                   ┌──────▼───────┐                  │ │
│  │                   │ StreamExec   │                  │ │
│  │                   │ (concurrent) │                  │ │
│  │                   └──────┬───────┘                  │ │
│  │                          │                          │ │
│  │              ┌───────────┼───────────┐              │ │
│  │              ▼           ▼           ▼              │ │
│  │         Permission   tool.call()  Hooks            │ │
│  │         Check                     Pre/Post         │ │
│  │              │           │           │              │ │
│  │              └───────────┼───────────┘              │ │
│  │                          │                          │ │
│  │                   ┌──────▼───────┐                  │ │
│  │                   │ Tool Results │                  │ │
│  │                   └──────┬───────┘                  │ │
│  │                          │                          │ │
│  │                   ┌──────▼───────┐                  │ │
│  │                   │ Recovery?    │                  │ │
│  │                   │ Stop Hook?   │                  │ │
│  │                   │ Budget?      │                  │ │
│  │                   └──────┬───────┘                  │ │
│  │                          │                          │ │
│  │                   Continue or Terminal               │ │
│  │                                                     │ │
│  │  }  // end while                                    │ │
│  └─────────────────────────────────────────────────────┘ │
│                                                          │
│  return Terminal                                         │
└──────────────────────────────────────────────────────────┘
  │
  ▼
Final Response → UI / SDK

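9. query/config.ts: Query Configuration Assembly

Before each API call, the full request configuration has to be assembled:

System prompt = base system prompt
  + CLAUDE.md content (user/project/local/managed)
  + Git status
  + current date
  + list of available tools
  + MCP server instructions
  + custom appended content (appendSystemPrompt)

Messages = user message history
  + tool results
  + attachments (memory, Skill, file changes)
  + user context (prepended)

Tools = getAllBaseTools()
  + MCP tools
  + Skill tools
  + conditional tools (feature gates)
  - tools disabled by deny rules
  - deferred-discovery tools (only their names are sent)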
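The tool arithmetic above (add sources, subtract denied and deferred tools) can be sketched as a filter pipeline. All names here (ToolDef, ToolPoolInput, assembleToolPool, the gate and deferred fields) are hypothetical; the real query/config.ts may structure this differently.

```typescript
type ToolDef = { name: string; gate?: string; deferred?: boolean };

type ToolPoolInput = {
  base: ToolDef[];
  mcp: ToolDef[];
  skill: ToolDef[];
  enabledGates: Set<string>; // feature gates currently switched on
  denied: Set<string>;       // tool names disabled by deny rules
};

type ToolPool = { full: ToolDef[]; namesOnly: string[] };

function assembleToolPool(input: ToolPoolInput): ToolPool {
  const candidates = [...input.base, ...input.mcp, ...input.skill]
    .filter((t) => t.gate === undefined || input.enabledGates.has(t.gate))
    .filter((t) => !input.denied.has(t.name));
  return {
    // Sent with their full definitions.
    full: candidates.filter((t) => !t.deferred),
    // Deferred tools: only the names are sent.
    namesOnly: candidates.filter((t) => t.deferred).map((t) => t.name),
  };
}

const toolPool = assembleToolPool({
  base: [{ name: "Read" }, { name: "Bash" }, { name: "Experimental", gate: "exp_tools" }],
  mcp: [{ name: "mcp__github__search", deferred: true }],
  skill: [{ name: "Skill" }],
  enabledGates: new Set<string>(),
  denied: new Set(["Bash"]),
});
```

With no gates enabled and Bash denied, the pool keeps Read and Skill in full and advertises the MCP tool by name only.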

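Agent Engineering Takeaways

AsyncGenerator is the best paradigm for an agent loop

Prefer it over callbacks or a single Promise. An AsyncGenerator lets the loop:

Stream intermediate results (yield)
Return a final state (return)
Be interrupted (through an AbortController)
Be composed (yield* delegates to a sub-generator)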
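Interruption in particular is easy to show. In this minimal sketch, abortableLoop is a hypothetical generator that checks an AbortSignal between turns; AbortController is the standard Web/Node API.

```typescript
// Hypothetical loop: yields the turn number, stops early when aborted.
async function* abortableLoop(
  signal: AbortSignal,
): AsyncGenerator<number, "aborted" | "completed"> {
  for (let turn = 1; turn <= 5; turn++) {
    if (signal.aborted) return "aborted";
    yield turn;
  }
  return "completed";
}

async function demoAbort(): Promise<{ firstTurn: number | undefined; outcome: string | undefined }> {
  const controller = new AbortController();
  const loop = abortableLoop(controller.signal);
  const first = await loop.next();   // runs turn 1
  controller.abort();                // e.g. the user pressed Ctrl+C
  const second = await loop.next();  // the loop sees the signal and returns
  return {
    firstTurn: first.done ? undefined : first.value,
    outcome: second.done ? second.value : undefined,
  };
}
```

Because the generator only advances when the caller asks for the next value, the abort check sits at a well-defined point between turns rather than in scattered callbacks.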

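Explicit state transitions > implicit continue

Record transition.reason every time the loop continues. The benefits:

Code readability: every continue has a meaning
Telemetry precision: the frequency of each recovery path can be counted
Debugging efficiency: when something goes wrong, the chain of transitions can be traced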

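Streaming execution is the core of agent latency optimization

Traditional: API → wait for completion → run tools → wait for completion → API
Streaming: API streams output → tools run as their blocks arrive → tools finish → next API call starts immediately

This is one of the main reasons Claude Code feels fast.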

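Layered error recovery instead of a single point of failure

413 → collapse drain → reactive compact → surface error
max_tokens → escalate → multi-turn recovery
model error → fallback model → retry

Each recovery layer is a low-cost attempt, and the next layer is triggered only when the previous one fails.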
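The 413 chain can be sketched as a function that picks the cheapest strategy not yet exhausted. RecoveryState, RecoveryOutcome, recoverFrom413, and the pendingCollapses counter are assumptions for illustration; only the reason strings and the hasAttemptedReactiveCompact flag come from the loop state described earlier.

```typescript
type RecoveryState = {
  pendingCollapses: number;             // hypothetical: collapses that can still be drained
  hasAttemptedReactiveCompact: boolean; // mirrors the flag in the loop state
};

type RecoveryOutcome =
  | { kind: "continue"; reason: "collapse_drain_retry" | "reactive_compact_retry"; state: RecoveryState }
  | { kind: "surface_error" };

function recoverFrom413(state: RecoveryState): RecoveryOutcome {
  // Layer 1: cheapest, drain already-prepared collapses and retry.
  if (state.pendingCollapses > 0) {
    return {
      kind: "continue",
      reason: "collapse_drain_retry",
      state: { ...state, pendingCollapses: 0 },
    };
  }
  // Layer 2: compact the conversation once, then retry.
  if (!state.hasAttemptedReactiveCompact) {
    return {
      kind: "continue",
      reason: "reactive_compact_retry",
      state: { ...state, hasAttemptedReactiveCompact: true },
    };
  }
  // Layer 3: nothing left to try, show the error to the user.
  return { kind: "surface_error" };
}

// Three consecutive 413 errors walk down the chain.
const firstAttempt = recoverFrom413({ pendingCollapses: 2, hasAttemptedReactiveCompact: false });
const secondAttempt = firstAttempt.kind === "continue" ? recoverFrom413(firstAttempt.state) : firstAttempt;
const thirdAttempt = secondAttempt.kind === "continue" ? recoverFrom413(secondAttempt.state) : secondAttempt;
```

Keeping the "already tried" flags in the loop state is what prevents a retry loop: each layer can fire at most once before the next one takes over.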

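Interaction with Other Layers

→ Tool system (L2): StreamingToolExecutor is a core component of the loop and handles tool_use blocks
→ Permission system (L3): permission checks run inside runToolUse and may lead to user interaction or an automatic decision
→ Hook system (L4): Stop hooks can keep the loop going; SessionStart hooks fire after compaction
→ Context management (L5): auto-compact and the token budget are exit/recovery conditions of the loop
→ Sub-agent (L6): a sub-agent runs its own independent query loop and shares part of the context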