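Layer 4: Hook & MCP Extension System (An Extensible Agent Architecture)

Core Question

How can an agent framework keep its core small while still letting users and external systems inject custom behavior?

Claude Code provides two orthogonal extension mechanisms:

Hook system: event-driven behavior injection (similar to Git hooks / webpack plugins)
MCP system: a standardized tool/resource discovery protocol (analogous to what LSP is for IDEs)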

Part A: The Hook System

1. Architecture Overview

                    ┌──────────────────────┐
                    │   Agent Runtime      │
                    │   (Query Engine)     │
                    └──────┬───────────────┘
                           │ fires HookEvent
                    ┌──────▼───────────────┐
                    │   Hook Dispatcher    │
                    │   hooksSettings.ts   │
                    └──────┬───────────────┘
                           │
              ┌────────────┼────────────────┐
              ▼            ▼                ▼
       ┌──────────┐ ┌──────────┐    ┌──────────┐
       │ Command  │ │  Query   │    │   HTTP   │
       │  Hook    │ │  Hook    │    │  Hook    │
       │ (shell)  │ │ (Claude) │    │ (webhook)│
       └──────────┘ └──────────┘    └──────────┘
              │            │                │
              ▼            ▼                ▼
       exit code 0    JSON output      HTTP response
       exit code 2 → BLOCK the operation
       other → warning

2. The 26 Hook Events

Hook events cover every key point in the agent lifecycle:

// hooksConfigManager.ts:27-266 — getHookEventMetadata()

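Category	Event	Fires when	Can block?
Tool execution	`PreToolUse`	Before a tool runs	✅ exit 2
	`PostToolUse`	After a tool runs	✅ exit 2
	`PostToolUseFailure`	A tool run fails	❌
Permissions	`PermissionRequest`	A permission dialog is displayed	✅ can return allow/deny
	`PermissionDenied`	After the classifier denies a request	Can request a retry
User input	`UserPromptSubmit`	The user submits a prompt	✅ exit 2
	`Notification`	A notification is sent	❌
Session	`SessionStart`	A new session starts	❌
	`SessionEnd`	A session ends	❌
	`Stop`	Claude is about to finish its reply	✅ exit 2 continues the conversation
	`StopFailure`	An API error terminates the turn	❌
Agent	`SubagentStart`	A sub-agent starts	❌
	`SubagentStop`	A sub-agent is about to finish	✅ exit 2 continues
	`TeammateIdle`	A teammate is about to go idle	✅ exit 2 prevents idling
Tasks	`TaskCreated`	A task is created	✅ exit 2 blocks
	`TaskCompleted`	A task completes	✅ exit 2 blocks
Compaction	`PreCompact`	Before conversation compaction	✅ exit 2 blocks
	`PostCompact`	After conversation compaction	❌
MCP	`Elicitation`	An MCP server requests user input	✅ exit 2 declines
	`ElicitationResult`	The user replies to the MCP server	✅ can modify the reply
Configuration	`ConfigChange`	A configuration file changes	✅ exit 2 blocks
	`InstructionsLoaded`	An instructions file is loaded	❌ observe only
	`CwdChanged`	The working directory changes	❌
	`FileChanged`	A watched file changes	❌
VCS	`WorktreeCreate`	A worktree is created	Returns the path
	`WorktreeRemove`	A worktree is removed	❌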

3. The Three Hook Execution Types

// Hook configuration shape
type HookConfig = {
  event: HookEvent             // one of the hook events listed in section 2
  matcher?: string             // optional field matcher (e.g. tool name, agent type)
  config: {
    type: 'command' | 'query' | 'http'
    command?: string           // shell command
    timeout?: number           // timeout
  }
  source: string               // origin (userSettings, pluginHook, etc.)
}

Command Hook (the most common):

{
  "hooks": [{
    "event": "PreToolUse",
    "matcher": "Bash",
    "config": { "type": "command", "command": "python validate_command.py" }
  }]
}

Receives its input as JSON on stdin
The exit code determines the outcome: 0 = pass, 2 = block, anything else = warning
Can return JSON on stdout to inject additional context
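The exit-code protocol above can be sketched as a small pure function. This is a minimal illustration, not the dispatcher's actual code: `HookOutcome` and `interpretHookExit` are hypothetical names, and ignoring non-JSON stdout on exit 0 is an assumption of this sketch.

```typescript
// Hypothetical types and names, for illustration only.
type HookOutcome =
  | { kind: "pass"; context?: unknown }
  | { kind: "block"; message: string }
  | { kind: "warn"; message: string };

function interpretHookExit(exitCode: number, stdout: string, stderr: string): HookOutcome {
  if (exitCode === 0) {
    // stdout is optional; treat it as extra context only if it parses as JSON.
    const trimmed = stdout.trim();
    if (trimmed === "") return { kind: "pass" };
    try {
      return { kind: "pass", context: JSON.parse(trimmed) };
    } catch {
      return { kind: "pass" }; // assumption: non-JSON stdout is ignored
    }
  }
  if (exitCode === 2) {
    // Exit 2 blocks the operation; stderr carries the reason.
    return { kind: "block", message: stderr.trim() };
  }
  // Any other exit code is a warning and does not block.
  return { kind: "warn", message: stderr.trim() || `hook exited with code ${exitCode}` };
}
```

Because the contract is only stdin, stdout, stderr, and an exit code, the hook itself can be written in any language.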

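Query Hook:

Calls the Claude API to make the decision
Suited to cases that need AI judgment

HTTP Hook:

Sends a webhook to an external service
Suited to enterprise audit and compliance scenarios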

4. Hook Execution Flow (Pre/Post Tool Hooks as an Example)

The agent wants to run BashTool("git push")
  │
  ├── 1. runPreToolUseHooks()
  │     ├── Find matching hooks (event=PreToolUse, matcher=Bash)
  │     ├── Sort by priority (sortMatchersByPriority)
  │     ├── Run each hook command in turn
  │     │     ├── stdin: { tool_name, tool_input, tool_use_id }
  │     │     ├── exit 0 → pass, possibly with additionalContext
  │     │     ├── exit 2 → BLOCK, with stderr as the blocking message
  │     │     └── other  → warning, execution continues
  │     └── If any hook blocks → return blocking_error
  │
  ├── 2. Permission check (hasPermissionsToUseTool)
  │
  ├── 3. Tool execution (tool.call())
  │
  └── 4. runPostToolUseHooks()
        ├── stdin: { inputs, response }
        ├── Can modify MCP tool output (updatedMCPToolOutput)
        ├── Can inject additional context (additionalContexts)
        └── Can prevent continuation (preventContinuation)

★ Design insight: A PreToolUse hook can modify tool input (updatedInput), and a PostToolUse hook can modify MCP tool output (updatedMCPToolOutput). Hooks are therefore not just observation points; together they form a full request/response interceptor.
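The pre-tool phase of this flow can be sketched as a loop over hook results. This is an illustration under stated assumptions, not the real `runPreToolUseHooks()`: hooks are modeled as plain functions, the chain stops at the first block, and a later hook sees the input as rewritten by an earlier one. All type and function names here are hypothetical.

```typescript
// Hypothetical model: each hook is a function instead of a spawned command.
type ToolInput = Record<string, unknown>;
type PreHookResult =
  | { kind: "pass"; updatedInput?: ToolInput; additionalContext?: string }
  | { kind: "block"; message: string }
  | { kind: "warn"; message: string };
type PreHook = (toolName: string, input: ToolInput) => PreHookResult;

interface PreHookSummary {
  blocked: boolean;
  blockingError?: string;
  input: ToolInput;      // possibly rewritten by hooks
  contexts: string[];    // additional context collected from passing hooks
  warnings: string[];
}

function runPreHooks(toolName: string, input: ToolInput, hooks: PreHook[]): PreHookSummary {
  const summary: PreHookSummary = { blocked: false, input, contexts: [], warnings: [] };
  for (const hook of hooks) {
    const result = hook(toolName, summary.input);
    if (result.kind === "block") {
      // Assumption: the first blocking hook ends the chain.
      return { ...summary, blocked: true, blockingError: result.message };
    }
    if (result.kind === "warn") {
      summary.warnings.push(result.message);
      continue;
    }
    if (result.updatedInput) summary.input = result.updatedInput;
    if (result.additionalContext) summary.contexts.push(result.additionalContext);
  }
  return summary;
}
```

Only when `blocked` is false would the caller move on to the permission check and `tool.call()` with `summary.input`.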

5. The Matcher Mechanism

Every hook event can be filtered through the matcher field:

// hooksConfigManager.ts:29-35
PreToolUse: {
  matcherMetadata: {
    fieldToMatch: 'tool_name',
    values: toolNames,  // all available tool names
  },
}

PreToolUse / PostToolUse: match on tool_name
SubagentStart / SubagentStop: match on agent_type
SessionStart: match on source (startup/resume/clear/compact)
Notification: match on notification_type
ConfigChange: match on the configuration source
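As a rough sketch of how a matcher might be evaluated against the per-event field listed above: the field table comes from the list, but the matching semantics (an empty matcher or `*` matches everything, a trailing `*` is a prefix wildcard, anything else is an exact match) are assumptions of this example, and `matchesHook` is a hypothetical name.

```typescript
// Field to compare against, per event (taken from the list above).
const FIELD_BY_EVENT: Record<string, string> = {
  PreToolUse: "tool_name",
  PostToolUse: "tool_name",
  SubagentStart: "agent_type",
  SubagentStop: "agent_type",
  SessionStart: "source",
  Notification: "notification_type",
};

// Assumed semantics: no matcher or "*" matches all; "prefix*" matches by prefix;
// otherwise the matcher must equal the field value exactly.
function matchesHook(
  event: string,
  matcher: string | undefined,
  payload: Record<string, string>,
): boolean {
  if (matcher === undefined || matcher === "" || matcher === "*") return true;
  const field = FIELD_BY_EVENT[event];
  if (field === undefined) return false; // event has no matchable field in this sketch
  const value = payload[field];
  if (value === undefined) return false;
  if (matcher.endsWith("*")) return value.startsWith(matcher.slice(0, -1));
  return value === matcher;
}
```

For example, `matchesHook("PreToolUse", "Bash", { tool_name: "Bash" })` is true, while the same matcher against `{ tool_name: "Grep" }` is false.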

6. Deep Integration Between Hooks and the Permission System

The most important interaction is the PermissionRequest hook:

// hooksConfigManager.ts:163-171
PermissionRequest: {
  summary: 'When a permission dialog is displayed',
  description: 'Output JSON with hookSpecificOutput containing decision to allow or deny.',
}

This lets enterprise users build an automated permission-approval pipeline:

{
  "hooks": [{
    "event": "PermissionRequest",
    "matcher": "Bash",
    "config": { "type": "command", "command": "python auto_approve_policy.py" }
  }]
}

auto_approve_policy.py can return an allow/deny decision automatically, based on the company's security policy, with no human in the loop.
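The decision logic of such a policy script might look like the sketch below. The source only states that the hook outputs JSON with a `hookSpecificOutput` containing an allow or deny decision; the exact field names used here (`hookEventName`, `decision.behavior`, `message`), the input shape, and the example rules are assumptions for illustration.

```typescript
// Assumed input and output shapes; only `hookSpecificOutput` with an
// allow/deny decision is stated in the source.
interface PermissionRequestInput {
  tool_name: string;
  tool_input: { command?: string };
}

type Decision =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string };

interface PermissionHookOutput {
  hookSpecificOutput: { hookEventName: "PermissionRequest"; decision: Decision };
}

// Hypothetical example policy.
const ALLOWED_PREFIXES = ["git status", "git diff", "npm test"];
const DENIED_PATTERNS = [/\brm\s+-rf\b/, /\bgit\s+push\b.*--force\b/];

// Returns a decision, or null to fall through to the normal permission dialog.
function decide(request: PermissionRequestInput): PermissionHookOutput | null {
  const command = (request.tool_input.command ?? "").trim();
  const wrap = (decision: Decision): PermissionHookOutput => ({
    hookSpecificOutput: { hookEventName: "PermissionRequest", decision },
  });
  // Deny rules are checked first so they win over the allowlist.
  if (DENIED_PATTERNS.some((pattern) => pattern.test(command))) {
    return wrap({ behavior: "deny", message: "Command violates security policy" });
  }
  if (ALLOWED_PREFIXES.some((p) => command === p || command.startsWith(p + " "))) {
    return wrap({ behavior: "allow" });
  }
  return null;
}
```

A command hook would read the request from stdin and print `JSON.stringify(decide(request))` to stdout. Prefix allowlists are easy to bypass with shell chaining (`git status && ...`), so a production policy would need to parse the command rather than compare strings.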

Part B: The MCP (Model Context Protocol) System

1. Architecture Overview

           ┌──────────────────────────┐
           │     Claude Code CLI      │
           │                          │
           │  ┌────────────────────┐  │
           │  │  MCP Client Pool   │  │
           │  │  (memoized cache)  │  │
           │  └─────┬──────────────┘  │
           └────────┼─────────────────┘
                    │
       ┌────────────┼────────────┬───────────┐
       ▼            ▼            ▼           ▼
   ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐
   │ stdio  │  │  HTTP  │  │  SSE   │  │   WS   │
   │ local  │  │ stream │  │ event  │  │  Web   │
   │process │  │  HTTP  │  │ stream │  │ Socket │
   └────────┘  └────────┘  └────────┘  └────────┘
       │            │            │           │
       ▼            ▼            ▼           ▼
   ┌────────────────────────────────────────────┐
   │           MCP Server (external)            │
   │  tools/list → tool discovery               │
   │  tools/call → tool invocation              │
   │  resources/list → resource listing         │
   │  resources/read → resource read            │
   │  prompts/list → prompt templates           │
   └────────────────────────────────────────────┘

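2. Transport Support

Claude Code supports eight MCP transport types:

Type	Scenario	Implementation
`stdio`	Local command-line tools	`StdioClientTransport`
`http`	Remote services (recommended)	`StreamableHTTPClientTransport`
`sse`	Remote services (legacy)	`SSEClientTransport`
`ws`	WebSocket connections	Custom `WebSocketTransport`
`sse-ide`	IDE extensions	SSE + authentication
`ws-ide`	IDE extensions	WS + authentication
`sdk`	In-process SDK	`SdkControlClientTransport`
`claudeai-proxy`	claude.ai proxy	Proxy transport

★ Design insight: Supporting eight transports reflects MCP's "connect everything" philosophy. stdio suits local development tools, HTTP/SSE/WS suit remote services, and SDK suits deep in-process integration. Notably, ws supports TLS/mTLS (mutual certificate authentication), which points to enterprise security scenarios as a target.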

3. Connection Lifecycle Management

// client.ts:595+, connectToServer() (memoized; abridged)
export const connectToServer = memoize(async (name, serverRef) => {
  // 1. Check the auth cache (15-minute TTL)
  if (await isMcpAuthCached(serverId)) {
    return { type: 'needs-auth', ... }  // skip servers already known to need auth
  }

  // 2. Detect the transport type → create a Transport instance
  let transport: Transport
  if (serverRef.type === 'sse') {
    transport = new SSEClientTransport(url, { authProvider, ... })
  } else if (serverRef.type === 'ws') {
    transport = new WebSocketTransport(wsClient)
  } else if (serverRef.type === 'http') {
    transport = new StreamableHTTPClientTransport(url, { authProvider, ... })
  } else {  // stdio (default)
    transport = new StdioClientTransport({ command, args, env })
  }

  // 3. Create the MCP Client and connect (guarded by a timeout)
  const client = new Client({ name: 'claude-code', ... })
  await Promise.race([client.connect(transport), timeoutPromise])

  // 4. Return the connection state
  return { type: 'connected', client, capabilities: client.capabilities }
})

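Key engineering decisions:

Auth cache (client.ts:257):

const MCP_AUTH_CACHE_TTL_MS = 15 * 60 * 1000 // 15 minutes

If an MCP server returns 401, that state is cached for 15 minutes
This avoids probing servers already known to require authentication on every startup
Cache writes are serialized through a promise chain to prevent concurrent races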
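The combination of a TTL check and promise-chain serialization can be sketched as follows. This is an in-memory illustration under assumptions, not the code in client.ts: `NeedsAuthCache`, `AuthSnapshot`, and the injected `persist` callback are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```typescript
const AUTH_CACHE_TTL_MS = 15 * 60 * 1000; // 15 minutes, matching the constant above

type AuthSnapshot = Record<string, number>;

class NeedsAuthCache {
  private entries = new Map<string, number>(); // server id -> time the 401 was seen
  private writeChain: Promise<void> = Promise.resolve();
  private persist: (snapshot: AuthSnapshot) => Promise<void>;

  // `persist` is a hypothetical storage callback (e.g. writing a JSON file).
  constructor(persist: (snapshot: AuthSnapshot) => Promise<void>) {
    this.persist = persist;
  }

  // True while the recorded 401 is younger than the TTL.
  isCached(serverId: string, now: number): boolean {
    const seenAt = this.entries.get(serverId);
    return seenAt !== undefined && now - seenAt < AUTH_CACHE_TTL_MS;
  }

  markNeedsAuth(serverId: string, now: number): Promise<void> {
    this.entries.set(serverId, now);
    const write = (): Promise<void> => {
      const snapshot: AuthSnapshot = {};
      this.entries.forEach((seenAt, id) => {
        snapshot[id] = seenAt;
      });
      return this.persist(snapshot);
    };
    // Each write starts only after the previous one settles (success or failure),
    // so two persisted snapshots can never interleave.
    this.writeChain = this.writeChain.then(write, write);
    return this.writeChain;
  }
}
```

Chaining on both the fulfilled and rejected branches keeps one failed write from stalling every later write.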

Connection timeout guard:

// Promise.race pattern
await Promise.race([connectPromise, timeoutPromise])
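A self-contained version of the race pattern is sketched below. `withTimeout` is a hypothetical helper, not a function from the codebase. One detail the one-liner hides is that the timer should be cleared once the race settles, otherwise it keeps the event loop alive until it fires.

```typescript
// Rejects with a timeout error if `work` does not settle within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  try {
    // Whichever promise settles first decides the result.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it cannot fire (or linger) after the race is over.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing promise: a connection attempt that loses the race continues in the background unless the transport is closed explicitly.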

Session expiry handling:

// client.ts:193-206 (abridged)
function isMcpSessionExpiredError(error: Error): boolean {
  // HTTP 404 + JSON-RPC -32001 = session expired
  return httpStatus === 404 && error.message.includes('"code":-32001')
}

4. Tool Discovery and Wrapping

An MCP server's tools are discovered through the tools/list RPC and then wrapped as standard Claude Code Tool objects:

// client.ts:1743+, fetchToolsForClient() (memoized with LRU; abridged)
export const fetchToolsForClient = memoizeWithLRU(async (client) => {
  const result = await client.client.request(
    { method: 'tools/list' },
    ListToolsResultSchema,
  )

  return result.tools.map(tool => ({
    ...MCPTool,  // spread the base MCPTool definition
    name: `mcp__${serverName}__${toolName}`,  // fully qualified name
    mcpInfo: { serverName, toolName },         // original names
    isMcp: true,

    // Key method overrides
    async call(args, context) {
      const connectedClient = await ensureConnectedClient(client)
      return callMCPTool(connectedClient, tool.name, args)
    },

    // Use MCP annotations to flag safety properties
    isConcurrencySafe() { return tool.annotations?.readOnlyHint ?? false },
    isReadOnly() { return tool.annotations?.readOnlyHint ?? false },
    isDestructive() { return tool.annotations?.destructiveHint ?? false },

    // The input schema is passed through as-is (JSON Schema, not Zod)
    inputJSONSchema: tool.inputSchema,
  }))
}, serverName => serverName, 20)  // LRU cache holding 20 servers

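Naming convention:

Built-in tools: Bash, FileRead, Grep, ...
MCP tools: mcp__server-name__tool-name

This naming convention lets the permission system match at several levels:

mcp__server1 → matches every tool on that server
mcp__server1__* → wildcard match
mcp__server1__specific-tool → exact match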
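The three matching levels can be sketched with a pair of small helpers. The function names are hypothetical, and the parser assumes that server names never contain a double underscore, which the source does not state.

```typescript
// Builds the fully qualified name used for MCP tools.
function qualifiedName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

// Splits a qualified name back into its parts; returns null for built-in tools.
// Assumption: the server name itself contains no "__".
function parseQualifiedName(name: string): { serverName: string; toolName: string } | null {
  if (!name.startsWith("mcp__")) return null;
  const rest = name.slice("mcp__".length);
  const sep = rest.indexOf("__");
  if (sep <= 0 || sep + 2 >= rest.length) return null;
  return { serverName: rest.slice(0, sep), toolName: rest.slice(sep + 2) };
}

// Checks a permission rule against a tool name at the three levels listed above.
function ruleMatches(rule: string, toolName: string): boolean {
  const parsed = parseQualifiedName(toolName);
  if (parsed === null) return rule === toolName; // built-in tools: exact match only
  const serverPrefix = `mcp__${parsed.serverName}`;
  if (rule === serverPrefix) return true;         // server-level rule
  if (rule === `${serverPrefix}__*`) return true; // wildcard rule
  return rule === toolName;                       // exact rule
}
```

With these helpers, `ruleMatches("mcp__github", qualifiedName("github", "create_issue"))` is true, while a rule for `mcp__gitlab` does not match the same tool.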

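5. Elicitation (Requests for User Interaction)

An MCP server can request user input while a tool is running (for example, to confirm an OAuth authorization):

MCP tool.call()
  → the server returns an ElicitRequest (JSON-RPC -32042)
    → the Elicitation hook fires
      → the user sees an interactive dialog in the terminal
        → the user replies
          → the ElicitationResult hook fires (and can modify the reply)
            → the reply is sent back to the MCP server
              → the tool continues running

★ Design insight: Elicitation is one of the most complex parts of the MCP protocol. It lets an MCP tool "pause" mid-execution and ask for more information, similar to step-up authentication in OAuth. The two related hooks (Elicitation + ElicitationResult) let enterprise users handle such requests automatically.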

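6. Concurrent Connection Management

// client.ts, getMcpToolsCommandsAndResources()
// Local and remote servers get different concurrency limits
const localConcurrency = 2    // stdio local processes
const remoteConcurrency = 10  // http/sse/ws remote services

await pMap(localServers, connectAndFetch, { concurrency: localConcurrency })
await pMap(remoteServers, connectAndFetch, { concurrency: remoteConcurrency })

★ Design insight: Concurrency for local stdio servers is capped at 2 because every stdio connection forks a child process, and too many child processes degrade system performance. Remote HTTP/WS connections can run at higher concurrency because they are lightweight network connections.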
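To make the concurrency cap concrete, here is a minimal limiter in the spirit of `pMap`. It is a stand-in written for this example, not the library's implementation, and `mapWithConcurrency` is a hypothetical name.

```typescript
// Runs `worker` over `items` with at most `concurrency` calls in flight,
// returning results in input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane repeatedly claims the next item until none are left.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) lanes.push(runLane());
  await Promise.all(lanes);
  return results;
}
```

Calling it once with a limit of 2 for local servers and once with a limit of 10 for remote servers would mirror the split shown in the excerpt.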

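Part C: How Hooks and MCP Work Together

The two mechanisms are orthogonal in design but not isolated in practice; they cooperate closely:

1. Hook Interception of MCP Tools

PreToolUse hook (matcher: "mcp__github__create_issue")
  → can intercept calls to a specific MCP tool
  → can modify the input arguments
  → can block the call

PostToolUse hook (matcher: "mcp__github__*")
  → can modify the MCP tool's output (updatedMCPToolOutput)
  → can inject additional context

2. Automating Elicitation with Hooks

An MCP server requests OAuth authorization
  → the Elicitation hook checks the token vault automatically
    → valid token found → reply accept automatically
    → no token → pass the request through to the user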
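The decision in that flow can be sketched as a pure function. Everything here is hypothetical: the token vault is modeled as an in-memory `Map`, the request shape is invented for the example, and returning `null` stands for "pass the request through to the user".

```typescript
// Hypothetical shapes for illustration.
interface StoredToken {
  accessToken: string;
  expiresAt: number; // epoch milliseconds
}

interface ElicitationRequest {
  serverName: string;
  message: string;
}

type AutoReply = { action: "accept" } | null;

// Accept automatically only when the vault holds an unexpired token for the server.
function autoAnswerElicitation(
  request: ElicitationRequest,
  vault: Map<string, StoredToken>,
  now: number,
): AutoReply {
  const token = vault.get(request.serverName);
  if (token === undefined || token.expiresAt <= now) {
    return null; // no usable token: let the user see the dialog
  }
  return { action: "accept" };
}
```

Keeping the clock as a parameter makes the expiry rule easy to test without waiting for real time to pass.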

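3. PermissionRequest Control over MCP Tools

MCP tool mcp__deploy__push_to_prod
  → the permission check fires the PermissionRequest hook
    → the hook script checks whether a release window is currently open
      → inside the window → allow
      → outside the window → deny + "Not in release window"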

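Agent Engineering Takeaways

Event-driven hooks + interceptors = an effective combination for agent extensibility

The Hook system extends behavior (intercept, modify, observe)
The MCP system extends capability (new tools, new resources)
Combined orthogonally, the two cover almost every customization need

The exit-code protocol is lightweight cross-language communication

No complex RPC framework is needed; exit codes 0/2/other are enough:

0 = success, with optional stdout as data
2 = block the operation
other = warn without blocking

MCP tools should declare their own safety properties

Through annotations (readOnlyHint, destructiveHint, openWorldHint), an MCP tool can tell the agent framework about its safety characteristics, so the permission system can make better-informed decisions.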
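One way a permission layer might consume these hints is sketched below. The three-level policy and the name `reviewLevel` are hypothetical; the defaults follow the `?? false` fallbacks shown in the tool-wrapping excerpt rather than any other convention.

```typescript
// The three hints named above; all are optional and self-declared by the server.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  openWorldHint?: boolean;
}

type ReviewLevel = "auto-allow" | "ask" | "ask-with-warning";

// Hypothetical policy: destructive tools get a warning, closed-world read-only
// tools run without a prompt, and everything else asks the user.
function reviewLevel(annotations: ToolAnnotations | undefined): ReviewLevel {
  const readOnly = annotations?.readOnlyHint ?? false;
  const destructive = annotations?.destructiveHint ?? false;
  const openWorld = annotations?.openWorldHint ?? false;
  if (destructive) return "ask-with-warning";
  if (readOnly && !openWorld) return "auto-allow";
  return "ask";
}
```

Since the hints come from the server itself, a policy like this is only as trustworthy as the server that supplies them.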

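A connection pool + auth cache is essential for an MCP client

Memoized connections avoid reconnecting
Caching auth failures for 15 minutes avoids pointless probing
An LRU cache of tool lists avoids repeated discovery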
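For the last point, a small LRU cache can be built on the insertion order of `Map`. This is a generic sketch, not the `memoizeWithLRU` helper referenced earlier; `LruCache` is a hypothetical name.

```typescript
// Least-recently-used cache: Map iterates in insertion order, so the first key
// is always the least recently used one.
class LruCache<K, V> {
  private map = new Map<K, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert to mark the key as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Evict the least recently used entry (the first key in iteration order).
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }

  get size(): number {
    return this.map.size;
  }
}
```

A cache keyed by server name with a capacity of 20 would correspond to the limit shown in the tool-discovery excerpt.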

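Interactions with Other Layers

→ Permission system (L3): the PermissionRequest hook can automate permission decisions
→ Tool system (L2): MCP tools are wrapped in the standard Tool interface, and Pre/PostToolUse hooks intercept tool execution
→ Context management (L5): PreCompact/PostCompact hooks control conversation compaction
→ Agent Loop (L1): the Stop hook can keep the conversation going, and SessionStart/End manage the lifecycle
→ Sub-agent (L6): SubagentStart/Stop/TeammateIdle hooks control multi-agent collaboration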