Chapter 4: The Message System -- From UserMessage to ToolResult

"Data structures determine the structure of a program." -- Rob Pike

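4.1 Messages Are First-Class Citizens

In Pi's architecture, the message (Message) is the most important data structure. The whole system runs on producing, routing, and consuming messages.

If you have used a chat application, you already understand the basics of the message model: the user sends one message, the AI sends one back. An Agent's message flow is considerably more complex than chat, though: it is a multi-turn tool-calling loop.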

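4.2 Message Type Hierarchy

Message (union type)
├── UserMessage          user input
├── AssistantMessage     AI reply (may contain tool calls)
└── ToolResultMessage    tool execution result

Every message carries a role field that serves as the discriminant tag, plus a timestamp field used to order messages when they are persisted.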
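To illustrate why the role tag matters, here is a minimal sketch of narrowing on it. The message shapes are trimmed to a few fields for brevity, and describeMessage and sortByTimestamp are hypothetical helpers written for this example, not part of Pi.

```typescript
// Trimmed-down message shapes: only the fields needed for this illustration.
interface UserMsg { role: "user"; content: string; timestamp: number }
interface AssistantMsg { role: "assistant"; content: string; timestamp: number }
interface ToolResultMsg { role: "toolResult"; toolCallId: string; content: string; timestamp: number }

type Msg = UserMsg | AssistantMsg | ToolResultMsg;

// Hypothetical helper: switching on `role` narrows `m` to one variant per branch.
function describeMessage(m: Msg): string {
    switch (m.role) {
        case "user":
            return `user: ${m.content}`;
        case "assistant":
            return `assistant: ${m.content}`;
        case "toolResult":
            // `toolCallId` is only accessible here because `m` is a ToolResultMsg.
            return `tool result for ${m.toolCallId}`;
    }
}

// Hypothetical helper: sorting by timestamp restores conversation order
// after messages are loaded from storage.
function sortByTimestamp(msgs: Msg[]): Msg[] {
    return [...msgs].sort((a, b) => a.timestamp - b.timestamp);
}
```

Because the switch covers every role, the compiler can check that no variant is forgotten when a new message type is added to the union.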

UserMessage -- User Message

typescript

interface UserMessage {
    role: "user";
    content: string | (TextContent | ImageContent)[];
    timestamp: number;
}

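content can be a plain string or an array of rich content parts (text and images mixed).

★ Insight ─────────────────────────────────────
Why does content come in two forms? The plain string is there for convenience, since most of the time the user just types text. But LLMs accept multimodal input (text plus images), so the array form is needed as well. This "convenient form + complete form" dual-track design is common in API design, similar to Java offering List.of(element) alongside List.of(e1, e2, e3).
─────────────────────────────────────────────────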
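Code that consumes a UserMessage usually wants a single shape to work with. The sketch below shows one way to normalize both forms into the array form. toContentArray is a hypothetical helper, and the ImageContent fields (data, mimeType) are assumptions made for this example, since the chapter does not list them.

```typescript
interface TextContent { type: "text"; text: string }
// Field names `data` and `mimeType` are assumed for this sketch.
interface ImageContent { type: "image"; data: string; mimeType: string }

type UserContent = string | (TextContent | ImageContent)[];

// Hypothetical helper: wrap a plain string into a one-element array,
// and pass the array form through unchanged.
function toContentArray(content: UserContent): (TextContent | ImageContent)[] {
    return typeof content === "string" ? [{ type: "text", text: content }] : content;
}
```

With a helper like this, downstream code only has to handle the array form, while callers keep the convenience of passing a string.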

AssistantMessage -- AI Reply

This is the most complex message type, because it carries everything the AI outputs:

typescript

interface AssistantMessage {
    role: "assistant";
    content: (TextContent | ThinkingContent | ToolCall)[];  // mixed content
    api: Api;                // API protocol used
    provider: Provider;      // provider used
    model: string;           // model ID used
    usage: Usage;            // token usage statistics
    stopReason: StopReason;  // why generation stopped
    errorMessage?: string;   // error message (if the call failed)
    timestamp: number;
}

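stopReason determines what the Agent Loop does next:

stopReason	Meaning	Agent behavior
`"stop"`	Finished normally	Check the follow-up queue; may continue
`"toolUse"`	Requests tool execution	Run the tools and feed the results back to the LLM
`"length"`	Reached maxTokens	Output is truncated; a retry may be needed
`"error"`	API error	Record the error and terminate
`"aborted"`	Interrupted by the user	Terminate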
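The table above maps naturally onto an exhaustive switch. The following is only a sketch: the StopReason values come from the table, while NextStep and its member names are hypothetical labels invented for this example.

```typescript
type StopReason = "stop" | "toolUse" | "length" | "error" | "aborted";

// Hypothetical labels for what the loop does next; not Pi's actual names.
type NextStep = "checkFollowUps" | "runTools" | "handleTruncation" | "fail" | "halt";

function nextStep(reason: StopReason): NextStep {
    switch (reason) {
        case "stop":
            return "checkFollowUps";   // finished normally; the queue may hold more work
        case "toolUse":
            return "runTools";         // execute tools and feed results back to the LLM
        case "length":
            return "handleTruncation"; // hit maxTokens; the caller may retry
        case "error":
            return "fail";             // record the error and terminate
        case "aborted":
            return "halt";             // user interrupted; terminate
    }
}
```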

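★ Insight ─────────────────────────────────────
Elements in the content array can be interleaved. A single AssistantMessage may contain both text and several tool calls. For example, the LLM might say "Let me read the file" (TextContent), call the read tool (ToolCall), then say "Let me check the directory too" (TextContent), and call the ls tool (ToolCall). In streaming output these parts arrive in that order.
─────────────────────────────────────────────────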
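Since text and tool calls are interleaved in one array, consumers typically split them by type. Here is a small sketch of that, using type-guard filters. toolCallsOf and visibleText are hypothetical helpers, and the thinking field on ThinkingContent is an assumed name.

```typescript
interface TextContent { type: "text"; text: string }
// The `thinking` field name is an assumption for this sketch.
interface ThinkingContent { type: "thinking"; thinking: string }
interface ToolCall { type: "toolCall"; id: string; name: string; arguments: Record<string, unknown> }

type AssistantContent = TextContent | ThinkingContent | ToolCall;

// Hypothetical helper: keep only the tool calls, preserving their order.
function toolCallsOf(content: AssistantContent[]): ToolCall[] {
    return content.filter((c): c is ToolCall => c.type === "toolCall");
}

// Hypothetical helper: join the text parts that would be shown to the user.
function visibleText(content: AssistantContent[]): string {
    return content
        .filter((c): c is TextContent => c.type === "text")
        .map((c) => c.text)
        .join("\n");
}
```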

ToolResultMessage -- Tool Result

typescript

interface ToolResultMessage {
    role: "toolResult";
    toolCallId: string;      // matches ToolCall.id
    toolName: string;        // tool name, e.g. "read"
    content: (TextContent | ImageContent)[];  // tool output
    details?: unknown;       // structured details (for the UI)
    isError: boolean;        // whether execution failed
    timestamp: number;
}

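★ Insight ─────────────────────────────────────
toolCallId is the link between a ToolCall and its ToolResult. When the LLM returns a ToolCall (for example id: "call_123", name: "read", arguments: {path: "..."}), the Agent runs the tool and produces a ToolResultMessage (for example toolCallId: "call_123", content: [...]). The LLM uses toolCallId to match each result to the call that requested it. This is the same pattern as a correlation ID in HTTP request/response handling.
─────────────────────────────────────────────────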
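The matching described above can be sketched as a lookup keyed by toolCallId. This is an illustration only: pairCallsWithResults and the trimmed ToolResultRef shape are hypothetical, introduced for this example.

```typescript
interface ToolCall { type: "toolCall"; id: string; name: string; arguments: Record<string, unknown> }
// Trimmed view of a ToolResultMessage: just the fields needed for matching.
interface ToolResultRef { toolCallId: string; toolName: string; isError: boolean }

interface CallWithResult { call: ToolCall; result: ToolResultRef | undefined }

// Hypothetical helper: index results by toolCallId, then look each call up.
// A call without a result yet is paired with `undefined`.
function pairCallsWithResults(calls: ToolCall[], results: ToolResultRef[]): CallWithResult[] {
    const byId = new Map<string, ToolResultRef>();
    for (const r of results) byId.set(r.toolCallId, r);
    return calls.map((call) => ({ call, result: byId.get(call.id) }));
}
```

Keying on the ID rather than on position means results can arrive in any order, which matters when several tools run in parallel.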

4.3 A Complete Interaction Flow

Let's trace a realistic scenario: the user says "Read package.json".

Timeline                            Message type
──────────────────────────────────────────────────────────────
t1  User input                      UserMessage
    "Read package.json"
                                    │
t2  LLM streams a response          AssistantMessage (streaming begins)
    "Let me read this file"         ├── TextContent: "Let me read this file"
                                    └── ToolCall: { id: "c1", name: "read",
                                         arguments: { path: "package.json" } }
                                    stopReason: "toolUse"
                                    │
t3  Agent runs the read tool        ToolResultMessage
                                    { toolCallId: "c1", toolName: "read",
                                      content: [TextContent: "{\n  \"name\": ...}"],
                                      details: { truncation: ... },
                                      isError: false }
                                    │
t4  LLM streams a response          AssistantMessage
    "This is the root config..."    ├── TextContent: "This is the root config..."
                                    stopReason: "stop"

The corresponding message array:

typescript

const messages: Message[] = [
    // t1
    {
        role: "user",
        content: [{ type: "text", text: "Read package.json" }],
        timestamp: 1714500000000
    },
    // t2
    {
        role: "assistant",
        content: [
            { type: "text", text: "Let me read this file" },
            { type: "toolCall", id: "c1", name: "read",
              arguments: { path: "package.json" } }
        ],
        api: "anthropic-messages",
        provider: "anthropic",
        model: "claude-sonnet-4-20250514",
        usage: { input: 150, output: 45, ... },
        stopReason: "toolUse",
        timestamp: 1714500001000
    },
    // t3
    {
        role: "toolResult",
        toolCallId: "c1",
        toolName: "read",
        content: [{ type: "text", text: "{\n  \"name\": \"pi-mono\", ...}" }],
        details: { truncation: { ... } },
        isError: false,
        timestamp: 1714500002000
    },
    // t4
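    {
        role: "assistant",
        content: [
            { type: "text", text: "This is the root config of the project; it defines npm workspaces..." }
        ],
        // api, provider, and model are required by AssistantMessage; same values as t2
        api: "anthropic-messages",
        provider: "anthropic",
        model: "claude-sonnet-4-20250514",
        usage: { input: 300, output: 80, ... },
        stopReason: "stop",
        timestamp: 1714500003000
    }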
];
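The t1 to t4 sequence can be reproduced with a very small loop. Everything in the sketch below is a stand-in: fakeLlm and fakeRead are hypothetical stubs, the message shapes are trimmed, and a real loop would also deal with streaming, errors, and the follow-up queue.

```typescript
type LoopContent =
    | { type: "text"; text: string }
    | { type: "toolCall"; id: string; name: string; arguments: { path: string } };

type LoopMessage =
    | { role: "user"; content: string }
    | { role: "assistant"; content: LoopContent[]; stopReason: "stop" | "toolUse" }
    | { role: "toolResult"; toolCallId: string; toolName: string; content: string; isError: boolean };

// Stub LLM: requests a read on the first call, then answers once a tool result exists.
function fakeLlm(log: LoopMessage[]): LoopMessage {
    if (log.some((m) => m.role === "toolResult")) {
        return { role: "assistant", content: [{ type: "text", text: "Done reading." }], stopReason: "stop" };
    }
    return {
        role: "assistant",
        content: [{ type: "toolCall", id: "c1", name: "read", arguments: { path: "package.json" } }],
        stopReason: "toolUse",
    };
}

// Stub tool: returns a placeholder instead of touching the file system.
function fakeRead(path: string): string {
    return `contents of ${path}`;
}

function runLoop(prompt: string): LoopMessage[] {
    const log: LoopMessage[] = [{ role: "user", content: prompt }];
    // Bounded so a misbehaving stub cannot loop forever.
    for (let turn = 0; turn < 10; turn++) {
        const reply = fakeLlm(log);
        log.push(reply);
        if (reply.role !== "assistant" || reply.stopReason !== "toolUse") break;
        // Run every requested tool and append one result per call.
        for (const part of reply.content) {
            if (part.type !== "toolCall") continue;
            log.push({
                role: "toolResult",
                toolCallId: part.id,
                toolName: part.name,
                content: fakeRead(part.arguments.path),
                isError: false,
            });
        }
    }
    return log;
}
```

Calling runLoop("Read package.json") yields four messages in the same role order as the array above: user, assistant, toolResult, assistant.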

4.4 AgentMessage: The Extended Message Type

At the packages/agent layer, Message is extended into AgentMessage:

typescript

// packages/agent/src/types.ts
type AgentMessage = Message | CustomAgentMessages[keyof CustomAgentMessages];

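While CustomAgentMessages has no members, the indexed access CustomAgentMessages[keyof CustomAgentMessages] evaluates to never, so by default AgentMessage = Message. Applications can add their own message types through TypeScript's declaration merging: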

typescript

// Extension example
declare module "@mariozechner/pi-agent-core" {
    interface CustomAgentMessages {
        artifact: { role: "artifact"; content: string; type: string };
        notification: { role: "notification"; message: string; level: string };
    }
}

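★ Insight ─────────────────────────────────────
Why is AgentMessage needed? Higher-level applications (such as a Web UI or a Slack bot) may need to insert non-standard messages into the conversation stream, for example to render a chart, show a file preview, or post a system notification. These messages are not sent to the LLM, but they do need to appear in the UI. The convertToLlm function is responsible for filtering out these custom messages before the LLM is called.
─────────────────────────────────────────────────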
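The filtering step can be pictured as a type-guard filter on role. The chapter does not show convertToLlm's real signature, so dropCustomMessages below is a hypothetical stand-in, and the message shapes are trimmed to the fields needed for the example.

```typescript
// Trimmed standard message: only the fields needed here.
interface LlmMessage { role: "user" | "assistant" | "toolResult"; timestamp: number }

// Custom messages, shaped like the declaration-merging example above.
interface ArtifactMessage { role: "artifact"; content: string; type: string }
interface NotificationMessage { role: "notification"; message: string; level: string }

type AnyAgentMessage = LlmMessage | ArtifactMessage | NotificationMessage;

// Type guard: true only for the roles an LLM understands.
function isLlmMessage(m: AnyAgentMessage): m is LlmMessage {
    return m.role === "user" || m.role === "assistant" || m.role === "toolResult";
}

// Hypothetical stand-in for the filtering part of convertToLlm.
function dropCustomMessages(msgs: AnyAgentMessage[]): LlmMessage[] {
    return msgs.filter(isLlmMessage);
}
```

The UI keeps rendering the full AgentMessage list, while only the filtered list is sent to the model.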

4.5 Usage Statistics

typescript

interface Usage {
    input: number;        // input tokens
    output: number;       // output tokens
    cacheRead: number;    // tokens read from cache
    cacheWrite: number;   // tokens written to cache
    totalTokens: number;  // total tokens
    cost: {
        input: number;    // input cost ($)
        output: number;   // output cost ($)
        cacheRead: number;
        cacheWrite: number;
        total: number;    // total cost ($)
    };
}

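★ Insight ─────────────────────────────────────
Cached tokens are key to lowering cost. Anthropic and OpenAI both support prompt caching: when the leading part of a request (the system prompt and earlier messages) is unchanged from a previous call, the provider can serve it from cache. On Anthropic, cache reads are billed at roughly 10% of the normal input price; the discount differs by provider and model. Pi tracks cacheRead and cacheWrite so that you can see how much caching saves.
─────────────────────────────────────────────────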
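To make the cost fields concrete, here is a sketch of how a cost breakdown could be derived from token counts. It is not Pi's implementation: computeCost and cacheSavings are hypothetical helpers, prices are passed in by the caller, and the sketch assumes the four token counts do not overlap.

```typescript
interface TokenCounts { input: number; output: number; cacheRead: number; cacheWrite: number }

// USD per one million tokens, one price per token category.
// Any concrete values passed in are the caller's responsibility.
type PricePerMillion = TokenCounts;

interface CostBreakdown extends TokenCounts { total: number }

function dollars(tokens: number, pricePerMillion: number): number {
    return (tokens * pricePerMillion) / 1e6;
}

// Hypothetical helper: price each category separately, then sum.
// Assumes the four counts are disjoint (no token counted twice).
function computeCost(t: TokenCounts, p: PricePerMillion): CostBreakdown {
    const input = dollars(t.input, p.input);
    const output = dollars(t.output, p.output);
    const cacheRead = dollars(t.cacheRead, p.cacheRead);
    const cacheWrite = dollars(t.cacheWrite, p.cacheWrite);
    return { input, output, cacheRead, cacheWrite, total: input + output + cacheRead + cacheWrite };
}

// Hypothetical helper: what the cache-read tokens would have cost at the
// full input price, minus what they actually cost.
function cacheSavings(t: TokenCounts, p: PricePerMillion): number {
    return dollars(t.cacheRead, p.input - p.cacheRead);
}
```

If the cache-read price is one tenth of the input price, every cached token saves nine tenths of its normal input cost, which is why long, stable prompts benefit most.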

4.6 AgentEvent: Runtime Events

Besides messages, Pi defines an event system that reports the Agent's runtime state:

typescript

type AgentEvent =
    // Agent lifecycle
    | { type: "agent_start" }
    | { type: "agent_end"; messages: AgentMessage[] }
    // Turn lifecycle (one turn = one LLM response + tool execution)
    | { type: "turn_start" }
    | { type: "turn_end"; message: AgentMessage; toolResults: ToolResultMessage[] }
    // Message lifecycle
    | { type: "message_start"; message: AgentMessage }
    | { type: "message_update"; message: AgentMessage; assistantMessageEvent: AssistantMessageEvent }
    | { type: "message_end"; message: AgentMessage }
    // Tool execution lifecycle
    | { type: "tool_execution_start"; toolCallId: string; toolName: string; args: any }
    | { type: "tool_execution_update"; toolCallId: string; toolName: string; args: any; partialResult: any }
    | { type: "tool_execution_end"; toolCallId: string; toolName: string; result: any; isError: boolean };

Event hierarchy:

agent_start
├── turn_start
│   ├── message_start  (user message)
│   ├── message_end    (user message)
│   ├── message_start  (assistant message - streaming begins)
│   ├── message_update (repeated - streaming updates)
│   ├── message_end    (assistant message - streaming ends)
│   ├── tool_execution_start  (tool call 1)
│   ├── tool_execution_update (repeated - tool progress)
│   ├── tool_execution_end    (tool call 1)
│   └── tool_execution_start  (tool call 2, possibly in parallel)
│       ...
├── turn_end
├── turn_start          (if there are follow-ups)
│   ...
└── agent_end
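A consumer of this event stream often folds it into a small piece of state, for example to drive a status bar. The sketch below uses a trimmed event type and a hypothetical RunSummary. How events are actually delivered (callback, iterator, or something else) is not shown in this chapter, so the reducer simply takes one event at a time.

```typescript
// Trimmed event union: only the variants and fields this sketch needs.
type LifecycleEvent =
    | { type: "agent_start" }
    | { type: "agent_end" }
    | { type: "turn_start" }
    | { type: "turn_end" }
    | { type: "tool_execution_start"; toolCallId: string; toolName: string }
    | { type: "tool_execution_end"; toolCallId: string; toolName: string; isError: boolean };

// Hypothetical UI-facing state derived from the events.
interface RunSummary { turns: number; toolsRunning: string[]; toolErrors: number; finished: boolean }

function emptySummary(): RunSummary {
    return { turns: 0, toolsRunning: [], toolErrors: 0, finished: false };
}

// Pure reducer: returns a new summary for each event, never mutates the old one.
function reduceEvent(s: RunSummary, e: LifecycleEvent): RunSummary {
    switch (e.type) {
        case "agent_start":
            return emptySummary();
        case "turn_start":
            return { ...s, turns: s.turns + 1 };
        case "tool_execution_start":
            return { ...s, toolsRunning: [...s.toolsRunning, e.toolCallId] };
        case "tool_execution_end": {
            const doneId = e.toolCallId;
            return {
                ...s,
                toolsRunning: s.toolsRunning.filter((id) => id !== doneId),
                toolErrors: s.toolErrors + (e.isError ? 1 : 0),
            };
        }
        case "turn_end":
            return s;
        case "agent_end":
            return { ...s, finished: true };
    }
}
```

Tracking running tools by toolCallId rather than by count keeps the summary correct even when tool calls overlap.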

⚡ Java comparison ─────────────────────────────────────
This resembles the servlet request lifecycle callbacks in Java: requestInitialized → attributeAdded → requestDestroyed (from ServletRequestListener and ServletRequestAttributeListener). Another parallel is Spring's ApplicationContext events: ContextRefreshedEvent → ContextClosedEvent.
─────────────────────────────────────────────────

4.7 Chapter Summary

Messages are the Agent's blood; events are its nerve signals:

UserMessage ──────────────────────────────────────────────────┐
    │                                                         │
    ▼                                                         │
AssistantMessage ──── (contains ToolCall[])                   │
    │                                                         │
    ├── stopReason: "toolUse"                                 │
    │       │                                                 │
    │       ▼                                                 │
    │   ToolResultMessage ──── (matched by toolCallId)        │
    │       │                                                 │
    │       └──── back to LLM, new AssistantMessage ──────────┘
    │
    └── stopReason: "stop"
            │
            ▼
        Agent ends (or checks the follow-up queue)
