
Chapter 11: Hands-On -- Build Your Agent from Scratch

"Theory is gray, but the tree of life is evergreen."


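11.1 Goal

In this chapter we build a standalone, minimal Agent on top of packages/agent. It does not need the full feature set of packages/coding-agent: it uses only the lowest-level Agent Core plus tools you write yourself.

End goal: an Agent that can read files, run commands, and answer questions.

★ Insight ─────────────────────────────────────
Why not use packages/coding-agent? Because coding-agent bundles a lot that you may not need: the TUI, session persistence, the extension system, compaction. If you want to embed an Agent in another application (a Slack bot, a Web API, a CI/CD pipeline), using packages/agent directly is lighter and more flexible. It is like choosing Spring Core over Spring Boot in Java.
─────────────────────────────────────────────────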

11.2 Project Structure

my-agent/
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts          # Main entry point
│   ├── tools/
│   │   ├── read.ts       # Read tool
│   │   ├── write.ts      # Write tool
│   │   └── bash.ts       # Bash tool
│   └── agent.ts          # Agent configuration and startup
└── README.md

11.3 Step 1: Project Initialization

json
// package.json
{
    "name": "my-agent",
    "version": "1.0.0",
    "type": "module",
    "scripts": {
        "start": "tsx src/index.ts"
    },
    "dependencies": {
        "@mariozechner/pi-agent-core": "*",
        "@mariozechner/pi-ai": "*",
        "typebox": "*"
    },
    "devDependencies": {
        "tsx": "*",
        "typescript": "*"
    }
}
json
// tsconfig.json
{
    "compilerOptions": {
        "target": "ES2022",
        "module": "Node16",
        "moduleResolution": "Node16",
        "strict": true,
        "outDir": "dist"
    },
    "include": ["src"]
}

11.4 Step 2: Define the Tools

Read Tool

typescript
// src/tools/read.ts
import { Type, type Static } from "typebox";
import * as fs from "fs/promises";
import * as path from "path";
import type { AgentTool, AgentToolResult } from "@mariozechner/pi-agent-core";

const readSchema = Type.Object({
    path: Type.String({ description: "Path to the file to read" }),
});

type ReadInput = Static<typeof readSchema>;

export function createReadTool(cwd: string): AgentTool<typeof readSchema> {
    return {
        name: "read",
        label: "Read",
        description: "Read a file from disk and return its contents",
        parameters: readSchema,

        async execute(toolCallId: string, params: ReadInput): Promise<AgentToolResult<undefined>> {
            const absolutePath = path.resolve(cwd, params.path);
            const content = await fs.readFile(absolutePath, "utf-8");

            return {
                content: [{ type: "text", text: content }],
                details: undefined,
            };
        },
    };
}
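
The Read tool above resolves whatever path the model supplies, so a request such as "../../etc/passwd" would be read as well. If you want to confine the tool to the working directory, one possible guard looks like the sketch below. The helper name `resolveInside` is hypothetical and not part of pi-agent-core, and the check is purely lexical, so it does not follow symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of pi-agent-core): resolve `requested`
// against `root` and reject anything that lands outside `root`.
// This is a lexical check only; symlinks inside `root` are not followed.
function resolveInside(root: string, requested: string): string {
    const absoluteRoot = path.resolve(root);
    const absolutePath = path.resolve(absoluteRoot, requested);
    const relative = path.relative(absoluteRoot, absolutePath);

    const escapesRoot =
        relative === ".." ||
        relative.startsWith(".." + path.sep) ||
        path.isAbsolute(relative); // e.g. a different drive on Windows

    if (escapesRoot) {
        throw new Error(`Path escapes working directory: ${requested}`);
    }
    return absolutePath;
}
```

Inside `execute`, you could call `resolveInside(cwd, params.path)` in place of `path.resolve(cwd, params.path)`; the same guard would apply to the Write tool.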

Write Tool

typescript
// src/tools/write.ts
import { Type, type Static } from "typebox";
import * as fs from "fs/promises";
import * as path from "path";
import type { AgentTool, AgentToolResult } from "@mariozechner/pi-agent-core";

const writeSchema = Type.Object({
    path: Type.String({ description: "Path to the file to write" }),
    content: Type.String({ description: "Content to write to the file" }),
});

type WriteInput = Static<typeof writeSchema>;

export function createWriteTool(cwd: string): AgentTool<typeof writeSchema> {
    return {
        name: "write",
        label: "Write",
        description: "Write content to a file, creating directories if needed",
        parameters: writeSchema,

        async execute(toolCallId: string, params: WriteInput): Promise<AgentToolResult<undefined>> {
            const absolutePath = path.resolve(cwd, params.path);
            await fs.mkdir(path.dirname(absolutePath), { recursive: true });
            await fs.writeFile(absolutePath, params.content, "utf-8");

            return {
                content: [{ type: "text", text: `File written: ${absolutePath}` }],
                details: undefined,
            };
        },
    };
}

Bash Tool

typescript
// src/tools/bash.ts
import { Type, type Static } from "typebox";
import { exec } from "child_process";
import { promisify } from "util";
import type { AgentTool, AgentToolResult } from "@mariozechner/pi-agent-core";

const execAsync = promisify(exec);

const bashSchema = Type.Object({
    command: Type.String({ description: "Shell command to execute" }),
});

type BashInput = Static<typeof bashSchema>;

export function createBashTool(cwd: string): AgentTool<typeof bashSchema> {
    return {
        name: "bash",
        label: "Bash",
        description: "Execute a shell command and return stdout/stderr",
        parameters: bashSchema,

        async execute(toolCallId: string, params: BashInput): Promise<AgentToolResult<undefined>> {
            try {
                const { stdout, stderr } = await execAsync(params.command, {
                    cwd,
                    timeout: 30000,  // 30-second timeout
                    maxBuffer: 1024 * 1024,  // cap captured output at 1 MB
                });

                const output = [stdout, stderr].filter(Boolean).join("\n");
                return {
                    content: [{ type: "text", text: output || "(no output)" }],
                    details: undefined,
                };
            } catch (error: any) {
                return {
                    content: [{
                        type: "text",
                        text: `Error: ${error.message}\n${error.stderr || ""}`,
                    }],
                    details: undefined,
                };
            }
        },
    };
}
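
The Bash tool caps captured output at 1 MB, but even that much text would consume a large share of the model's context. A minimal sketch of trimming the output before returning it follows. The helper `truncateOutput` and its 20,000-character default are assumptions made for illustration, not something the library provides. It keeps the tail because error messages usually appear at the end of command output.

```typescript
// Hypothetical helper: keep only the last `maxChars` characters of a
// command's output and prepend a marker saying how much was dropped.
// The default limit is an arbitrary choice for this sketch.
function truncateOutput(text: string, maxChars = 20_000): string {
    if (text.length <= maxChars) {
        return text;
    }
    const omitted = text.length - maxChars;
    const tail = text.slice(text.length - maxChars);
    return `[... ${omitted} characters omitted ...]\n${tail}`;
}
```

In the tool, this could wrap the joined output: `text: truncateOutput(output) || "(no output)"`.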

11.5 Step 3: Assemble the Agent

typescript
// src/agent.ts
import { Agent } from "@mariozechner/pi-agent-core";
import { streamSimple } from "@mariozechner/pi-ai";
import { createReadTool } from "./tools/read.js";
import { createWriteTool } from "./tools/write.js";
import { createBashTool } from "./tools/bash.js";

export function createMyAgent(options: {
    apiKey: string;
    model?: string;
    cwd?: string;
}): Agent {
    const cwd = options.cwd ?? process.cwd();

    const agent = new Agent({
        streamFn: streamSimple,
        toolExecution: "parallel",
    });

    // Configure agent state
    agent.state.systemPrompt = `You are a helpful coding assistant.
You can read files, write files, and execute shell commands.
Always explain what you're doing before using tools.
When reading code files, add line numbers to your references.`;

    agent.state.model = {
        id: options.model ?? "claude-sonnet-4-20250514",
        name: "Claude Sonnet 4",
        api: "anthropic-messages",
        provider: "anthropic",
        baseUrl: "https://api.anthropic.com",
        reasoning: false,
        input: ["text"],
        cost: { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75 },
        contextWindow: 200000,
        maxTokens: 8192,
    };

    // Register tools
    agent.state.tools = [
        createReadTool(cwd),
        createWriteTool(cwd),
        createBashTool(cwd),
    ];

    return agent;
}
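
The `cost` object in the model definition can be used to estimate what a conversation costs. The sketch below assumes the four numbers are US dollar prices per one million tokens; that unit is an assumption and should be checked against the pi-ai documentation. The `CostTable` and `TokenUsage` types are hypothetical stand-ins defined only for this example.

```typescript
// Hypothetical shapes for this sketch; the field names mirror the
// `cost` object in the model definition above.
interface CostTable { input: number; output: number; cacheRead: number; cacheWrite: number; }
interface TokenUsage { input: number; output: number; cacheRead: number; cacheWrite: number; }

// ASSUMPTION: prices are expressed in US dollars per one million tokens.
const TOKENS_PER_PRICE_UNIT = 1_000_000;

function estimateCostUsd(cost: CostTable, usage: TokenUsage): number {
    const weighted =
        usage.input * cost.input +
        usage.output * cost.output +
        usage.cacheRead * cost.cacheRead +
        usage.cacheWrite * cost.cacheWrite;
    return weighted / TOKENS_PER_PRICE_UNIT;
}

const sonnetCost: CostTable = { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75 };
const usage: TokenUsage = { input: 2000, output: 1000, cacheRead: 0, cacheWrite: 0 };
console.log(estimateCostUsd(sonnetCost, usage)); // prints 0.021
```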

11.6 Step 4: Main Entry Point

typescript
// src/index.ts
import * as readline from "readline";
import { createMyAgent } from "./agent.js";

async function main() {
    const apiKey = process.env.ANTHROPIC_API_KEY;
    if (!apiKey) {
        console.error("Please set ANTHROPIC_API_KEY environment variable");
        process.exit(1);
    }

    const agent = createMyAgent({
        apiKey,
        cwd: process.cwd(),
    });

    // Subscribe to events and print output
    agent.subscribe(async (event) => {
        switch (event.type) {
            case "message_update":
                // Stream the LLM's reply as it arrives
                if (event.assistantMessageEvent.type === "text_delta") {
                    process.stdout.write(event.assistantMessageEvent.delta);
                }
                break;

            case "tool_execution_start":
                console.log(`\n[Tool] ${event.toolName}(${JSON.stringify(event.args)})`);
                break;

            case "tool_execution_end":
                if (event.isError) {
                    console.log(`[Tool Error] ${event.toolName}`);
                }
                break;

            case "agent_end":
                console.log("\n");  // trailing newline
                break;
        }
    });

    // Interactive loop
    const rl = readline.createInterface({
        input: process.stdin,
        output: process.stdout,
    });

    console.log("My Agent v1.0 - Type 'quit' to exit\n");

    const ask = () => {
        rl.question("You: ", async (input) => {
            if (input.trim().toLowerCase() === "quit") {
                rl.close();
                return;
            }

            try {
                await agent.prompt(input);
            } catch (error) {
                console.error("Error:", error);
            }

            ask();
        });
    };

    ask();
}

main().catch(console.error);
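
The subscriber in the entry point mixes two concerns: deciding what text an event produces and writing it to the terminal. If you later target Slack or a Web API, it may help to separate the two. The sketch below pulls the formatting into a pure function. `DemoEvent` is a simplified, hypothetical stand-in for the real event type: it flattens `assistantMessageEvent.delta` into a `delta` field and covers only the four event names used in this chapter.

```typescript
// Hypothetical, simplified event shape for illustration only.
// The real events from pi-agent-core carry more fields.
type DemoEvent =
    | { type: "message_update"; delta: string }
    | { type: "tool_execution_start"; toolName: string; args: unknown }
    | { type: "tool_execution_end"; toolName: string; isError: boolean }
    | { type: "agent_end" };

// Pure function: maps one event to the text that should be shown.
function formatEvent(event: DemoEvent): string {
    switch (event.type) {
        case "message_update":
            return event.delta;
        case "tool_execution_start":
            return `\n[Tool] ${event.toolName}(${JSON.stringify(event.args)})\n`;
        case "tool_execution_end":
            return event.isError ? `[Tool Error] ${event.toolName}\n` : "";
        case "agent_end":
            return "\n";
    }
}
```

A terminal front end would pass the result to `process.stdout.write`, while another front end could append it to a message buffer instead.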

11.7 Step 5: Run

bash
cd my-agent
npm install
ANTHROPIC_API_KEY=your-key npm start

Example Session

My Agent v1.0 - Type 'quit' to exit

You: Read package.json
[Tool] read({"path":"package.json"})
{
  "name": "my-agent",
  "version": "1.0.0",
  ...
}

This is your project's package.json. It defines a Node.js project called "my-agent"
with dependencies on the Pi agent core and AI packages...

You: List the files in the src directory
[Tool] bash({"command":"ls -la src/"})
total 24
drwxr-xr-x  6 user  staff  192 Jan  1 00:00 .
drwxr-xr-x  4 user  staff  128 Jan  1 00:00 ..
-rw-r--r--  1 user  staff  123 Jan  1 00:00 agent.ts
-rw-r--r--  1 user  staff  456 Jan  1 00:00 index.ts
drwxr-xr-x  3 user  staff   96 Jan  1 00:00 tools

The src directory contains three items: agent.ts (agent configuration),
index.ts (main entry point), and a tools/ directory...

11.8 Going Further: Adding More Capabilities

Adding Steering Support

typescript
// Let the user type while the Agent is still working
process.stdin.on("data", (data) => {
    const text = data.toString().trim();
    if (text) {
        agent.steer({
            role: "user",
            content: [{ type: "text", text }],
            timestamp: Date.now(),
        });
    }
});
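
Note that the snippet above listens on the same stdin that the readline loop from Step 4 reads, so each line of input would reach both handlers. One way to avoid that is to route every line through a single function that starts a new prompt when the Agent is idle and steers it when it is busy. The following is a minimal sketch of that idea. `SteerableAgent` is a hypothetical interface that covers only the two methods used in this chapter, and the `busy` flag is tracked locally instead of being read from the Agent.

```typescript
// Hypothetical minimal interface covering only what this sketch needs.
interface SteerableAgent {
    prompt(text: string): Promise<void>;
    steer(message: {
        role: "user";
        content: { type: "text"; text: string }[];
        timestamp: number;
    }): void;
}

type RouteResult = "ignored" | "steered" | "prompted";

// Returns a handler that decides, per input line, whether to start a
// new prompt (Agent idle) or steer the run already in progress.
function createInputRouter(agent: SteerableAgent) {
    let busy = false;

    return async function handleLine(line: string): Promise<RouteResult> {
        const text = line.trim();
        if (!text) {
            return "ignored";
        }
        if (busy) {
            agent.steer({ role: "user", content: [{ type: "text", text }], timestamp: Date.now() });
            return "steered";
        }
        busy = true;
        try {
            await agent.prompt(text);
        } finally {
            busy = false; // reset even if the prompt throws
        }
        return "prompted";
    };
}
```

With this in place, the readline interface could forward every `line` event to `handleLine` instead of calling `agent.prompt` directly.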

Adding Context Management

typescript
// Trim the context automatically once it grows too large
agent.transformContext = async (messages) => {
    const totalChars = messages.reduce((sum, m) => {
        const content = typeof m.content === "string" ? m.content : JSON.stringify(m.content);
        return sum + content.length;
    }, 0);

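    if (totalChars > 500000) {  // roughly 125K tokens at ~4 characters per token
        // Keep only the 10 most recent messages
        // (the system prompt is set separately via agent.state.systemPrompt)
        return messages.slice(-10);
    }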
    }
    return messages;
};
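
Slicing off a fixed number of messages is simple, but the cut can land between a tool call and its result, which leaves the model with an orphaned result. The sketch below trims to a token budget instead and then drops any leading tool results. `DemoMessage` is a hypothetical shape, the role name "toolResult" is an assumption about how tool results are labeled, and the divide-by-four token estimate is only a rough heuristic.

```typescript
// Hypothetical message shape for this sketch.
interface DemoMessage {
    role: string;
    content: string | unknown[];
}

// Rough heuristic: about 4 characters per token.
function estimateTokens(messages: DemoMessage[]): number {
    const chars = messages.reduce((sum, m) => {
        const text = typeof m.content === "string" ? m.content : JSON.stringify(m.content);
        return sum + text.length;
    }, 0);
    return Math.ceil(chars / 4);
}

// Drop the oldest messages until the estimate fits the budget, always
// keeping at least the most recent message.
function trimToBudget(messages: DemoMessage[], maxTokens: number): DemoMessage[] {
    const kept = [...messages];
    while (kept.length > 1 && estimateTokens(kept) > maxTokens) {
        kept.shift();
    }
    // ASSUMPTION: tool results use the role "toolResult". A result whose
    // originating call was trimmed away is dropped as well.
    while (kept.length > 1 && kept[0]?.role === "toolResult") {
        kept.shift();
    }
    return kept;
}
```

A `transformContext` hook could then return `trimToBudget(messages, 125000)` in place of the fixed `slice(-10)`.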

Adding a beforeToolCall Safety Check

typescript
agent.beforeToolCall = async (context) => {
    // Block dangerous bash commands
    if (context.toolCall.name === "bash") {
        const cmd = context.toolCall.arguments.command;
        if (cmd.includes("rm -rf") || cmd.includes("sudo")) {
            return {
                block: true,
                reason: "This command is not allowed for safety reasons.",
            };
        }
    }
    return undefined;  // allow execution
};
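
The substring checks above miss simple variations such as `rm -r -f` or extra spaces between `rm` and its flags. A slightly more robust sketch uses regular expressions, as shown below. The rule list and the function name `checkCommand` are hypothetical, and the return shape copies the `{ block, reason }` object used in this chapter. Any deny-list like this is easy to bypass (through shell aliases, scripts, or other destructive commands), so treat it as a speed bump and not a sandbox.

```typescript
// Hypothetical deny rules for illustration; extend to fit your needs.
const DENY_RULES: { pattern: RegExp; reason: string }[] = [
    { pattern: /\bsudo\b/, reason: "sudo is not allowed." },
    {
        // `rm` followed by any flags, one of which is recursive
        // (-r, -R, -rf, -fr, --recursive, ...).
        pattern: /\brm\s+(?:-\S+\s+)*(?:-[a-zA-Z]*[rR][a-zA-Z]*|--recursive)(?:\s|$)/,
        reason: "Recursive rm is not allowed.",
    },
];

// Returns a block decision for the first matching rule, or undefined
// when the command is allowed to run.
function checkCommand(command: string): { block: true; reason: string } | undefined {
    for (const rule of DENY_RULES) {
        if (rule.pattern.test(command)) {
            return { block: true, reason: rule.reason };
        }
    }
    return undefined;
}
```

Inside `beforeToolCall`, the bash branch could simply return `checkCommand(cmd)`.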

11.9 Architecture Comparison

Your custom Agent vs. Pi's full Coding Agent:

Your Agent:                   Pi's Coding Agent:
┌──────────────────┐          ┌──────────────────────────┐
│    index.ts      │          │  cli.ts + main.ts        │
│  (REPL loop)     │          │  (CLI argument parsing)  │
├──────────────────┤          ├──────────────────────────┤
│    agent.ts      │          │  AgentSession            │
│  (Agent config)  │          │  (sessions+models+exts)  │
├──────────────────┤          ├──────────────────────────┤
│  3 tools         │          │  7 tools + ext. tools    │
│  read/write/bash │          │  + compaction + skills   │
├──────────────────┤          ├──────────────────────────┤
│  Agent Core      │          │  Agent Core              │
│  (agent-loop)    │          │  (same core)             │
├──────────────────┤          ├──────────────────────────┤
│  AI Layer        │          │  AI Layer                │
│  (streamSimple)  │          │  (same LLM abstraction)  │
└──────────────────┘          └──────────────────────────┘
   ~300 lines of code             ~100,000 lines of code

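★ Insight ─────────────────────────────────────
The core is the same. Your 300-line Agent and Pi's 100,000-line Coding Agent share the same Agent Loop. The difference lies in the upper layers: session management, compaction, the extension system, TUI, RPC, and so on. You start from the minimal core and add features as you need them, instead of being forced to accept everything at once. That is the value of a layered architecture.
─────────────────────────────────────────────────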

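11.10 Directions for Further Work

Direction                    What to add
Web API                      Express/Fastify + RPC mode
Slack Bot                    Slack SDK + message adaptation
CI/CD integration            Print mode + exit codes
Multi-Agent collaboration    Agents sending messages to each other
Custom UI                    TUI framework or Web components
Streaming WebSocket          WebSocket transport layer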

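11.11 Chapter Summary

You have now gone from zero to one:

1. Understand TypeScript (Chapter 1)
2. Understand the project structure (Chapter 2)
3. Understand the AI layer (Chapter 3)
4. Understand the message system (Chapter 4)
5. Understand the type system (Chapter 5)
6. Understand Agent Core (Chapter 6)
7. Understand the tool system (Chapter 7)
8. Understand session management (Chapter 8)
9. Understand the compaction mechanism (Chapter 9)
10. Understand the extension system (Chapter 10)
11. Build your own Agent from scratch (this chapter)

     You are here

Next Steps

  1. Run your Agent and get a feel for how it works
  2. Add more tools (search, code analysis, and so on)
  3. Read the 50+ examples in packages/coding-agent/examples/extensions/
  4. Try building a more complete Agent with the AgentSession from packages/coding-agent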


Released under the MIT License