# Agent Functional Requirements and Architecture Design

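> **Executable specification**: This document is the core design of the system. Any code change must be accompanied by a matching update to this document.
> If the document and the code disagree, the code is authoritative, and the document must be fixed immediately.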

---

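## Document Maintenance Guidelines

**Maintenance principles**:
1. **Whoever changes the code updates the docs** - after a feature change, the related documentation must be updated along with it
2. **Keep the structure stable** - add or remove content only; do not rearrange the heading hierarchy arbitrarily
3. **Flow first** - write a new feature into the core flow first, then fill in the module details
4. **Link to code** - annotate key implementations with their file path, in the form `module/file.py:function_name`
5. **Keep it concise** - record only the most important information and avoid large blocks of code
6. **Layer the docs** - each document layer is an overview at a different level of detail; the upper layer references the more detailed lower-layer document at the corresponding place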

---

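## System Overview

**A single call is a special case of an Agent**:

| Property | Single call | Agent mode |
|----------|-------------|------------|
| Loop iterations | 1 | N (configurable) |
| Tool calls | Optional | Common |
| State management | None | Yes (Trace) |
| Memory retrieval | None | Yes (Experience/Skill) |
| Execution graph | 1 node | DAG of N nodes |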

---

## Core Architecture

### Three-Layer Memory Model

```
┌────────────────────────────────────────────────────────────┐
│ Layer 3: Skills (skill library)                            │
│ - Markdown files holding detailed capability descriptions  │
│ - Loaded on demand through the skill tool                  │
└────────────────────────────────────────────────────────────┘
                              ▲
                              │ generalize
┌────────────────────────────────────────────────────────────┐
│ Layer 2: Experience (experience store)                     │
│ - Stored in a database: condition + rule + evidence        │
│ - Vector retrieval, injected into the system prompt        │
└────────────────────────────────────────────────────────────┘
                              ▲
                              │ extract
┌────────────────────────────────────────────────────────────┐
│ Layer 1: Task State                                        │
│ - Working memory for the current task                      │
│ - Trace/Step records the execution process                 │
└────────────────────────────────────────────────────────────┘
```

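**How each layer is injected**:
- **Skills**: loaded dynamically into the conversation history through the `skill` tool
- **Experiences**: retrieved and then injected into the system prompt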

---

## Core Flow: Agent Loop

```python
from datetime import datetime


async def run(task: str, max_steps: int = 50) -> Trace:
    # 1. Create the Trace (mode and created_at are required fields)
    trace = Trace(
        trace_id=gen_id(),
        mode="agent",
        task=task,
        status="running",
        created_at=datetime.now(),
    )
    await trace_store.save(trace)

    # 2. Retrieve Experiences and build the system prompt
    experiences = await search_experiences(task)
    system_prompt = build_system_prompt(experiences)

    # 3. Initialize the message history
    messages = [{"role": "user", "content": task}]

    try:
        # 4. ReAct loop
        for step in range(max_steps):
            # Call the LLM
            response = await llm.chat(
                messages=messages,
                system=system_prompt,
                tools=tool_registry.to_schema(),  # includes the skill tool
            )

            # Record the LLM call
            await add_step(trace, "llm_call", {
                "response": response.content,
                "tool_calls": response.tool_calls,
            })

            # No tool calls: the task is done
            if not response.tool_calls:
                break

            # Append the assistant turn once, with all of its tool calls
            messages.append({
                "role": "assistant",
                "content": response.content,
                "tool_calls": response.tool_calls,
            })

            # Execute the tools
            for tool_call in response.tool_calls:
                # Doom loop detection
                if is_doom_loop(tool_call):
                    raise DoomLoopError()

                # Execute the tool (including the skill tool)
                result = await execute_tool(tool_call)

                # Record the steps
                await add_step(trace, "tool_call", {"tool": tool_call.name, "args": tool_call.args})
                await add_step(trace, "tool_result", {"output": result})

                # Append the tool result to the message history
                messages.append({"role": "tool", "content": result})

        # 5. Done
        trace.status = "completed"
    except Exception:
        # Do not leave the Trace stuck in "running" (e.g. on DoomLoopError)
        trace.status = "failed"
        raise
    finally:
        trace.completed_at = datetime.now()
        await trace_store.save(trace)

    return trace
```

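**Key mechanisms**:
- **Doom loop detection**: tracks the 3 most recent tool calls; if all three use the same tool with identical arguments, the loop is aborted
- **Dynamic tool loading**: Skills are loaded dynamically through a tool, so they consume context only when needed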
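
The doom loop check described above can be sketched as a small sliding-window detector. This is a minimal illustration, not the project's `is_doom_loop` implementation: the class name `DoomLoopDetector`, its `check` method, and the choice to compare arguments through sorted-key JSON are all assumptions made for the example.

```python
import json
from collections import deque
from typing import Any, Deque, Dict, Tuple


class DoomLoopDetector:
    """Hypothetical helper: flags a doom loop when the last `window`
    tool calls all use the same tool with identical arguments."""

    def __init__(self, window: int = 3) -> None:
        self.window = window
        # Only the most recent `window` calls are kept
        self._recent: Deque[Tuple[str, str]] = deque(maxlen=window)

    def check(self, tool_name: str, args: Dict[str, Any]) -> bool:
        # Serialize with sorted keys so dict ordering does not matter
        key = (tool_name, json.dumps(args, sort_keys=True, default=str))
        self._recent.append(key)
        window_full = len(self._recent) == self.window
        return window_full and len(set(self._recent)) == 1


detector = DoomLoopDetector()
for _ in range(3):
    looping = detector.check("search_logs", {"query": "timeout"})
print(looping)
```

Because the window only holds the latest three calls, a single different call in between resets the detection, which matches the "most recent 3 calls" wording rather than a global count.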

---

## Data Model

### Trace (task execution)

```python
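from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Literal, Optional


@dataclass
class Trace:
    trace_id: str
    mode: Literal["call", "agent"]

    # Task information
    task: Optional[str] = None
    agent_type: Optional[str] = None

    # Status
    status: Literal["running", "completed", "failed"] = "running"

    # Context (flexible metadata)
    context: Dict[str, Any] = field(default_factory=dict)

    # Timestamps (created_at needs a default because it follows defaulted fields)
    created_at: datetime = field(default_factory=datetime.now)
    completed_at: Optional[datetime] = None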

**Implementation**: `agent/models/trace.py:Trace`

**The `context` field** stores task-related metadata:
- `user_id`: user ID
- `project_id`: project ID
- `priority`: priority
- `tags`: list of tags

### Step (execution step)

```python
@dataclass
class Step:
    step_id: str
    trace_id: str
    step_type: StepType  # "llm_call", "tool_call", "tool_result", ...

    # DAG structure
    parent_ids: List[str] = field(default_factory=list)

    # Flexible step data
    data: Dict[str, Any] = field(default_factory=dict)

    # Needs a default because it follows defaulted fields
    created_at: datetime = field(default_factory=datetime.now)

**Implementation**: `agent/models/trace.py:Step`

**Common `step_type` values**:
- `llm_call`: LLM call (data: messages, response, tokens, cost)
- `tool_call`: tool call (data: tool_name, arguments)
- `tool_result`: tool result (data: output, metadata)
- `reasoning`: reasoning process (data: content)

### Execution Graph Example

```
Trace
  │
  ├─▶ Step(llm_call)
  │     │
  │     ├─▶ Step(tool_call: skill)
  │     │     └─▶ Step(tool_result: "# Error Handling...")
  │     │
  │     └─▶ Step(tool_call: search_logs)
  │           └─▶ Step(tool_result: "...")
  │
  └─▶ Step(llm_call)
        └─▶ ...
```
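
One way to recover the tree drawn above from stored steps is to invert each step's `parent_ids` into a parent-to-children index. The sketch below uses a trimmed-down `Step` with only the fields it needs; the helper names `children_index` and `roots` and the step IDs are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    # Trimmed-down version of the Step model, for illustration only
    step_id: str
    step_type: str
    parent_ids: List[str] = field(default_factory=list)


def children_index(steps: List[Step]) -> Dict[str, List[str]]:
    """Invert parent_ids into a parent -> children mapping."""
    index: Dict[str, List[str]] = defaultdict(list)
    for step in steps:
        for parent_id in step.parent_ids:
            index[parent_id].append(step.step_id)
    return dict(index)


def roots(steps: List[Step]) -> List[str]:
    """Steps without parents hang directly off the Trace."""
    return [s.step_id for s in steps if not s.parent_ids]


# Same shape as the diagram: two llm_call roots, two tool branches
steps = [
    Step("s1", "llm_call"),
    Step("s2", "tool_call", ["s1"]),    # skill
    Step("s3", "tool_result", ["s2"]),
    Step("s4", "tool_call", ["s1"]),    # search_logs
    Step("s5", "tool_result", ["s4"]),
    Step("s6", "llm_call"),
]
print(roots(steps))
print(children_index(steps))
```

Since `parent_ids` is a list, a step may also name several parents (a join), which is what makes the structure a DAG rather than a plain tree.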

---

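## Module Details

For detailed module documentation, see:

### [Tool System](./tools.md)
- Tool definition and registration
- Two-tier memory management
- Domain filtering and sensitive-data handling
- Browser-Use integration
- Best practices

**Core features**: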
```python
from reson_agent import tool, ToolResult, ToolContext

@tool(
    url_patterns=["*.google.com"],
    requires_confirmation=True
)
async def my_tool(arg: str, ctx: ToolContext) -> ToolResult:
    return ToolResult(
        title="Success",
        output="Result content",
        long_term_memory="Short summary"
    )
```
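
How `url_patterns` is evaluated is specified in the tool system document, not here. Purely as an illustration, the sketch below assumes glob-style matching against the URL's hostname; the function name `url_allowed` is hypothetical and not part of `reson_agent`.

```python
from fnmatch import fnmatch
from typing import List
from urllib.parse import urlparse


def url_allowed(url: str, patterns: List[str]) -> bool:
    """Return True if the URL's hostname matches any glob pattern.

    Assumption: patterns such as "*.google.com" are matched against the
    hostname only, never against the path or query string.
    """
    host = urlparse(url).hostname or ""
    return any(fnmatch(host, pattern) for pattern in patterns)


print(url_allowed("https://www.google.com/search?q=x", ["*.google.com"]))
print(url_allowed("https://evil.com/?u=www.google.com", ["*.google.com"]))
```

Under this assumption `*.google.com` matches subdomains but not the bare `google.com`, so a real matcher would need to decide whether the apex domain should be included.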

### Skills (skill library)

**Storage**: Markdown files

```
~/.reson/skills/           # global
├── error-handling/SKILL.md
└── data-processing/SKILL.md

./project/.reson/skills/   # project-level
└── api-integration/SKILL.md
```

**Format**:

```markdown
---
name: error-handling
description: Error handling best practices
---

## When to use
- Analyzing error logs
- Debugging production issues

## Guidelines
- Look for stack traces first
- Check error frequency
- Group by error type
```

**Loading**: loaded dynamically through the `skill` tool

**Implementation**: `agent/storage/skill_fs.py:SkillLoader`
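
To make the storage layout and file format concrete, here is a minimal sketch of a file-system loader. It is not the code in `skill_fs.py`: the names `FsSkillLoader` and `parse_skill`, the hand-rolled front-matter parsing, and the rule that later roots override earlier ones are assumptions for the example.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def parse_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md file into (front-matter fields, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat everything as body


class FsSkillLoader:
    """Hypothetical loader over <root>/<skill-name>/SKILL.md files."""

    def __init__(self, roots: List[Path]) -> None:
        self.roots = roots  # assumption: later roots override earlier ones

    def _index(self) -> Dict[str, Path]:
        found: Dict[str, Path] = {}
        for root in self.roots:
            if not root.is_dir():
                continue
            for path in sorted(root.glob("*/SKILL.md")):
                found[path.parent.name] = path
        return found

    async def scan(self) -> List[str]:
        return sorted(self._index())

    async def load(self, name: str) -> str:
        return self._index()[name].read_text(encoding="utf-8")
```

Passing the global directory first and the project directory second, for example `FsSkillLoader([Path.home() / ".reson/skills", Path("./project/.reson/skills")])`, would let a project-level skill shadow a global one with the same name under this sketch's override rule.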

### Experiences (experience store)

**Storage**: PostgreSQL + pgvector

```sql
CREATE TABLE experiences (
    exp_id TEXT PRIMARY KEY,
    scope TEXT,           -- "agent:executor" or "user:123"
    condition TEXT,       -- "when a database connection times out"
    rule TEXT,            -- "increase the retry count to 5"
    evidence JSONB,       -- evidence (step_ids)

    source TEXT,          -- "execution", "feedback", "manual"
    confidence FLOAT,
    usage_count INT,
    success_rate FLOAT,

    embedding vector(1536),  -- for vector retrieval

    created_at TIMESTAMP,
    updated_at TIMESTAMP
);
```

**Retrieval and injection**:

```python
# 1. Retrieve relevant Experiences
experiences = await db.query(
    """
    SELECT condition, rule, success_rate
    FROM experiences
    WHERE scope = $1
    ORDER BY embedding <-> $2
    LIMIT 10
    """,
    f"agent:{agent_type}",
    embed(task)
)

# 2. Inject into the system prompt
system_prompt = base_prompt + "\n\n# Learned Experiences\n" + "\n".join([
    f"- When {e.condition}, then {e.rule} (success rate: {e.success_rate:.1%})"
    for e in experiences
])
```

**Implementation**: `agent/storage/experience_pg.py:ExperienceStore`

---

## Storage Interfaces

```python
from typing import Dict, List, Protocol

from agent.models.trace import Step, Trace


class TraceStore(Protocol):
    async def save(self, trace: Trace) -> None: ...
    async def get(self, trace_id: str) -> Trace: ...
    async def add_step(self, step: Step) -> None: ...
    async def get_steps(self, trace_id: str) -> List[Step]: ...

class ExperienceStore(Protocol):
    async def search(self, scope: str, query: str, limit: int) -> List[Dict]: ...
    async def add(self, exp: Dict) -> None: ...
    async def update_stats(self, exp_id: str, success: bool) -> None: ...

class SkillLoader(Protocol):
    async def scan(self) -> List[str]:  # returns skill names
        """Scan for and return the names of all available skills."""

    async def load(self, name: str) -> str:  # returns the content
        """Load the Markdown content of the named skill."""
```

**Implementation**: `agent/storage/protocols.py`

**Implementation strategy**:
- Trace/Step: file system (JSON)
- Experience: PostgreSQL + pgvector
- Skill: file system (Markdown)
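
As an illustration of the file-system (JSON) strategy for steps, the sketch below implements only the `add_step` / `get_steps` half of `TraceStore`. The one-directory-per-trace layout, the `steps.jsonl` file name, the class name `FsStepStore`, and the trimmed-down `Step` are assumptions for the example, not the layout used by `trace_fs.py`.

```python
import asyncio
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any, Dict, List


@dataclass
class Step:
    # Trimmed-down Step without created_at, to keep JSON encoding trivial
    step_id: str
    trace_id: str
    step_type: str
    parent_ids: List[str] = field(default_factory=list)
    data: Dict[str, Any] = field(default_factory=dict)


class FsStepStore:
    """Appends each Step as one JSON line to <root>/<trace_id>/steps.jsonl."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _steps_file(self, trace_id: str) -> Path:
        return self.root / trace_id / "steps.jsonl"

    @staticmethod
    def _append(path: Path, line: str) -> None:
        with path.open("a", encoding="utf-8") as f:
            f.write(line + "\n")

    async def add_step(self, step: Step) -> None:
        path = self._steps_file(step.trace_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        line = json.dumps(asdict(step), ensure_ascii=False)
        # Blocking file I/O runs in a worker thread to keep the event loop free
        await asyncio.to_thread(self._append, path, line)

    async def get_steps(self, trace_id: str) -> List[Step]:
        path = self._steps_file(trace_id)
        if not path.exists():
            return []
        text = await asyncio.to_thread(path.read_text, encoding="utf-8")
        return [Step(**json.loads(row)) for row in text.splitlines() if row]
```

An append-only JSON Lines file keeps each `add_step` cheap and preserves insertion order; a real implementation would still need to decide how to serialize `created_at` and how to handle concurrent writers.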

---

## Module Structure

```
agent/
├── __init__.py
├── runner.py              # AgentRunner
├── models/
│   ├── trace.py           # Trace, Step
│   └── memory.py          # Experience, Skill
├── storage/
│   ├── protocols.py       # TraceStore, ExperienceStore, SkillLoader
│   ├── trace_fs.py        # file-system implementation
│   ├── experience_pg.py   # PostgreSQL implementation
│   └── skill_fs.py        # file-system implementation
├── tools/
│   ├── registry.py        # ToolRegistry
│   ├── models.py          # ToolResult, ToolContext
│   ├── schema.py          # SchemaGenerator
│   ├── url_matcher.py     # URL pattern matching
│   └── sensitive.py       # sensitive-data handling
└── llm.py                 # LLMProvider Protocol
```

---

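## Design Decisions

See the [design decisions document](./decisions.md) for details.

**Core decisions**:

1. **Skills are loaded through a tool** (vs. injected up front)
   - Loaded on demand; the Agent chooses which ones to load
   - Modeled on OpenCode and the Claude API documentation

2. **Skills live on the file system** (vs. a database)
   - Easy to edit (Markdown)
   - Version controlled (Git)
   - Zero dependencies

3. **Experiences live in a database** (vs. files)
   - Vector retrieval is required
   - Statistical analysis is required
   - Large in number and updated dynamically

4. **No event system is needed**
   - Background workloads need no real-time notifications
   - Trace/Step already records all the information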

---

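## Implementation Plan

### Phase 1: MVP
- [ ] AgentRunner basic loop
- [ ] Basic tools (read, skill)
- [ ] Advanced tool integration: Browser-Use, Search
- [ ] Monitoring and analysis of single runs: Trace/Step data model with file-system storage, plus an initial visualization of execution history

### Phase 2: Reflection
- [ ] Experience: feedback, inductive reflection
- [ ] Monitoring and analysis of batch runs
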

---

## Core Concepts Quick Reference

| Concept | Definition | Storage | Implementation |
|---------|------------|---------|----------------|
| **Trace** | One task execution | File system (JSON) | `models/trace.py` |
| **Step** | One execution step | File system (JSON) | `models/trace.py` |
| **Skill** | Capability description (Markdown) | File system | `storage/skill_fs.py` |
| **Experience** | Learned rule (condition + rule) | Database + vectors | `storage/experience_pg.py` |
| **Tool** | Callable function | In memory (registry) | `tools/registry.py` |
| **Agent Loop** | ReAct loop | - | `runner.py` |
| **Doom Loop** | Infinite-loop detection | - | `runner.py` |