
refactor: knowledge structure

Talegorithm · 4 days ago
parent commit e19fd15694

+ 234 - 0
.refactor-knowledge-complete.md

@@ -0,0 +1,234 @@
+# Knowledge Management System Refactor: Completion Report
+
+## Refactor Date
+2026-03-05
+
+## Core Changes
+
+### 1. Data Structure Changes
+
+Completed the following field changes, as defined in `agent/docs/knowledge.md`:
+
+**Old structure → new structure:**
+- `scenario` → `task` (task description)
+- `tags_type` → `types` (now a multi-value array)
+- Added `tags` (JSON object of business tags)
+- Added `scopes` (array of visibility scopes)
+- Added `owner`
+- `source_*` fields → `source` (nested object)
+- `eval_*` and `metrics_*` fields → `eval` (nested object)
+- Kept the top-level `message_id`
+
+**Example of the new data structure:**
+```json
+{
+  "id": "knowledge-xxx",
+  "message_id": "msg-xxx",
+  "types": ["strategy", "tool"],
+  "task": "Task description",
+  "tags": {"category": "preference"},
+  "scopes": ["org:cybertogether"],
+  "owner": "agent:research_agent",
+  "content": "Knowledge content",
+  "source": {
+    "name": "Source name",
+    "category": "exp",
+    "urls": ["https://example.com"],
+    "agent_id": "research_agent",
+    "submitted_by": "user@example.com",
+    "timestamp": "2026-03-05T12:00:00Z"
+  },
+  "eval": {
+    "score": 4,
+    "helpful": 5,
+    "harmful": 0,
+    "confidence": 0.9,
+    "helpful_history": [],
+    "harmful_history": []
+  },
+  "created_at": "2026-03-05T12:00:00Z",
+  "updated_at": "2026-03-05T12:00:00Z"
+}
+```
+
+### 2. Database Migration
+
+**File:** `knowhub/server.py`
+
+- Rebuilt the knowledge table schema
+- `types`, `tags`, `scopes`, `source`, and `eval` are stored in JSON columns
+- Dropped the old flattened fields (`tags_type`, `scenario`, `source_*`, `eval_*`, `metrics_*`)
+- Backed up the old database to `knowhub.db.backup-20260305`
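As a rough sketch of the rebuild described above (the exact DDL lives in `knowhub/server.py`; the column set here is an assumption based on the field list in section 1), the nested structures can be kept in JSON-encoded TEXT columns:

```python
import json
import sqlite3

# Hypothetical schema sketch: nested fields stored as JSON-encoded TEXT.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE knowledge (
        id TEXT PRIMARY KEY,
        message_id TEXT,
        types TEXT,      -- JSON array
        task TEXT,
        tags TEXT,       -- JSON object
        scopes TEXT,     -- JSON array
        owner TEXT,
        content TEXT,
        source TEXT,     -- JSON object
        eval TEXT,       -- JSON object
        created_at TEXT,
        updated_at TEXT
    )
""")
# Round-trip one record: encode lists as JSON on write, decode on read.
conn.execute(
    "INSERT INTO knowledge (id, types, scopes) VALUES (?, ?, ?)",
    ("knowledge-001", json.dumps(["strategy"]), json.dumps(["org:cybertogether"])),
)
row = conn.execute(
    "SELECT types FROM knowledge WHERE id = ?", ("knowledge-001",)
).fetchone()
print(json.loads(row[0]))
```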
+
+### 3. API Updates
+
+**File:** `knowhub/server.py`
+
+All knowledge APIs have been updated:
+
+- `POST /api/knowledge` - save knowledge (uses the new structure)
+- `GET /api/knowledge/search` - search knowledge (parameter `tags_type` → `types`)
+- `GET /api/knowledge` - list knowledge (parameter `tags_type` → `types`; added `scopes`)
+- `GET /api/knowledge/{id}` - get knowledge (returns the new structure)
+- `PUT /api/knowledge/{id}` - update knowledge (uses the nested eval structure)
+- `POST /api/knowledge/batch_update` - batch update (uses the nested eval structure)
+- `POST /api/knowledge/slim` - knowledge slimming (uses the new structure)
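The parameter rename is visible on the client side as well. A minimal sketch of how a caller's query parameters change (the comma-joined encoding follows the `",".join(types)` pattern used by the agent tools; the helper name is hypothetical):

```python
def build_search_params(query: str, types=None, scopes=None, top_k: int = 5) -> dict:
    """Build GET /api/knowledge/search params under the new API."""
    params = {"query": query, "top_k": top_k}
    if types:
        params["types"] = ",".join(types)   # was: params["tags_type"] = ...
    if scopes:
        params["scopes"] = ",".join(scopes)
    return params

print(build_search_params("vector search", types=["strategy", "tool"]))
```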
+
+### 4. CLI Tool Updates
+
+**File:** `knowhub/skill/cli.py`
+
+The CLI tool was completely rewritten to match the new data structure:
+
+```bash
+# Search knowledge
+python -m knowhub.skill.cli search "query text" --types strategy
+
+# Save knowledge
+python -m knowhub.skill.cli save \
+  --task "Task description" \
+  --content "Knowledge content" \
+  --types strategy,tool \
+  --tags '{"category":"preference"}' \
+  --scopes org:cybertogether
+
+# List knowledge
+python -m knowhub.skill.cli list --limit 10 --types strategy
+
+# Update knowledge
+python -m knowhub.skill.cli update knowledge-xxx \
+  --score 5 \
+  --helpful-case "A case where it helped"
+
+# Batch update
+python -m knowhub.skill.cli batch-update --file feedback.json
+
+# Knowledge slimming
+python -m knowhub.skill.cli slim --model google/gemini-2.0-flash-001
+```
+
+### 5. Agent Tool Updates
+
+**File:** `agent/tools/builtin/knowledge.py`
+
+All tool functions were updated:
+
+**knowledge_search:**
+- Parameter `tags_type` → `types`
+- Output shows `task` instead of `scenario`
+
+**knowledge_save:**
+- Parameter `scenario` → `task`
+- Parameter `tags_type` → `types`
+- New parameters: `tags`, `scopes`, `owner`, `source_name`, `source_category`, `submitted_by`
+- **Important:** defaults are set in the agent code (not server-side):
+  - `scopes` defaults to `["org:cybertogether"]`
+  - `owner` defaults to `f"agent:{agent_id}"`
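The default-setting rule above can be sketched as a small client-side helper (the helper name is hypothetical; the default values are the ones listed):

```python
def apply_defaults(scopes=None, owner=None, agent_id: str = "research_agent"):
    # Fill client-side defaults before the payload is sent to the server
    if scopes is None:
        scopes = ["org:cybertogether"]
    if owner is None:
        owner = f"agent:{agent_id}"
    return scopes, owner

print(apply_defaults(agent_id="research_agent"))
```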
+
+**knowledge_list:**
+- Parameter `tags_type` → `types`
+- New parameter: `scopes`
+
+**knowledge_slim:**
+- Default model changed to `google/gemini-2.0-flash-001`
+
+### 6. Old-Code Cleanup
+
+**Deleted/backed up:**
+- `agent/tools/builtin/experience.py` → `experience.py.old` (the old experience system)
+- `agent/tools/builtin/__init__.py` - removed the `get_experience` import and export
+- `agent/core/runner.py` - removed the `experiences_path` parameter and the `_load_experiences()` method
+- `agent/core/runner.py` - removed `get_experience` from the BUILTIN_TOOLS list
+
+### 7. Backup Files
+
+All files were backed up before modification:
+- `knowhub/server.py.old`
+- `knowhub/skill/cli.py.old`
+- `agent/tools/builtin/knowledge.py.old`
+- `agent/tools/builtin/experience.py.old`
+- `knowhub.db.backup-20260305`
+
+## Testing Suggestions
+
+### 1. Start the KnowHub Server
+
+```bash
+cd knowhub
+python server.py
+```
+
+### 2. Test the CLI Tool
+
+```bash
+# Save knowledge
+python -m knowhub.skill.cli save \
+  --task "Test task" \
+  --content "Test content" \
+  --types strategy
+
+# Search knowledge
+python -m knowhub.skill.cli search "test"
+
+# List knowledge
+python -m knowhub.skill.cli list
+```
+
+### 3. Test the Agent Tools
+
+Call them from agent code:
+
+```python
+from agent.tools.builtin.knowledge import knowledge_save, knowledge_search
+
+# Save knowledge
+await knowledge_save(
+    task="Test task",
+    content="Test content",
+    types=["strategy"],
+    agent_id="test_agent"
+)
+
+# Search knowledge
+await knowledge_search(
+    query="test",
+    types=["strategy"]
+)
+```
+
+## Notes
+
+1. **Where defaults are set:** per the user's request, the default org (`scopes`) and owner are set in the agent code, not on the server.
+
+2. **Database rebuild:** the old database was backed up and the new database starts empty. Migrating old data requires a migration script.
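A migration script for note 2 might map the old flat fields onto the new nested structure roughly like this (a sketch only: the old field names follow the mapping in section 1, and every default filled in here is an assumption):

```python
def migrate_record(old: dict) -> dict:
    """Convert one old flat knowledge row into the new nested structure."""
    tags_type = old.get("tags_type")
    return {
        "id": old["id"],
        "message_id": old.get("message_id", ""),
        "task": old.get("scenario", ""),                 # scenario -> task
        "types": [tags_type] if isinstance(tags_type, str) else (tags_type or []),
        "tags": {},                                      # no old equivalent
        "scopes": ["org:cybertogether"],                 # assumed default
        "owner": f"agent:{old.get('agent_id', 'research_agent')}",
        "content": old.get("content", ""),
        "source": {                                      # source_* -> source
            "urls": old.get("urls", []),
            "agent_id": old.get("agent_id", ""),
        },
        "eval": {                                        # eval_*/metrics_* -> eval
            "score": old.get("score", 3),
            "helpful": old.get("metrics_helpful", 0),
            "harmful": old.get("metrics_harmful", 0),
        },
    }

new = migrate_record({"id": "knowledge-1", "scenario": "old task",
                      "tags_type": "strategy", "score": 4})
```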
+
+3. **Old system fully removed:** all old experience-system code (experience.py, get_experience, etc.) has been deleted, with no compatibility shims kept.
+
+4. **Environment variables:** make sure `OPEN_ROUTER_API_KEY` and `KNOWHUB_API` are set.
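A minimal pre-flight check for these variables might look like this (the helper is hypothetical; only the two variable names come from the note above):

```python
def check_env(env: dict) -> list:
    """Return the required environment variables that are missing or empty."""
    required = ["OPEN_ROUTER_API_KEY", "KNOWHUB_API"]
    return [name for name in required if not env.get(name)]

missing = check_env({"KNOWHUB_API": "http://localhost:8000"})
print(missing)
```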
+
+## Next Steps
+
+1. Test all API endpoints
+2. Write a data migration script if needed
+3. Update the related documentation
+
+## File List
+
+**Modified:**
+- `knowhub/server.py` - KnowHub Server (database + API)
+- `knowhub/skill/cli.py` - CLI tool
+- `agent/tools/builtin/knowledge.py` - agent tool integration
+- `agent/tools/builtin/__init__.py` - removed the old experience imports
+- `agent/core/runner.py` - removed experiences_path and related code
+
+**Deleted/backed up:**
+- `agent/tools/builtin/experience.py` → `experience.py.old`
+
+**Backed up:**
+- `knowhub/server.py.old`
+- `knowhub/skill/cli.py.old`
+- `agent/tools/builtin/knowledge.py.old`
+- `knowhub.db.backup-20260305`
+
+**Added:**
+- `.refactor-knowledge-complete.md` - this document

+ 3 - 0
.refactor-knowledge.md

@@ -134,6 +134,9 @@
 - [x] Test the knowledge_batch_update tool
 - [x] Fix environment-variable loading (add load_dotenv)
 - [x] Switch the LLM model to gemini-2.0-flash-001
+- [x] Implement the CLI tool (knowhub/cli.py)
+- [x] Update the skill doc (knowhub/skill/knowhub.md)
+- [x] Add CLI usage docs (knowhub/CLI.md)
 - [ ] Test goal-focus auto-injection
 - [ ] Test the full flow (save → retrieve → inject)
 - [ ] Clean up commented-out code (optional)

+ 4 - 19
agent/core/runner.py

@@ -104,15 +104,16 @@ BUILTIN_TOOLS = [
 
 
     # Search tools
     "search_posts",
-    "get_experience",
     "get_search_suggestions",
 
     # Knowledge management tools
     "knowledge_search",
     "knowledge_save",
     "knowledge_update",
-    "list_knowledge",
-    
+    "knowledge_batch_update",
+    "knowledge_list",
+    "knowledge_slim",
+
 
     # Sandbox tools
     # "sandbox_create_environment",
@@ -198,7 +199,6 @@ class AgentRunner:
         embedding_call: Optional[Callable] = None,
         config: Optional[AgentConfig] = None,
         skills_dir: Optional[str] = None,
-        experiences_path: Optional[str] = "./.cache/experiences.md",
         goal_tree: Optional[GoalTree] = None,
         debug: bool = False,
     ):
@@ -215,7 +215,6 @@ class AgentRunner:
             utility_llm_call: lightweight LLM (for generating task titles, etc.), optional
             config: [backward-compatible] AgentConfig
             skills_dir: path to the Skills directory
-            experiences_path: path to the experiences file (default ./.cache/experiences.md)
             goal_tree: initial GoalTree (optional)
             debug: retained parameter (deprecated)
         """
@@ -228,8 +227,6 @@ class AgentRunner:
         self.utility_llm_call = utility_llm_call
         self.config = config or AgentConfig()
         self.skills_dir = skills_dir
-        # experiences_path was kept for backward compatibility but is no longer used (experiences moved to the knowledge system)
-        self.experiences_path = experiences_path or "./.cache/experiences.md"
         self.goal_tree = goal_tree
         self.debug = debug
         self._cancel_events: Dict[str, asyncio.Event] = {}  # trace_id → cancel event
@@ -2196,15 +2193,3 @@ created_at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
         if not skills:
             return ""
         return "\n\n".join(s.to_prompt_text() for s in skills)
-
-    def _load_experiences(self) -> str:
-        """Load experiences from file (./.cache/experiences.md)"""
-        if not self.experiences_path:
-            return ""
-        try:
-            if os.path.exists(self.experiences_path):
-                with open(self.experiences_path, "r", encoding="utf-8") as f:
-                    return f.read().strip()
-        except Exception as e:
-            logger.warning(f"Failed to load experiences from {self.experiences_path}: {e}")
-        return ""

+ 1 - 6
agent/docs/knowledge.md

@@ -26,12 +26,6 @@
     owner: owner (format: {entity_type}:{entity_id}; unique)
         who created it, and who may modify/delete it
 
-    visibility: visibility level (quick filter tag)
-        private: owner only
-        shared: shared by multiple entities
-        org: organization-level
-        public: public
-
     content:
         type-specific content; a reasonably complete piece of knowledge
 
@@ -50,6 +44,7 @@
         confidence: confidence (0-1)
         helpful_history: [(query+trace_id+outcome), ] records a summary of the call at feedback time
         harmful_history: []
+
 Knowledge retrieval mechanism
     Retrieval flow
         1. Build the visibility scope

+ 3 - 3
agent/tools/builtin/__init__.py

@@ -15,11 +15,10 @@ from agent.tools.builtin.file.grep import grep_content
 from agent.tools.builtin.bash import bash_command
 from agent.tools.builtin.skill import skill, list_skills
 from agent.tools.builtin.subagent import agent, evaluate
-from agent.tools.builtin.experience import get_experience
 from agent.tools.builtin.search import search_posts, get_search_suggestions
 from agent.tools.builtin.sandbox import (sandbox_create_environment, sandbox_run_shell,
                                          sandbox_rebuild_with_ports,sandbox_destroy_environment)
-from agent.tools.builtin.knowledge import(knowledge_search,knowledge_save,knowledge_list,knowledge_update)
+from agent.tools.builtin.knowledge import(knowledge_search,knowledge_save,knowledge_list,knowledge_update,knowledge_batch_update,knowledge_slim)
 from agent.trace.goal_tool import goal
 # Import browser tools to trigger registration
 import agent.tools.builtin.browser  # noqa: F401
@@ -36,11 +35,12 @@ __all__ = [
     # System tools
     "bash_command",
     "skill",
-    "get_experience",
     "knowledge_search",
     "knowledge_save",
     "knowledge_list",
     "knowledge_update",
+    "knowledge_batch_update",
+    "knowledge_slim",
     "list_skills",
     "agent",
     "evaluate",

+ 0 - 487
agent/tools/builtin/experience.py

@@ -1,487 +0,0 @@
-import logging
-import os
-import yaml
-import json
-import asyncio
-import re
-from typing import List, Optional, Dict, Any
-from datetime import datetime
-from ...llm.openrouter import openrouter_llm_call
-
-logger = logging.getLogger(__name__)
-
-# Default experiences path (used when it cannot be read from the context)
-DEFAULT_EXPERIENCES_PATH = "./.cache/experiences_restore.md"
-
-def _get_experiences_path(context: Optional[Any] = None) -> str:
-    """
-    Read experiences_path from the context, falling back to the default path.
-
-    The context may contain a runner reference from which the configured path is read.
-    """
-    if context and isinstance(context, dict):
-        runner = context.get("runner")
-        if runner and hasattr(runner, "experiences_path"):
-            path = runner.experiences_path or DEFAULT_EXPERIENCES_PATH
-            print(f"[Experience] Using the runner-configured path: {runner.experiences_path}")
-            return path
-
-    print(f"[Experience] Using the default path: {DEFAULT_EXPERIENCES_PATH}")
-    return DEFAULT_EXPERIENCES_PATH
-
-# ===== Experience evolution rewrite =====
-async def _evolve_body_with_llm(old_body: str, feedback: str) -> str:
-    """
-    Use the retrieval-tier small model (Flash Lite) to perform an evolutionary rewrite of an experience.
-    """
-    prompt = f"""You are an AI agent experience-base curator. Based on the feedback below, rewrite and evolve the existing ACE-format experience.
-
-[Original experience]:
-{old_body}
-
-[Field feedback]:
-{feedback}
-
-[Rewrite requirements]:
-1. Keep the ACE format: when [condition/Context], do [action/Action] (reason: [logic/Reason]).
-2. Merge knowledge: fold the feedback's pitfalls, new parameters, or corrected selection logic into the original experience so it is more general and accurate.
-3. Language: concise and direct, in Chinese.
-4. Forbidden: no preamble, explanation, or Markdown headings; return only the rewritten body.
-"""
-    try:
-        # Call the same cheap model used by the retrieval router
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model="google/gemini-2.0-flash-001"
-        )
-
-        evolved_content = response.get("content", "").strip()
-
-        # Basic sanity check: if the LLM output is empty or too short, fall back to append mode
-        if len(evolved_content) < 5:
-            raise ValueError("LLM output too short")
-
-        return evolved_content
-
-    except Exception as e:
-        logger.warning(f"Small-model evolution failed; falling back to append mode: {e}")
-        timestamp = datetime.now().strftime('%Y-%m-%d')
-        return f"{old_body}\n- [Update {timestamp}]: {feedback}"
-
-# ===== Core selection logic =====
-
-async def _route_experiences_by_llm(query_text: str, metadata_list: List[Dict], k: int = 3) -> List[str]:
-    """
-    Stage 1: semantic routing.
-    Ask the LLM to pick 2*k semantically relevant IDs.
-    """
-    if not metadata_list:
-        return []
-
-    # Widen the candidate pool to 2*k
-    routing_k = k * 2
-
-    routing_data = [
-        {
-            "id": m["id"],
-            "tags": m["tags"],
-            "helpful": m["metrics"]["helpful"]
-        } for m in metadata_list
-    ]
-
-    prompt = f"""
-You are an experience-retrieval expert. Given the user's current intent, pick at most {routing_k} of the most relevant experience IDs from the metadata below.
-Intent: "{query_text}"
-
-Candidate experiences:
-{json.dumps(routing_data, ensure_ascii=False, indent=1)}
-
-Output the ID list directly, comma-separated (e.g. ex_01, ex_02). Output "None" if nothing is relevant.
-"""
-
-    try:
-        print(f"\n[Step 1: semantic routing] intent: '{query_text}' | candidates: {len(metadata_list)} | target: {routing_k}")
-
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model="google/gemini-2.0-flash-001"
-        )
-
-        content = response.get("content", "").strip()
-        selected_ids = [idx.strip() for idx in re.split(r'[,\s]+', content) if idx.strip().startswith("ex_")]
-
-        print(f"[Step 1: semantic routing] LLM shortlist ({len(selected_ids)} IDs): {selected_ids}")
-        return selected_ids
-    except Exception as e:
-        logger.error(f"LLM experience routing failed: {e}")
-        return []
-
-async def _get_structured_experiences(query_text: str, top_k: int = 3, context: Optional[Any] = None):
-    """
-    1. Parse the physical file
-    2. Semantic routing: extract 2*k IDs
-    3. Quality re-ranking: pick the final k based on metrics
-    """
-    print(f"[Experience System]  runner.experiences_path:  {context.get('runner').experiences_path if context and context.get('runner') else None}")
-    experiences_path = _get_experiences_path(context)
-
-    if not os.path.exists(experiences_path):
-        print(f"[Experience System] Warning: experiences file does not exist ({experiences_path})")
-        return []
-
-    with open(experiences_path, "r", encoding="utf-8") as f:
-        file_content = f.read()
-
-    # --- Stage 1: parsing ---
-    # Match YAML frontmatter blocks with a regex to avoid bad splits
-    pattern = r'---\n(.*?)\n---\n(.*?)(?=\n---\n|\Z)'
-    matches = re.findall(pattern, file_content, re.DOTALL)
-
-    content_map = {}
-    metadata_list = []
-
-    for yaml_str, raw_body in matches:
-        try:
-            metadata = yaml.safe_load(yaml_str)
-
-            # Check the metadata type
-            if not isinstance(metadata, dict):
-                logger.error(f"Skipping corrupt experience block: metadata is {type(metadata).__name__}, not dict")
-                continue
-
-            eid = metadata.get("id")
-            if not eid:
-                logger.error("Skipping corrupt experience block: missing id field")
-                continue
-
-            meta_item = {
-                "id": eid,
-                "tags": metadata.get("tags", {}),
-                "metrics": metadata.get("metrics", {"helpful": 0, "harmful": 0}),
-            }
-            metadata_list.append(meta_item)
-            content_map[eid] = {
-                "content": raw_body.strip(),
-                "metrics": meta_item["metrics"]
-            }
-        except Exception as e:
-            logger.error(f"Skipping corrupt experience block: {e}")
-            continue
-
-    # --- Stage 2: semantic routing (take 2*k) ---
-    candidate_ids = await _route_experiences_by_llm(query_text, metadata_list, k=top_k)
-
-    # --- Stage 3: quality re-ranking (pick the final k by metrics) ---
-    print(f"[Step 2: quality re-rank] Scoring candidates by metrics...")
-    scored_items = []
-
-    for eid in candidate_ids:
-        if eid in content_map:
-            item = content_map[eid]
-            metrics = item["metrics"]
-            # Composite score: helpful adds, harmful subtracts at double weight
-            quality_score = metrics["helpful"] - (metrics["harmful"] * 2.0)
-
-            # Filter gate: drop anything flagged as seriously harmful (score < -2)
-            if quality_score < -2:
-                print(f"  - Dropping harmful experience: {eid} (Helpful: {metrics['helpful']}, Harmful: {metrics['harmful']})")
-                continue
-
-            scored_items.append({
-                "id": eid,
-                "content": item["content"],
-                "helpful": metrics["helpful"],
-                "quality_score": quality_score
-            })
-
-    # Sort by quality score, breaking ties by helpful count
-    final_sorted = sorted(scored_items, key=lambda x: (x["quality_score"], x["helpful"]), reverse=True)
-
-    # Take the final top_k
-    result = final_sorted[:top_k]
-
-    print(f"[Step 2: quality re-rank] Final selection: {[it['id'] for it in result]}")
-    print(f"[Experience System] Retrieval finished.\n")
-    return result
-
-async def _batch_update_experiences(update_map: Dict[str, Dict[str, Any]], context: Optional[Any] = None):
-    """
-    Physical layer: batch-update experiences.
-    Fix: use the new_sections collection correctly so file integrity stays in sync with concurrent evolution.
-    """
-    experiences_path = _get_experiences_path(context)
-
-    if not os.path.exists(experiences_path) or not update_map:
-        return 0
-
-    with open(experiences_path, "r", encoding="utf-8") as f:
-        full_content = f.read()
-
-    # Parse with a regex to avoid bad splits
-    pattern = r'---\n(.*?)\n---\n(.*?)(?=\n---\n|\Z)'
-    matches = re.findall(pattern, full_content, re.DOTALL)
-
-    new_entries = []
-    evolution_tasks = []
-    evolution_registry = {}  # task_idx -> entry_idx
-
-    # --- Phase 1: process every block ---
-    for yaml_str, body in matches:
-        try:
-            meta = yaml.safe_load(yaml_str)
-            if not isinstance(meta, dict):
-                logger.error(f"Skipping corrupt experience block: metadata is not a dict")
-                continue
-
-            eid = meta.get("id")
-            if not eid:
-                logger.error("Skipping corrupt experience block: missing id")
-                continue
-
-            if eid in update_map:
-                instr = update_map[eid]
-                action = instr.get("action")
-                feedback = instr.get("feedback")
-
-                # Handle the mixed intermediate state
-                if action == "mixed":
-                    meta["metrics"]["helpful"] += 1
-                    action = "evolve"
-
-                if action == "helpful":
-                    meta["metrics"]["helpful"] += 1
-                elif action == "harmful":
-                    meta["metrics"]["harmful"] += 1
-                elif action == "evolve" and feedback:
-                    # Register an evolution task
-                    task = _evolve_body_with_llm(body.strip(), feedback)
-                    evolution_tasks.append(task)
-                    # Record the entry index this task belongs to
-                    evolution_registry[len(evolution_tasks) - 1] = len(new_entries)
-                    meta["metrics"]["helpful"] += 1
-
-                meta["updated_at"] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-
-            # Serialize and append to new_entries
-            meta_str = yaml.dump(meta, allow_unicode=True).strip()
-            new_entries.append((meta_str, body.strip()))
-
-        except Exception as e:
-            logger.error(f"Skipping corrupt experience block: {e}")
-            continue
-
-    # --- Phase 2: concurrent evolution ---
-    if evolution_tasks:
-        print(f"🧬 Evolving {len(evolution_tasks)} experiences concurrently...")
-        evolved_results = await asyncio.gather(*evolution_tasks)
-
-        # Precise backfill: replace the body of the matching entry
-        for task_idx, entry_idx in evolution_registry.items():
-            meta_str, _ = new_entries[entry_idx]
-            new_entries[entry_idx] = (meta_str, evolved_results[task_idx].strip())
-
-    # --- Phase 3: atomic write-back ---
-    final_parts = []
-    for meta_str, body in new_entries:
-        final_parts.append(f"---\n{meta_str}\n---\n{body}\n")
-
-    final_content = "\n".join(final_parts)
-    with open(experiences_path, "w", encoding="utf-8") as f:
-        f.write(final_content)
-
-    return len(update_map)
-
-# ===== Experience-base slimming =====
-
-async def slim_experiences(model: str = "anthropic/claude-sonnet-4.5", context: Optional[Any] = None) -> str:
-    """
-    Experience-base slimming: call a top-tier model to merge and condense semantically similar experiences.
-    Returns a slimming report string.
-    """
-    experiences_path = _get_experiences_path(context)
-
-    if not os.path.exists(experiences_path):
-        return "Experiences file does not exist; nothing to slim."
-
-    with open(experiences_path, "r", encoding="utf-8") as f:
-        file_content = f.read()
-
-    # Parse with a regex to avoid bad splits
-    pattern = r'---\n(.*?)\n---\n(.*?)(?=\n---\n|\Z)'
-    matches = re.findall(pattern, file_content, re.DOTALL)
-
-    parsed = []
-    for yaml_str, body in matches:
-        try:
-            meta = yaml.safe_load(yaml_str)
-            if not isinstance(meta, dict):
-                continue
-            parsed.append({"meta": meta, "body": body.strip()})
-        except Exception:
-            continue
-
-    if len(parsed) < 2:
-        return f"The experience base has only {len(parsed)} entries; nothing to slim."
-
-    # Build the content sent to the large model
-    entries_text = ""
-    for p in parsed:
-        m = p["meta"]
-        entries_text += f"[ID: {m.get('id')}] [Tags: {m.get('tags', {})}] "
-        entries_text += f"[Metrics: {m.get('metrics', {})}]\n"
-        entries_text += f"{p['body']}\n\n"
-
-    prompt = f"""You are an AI agent experience-base curator. Below are all entries in the current experience base; perform a slimming pass:
-
-[Task]:
-1. Identify semantically similar or duplicate experiences and merge them into one more concise, more general experience.
-2. When merging, keep the ID and metrics of the entry with the highest helpful count (summing helpful/harmful across the merged entries).
-3. Leave independent, non-duplicate experiences unchanged.
-4. Keep the ACE format: when [condition/Context], do [action/Action] (reason: [logic/Reason]).
-
-[Current experience base]:
-{entries_text}
-
-[Output format requirements]:
-Output each experience strictly in the following format, entries separated by ===:
-ID: <kept id>
-TAGS: <tags in yaml>
-METRICS: <metrics in yaml>
-BODY: <merged experience body>
-===
-
-On the last line output a merge report in the format:
-REPORT: originally X entries, Y after merging, Z trimmed.
-
-Do not output any preamble or explanation."""
-
-    try:
-        print(f"\n[Experience slimming] Calling {model} to analyze {len(parsed)} experiences...")
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model=model
-        )
-        content = response.get("content", "").strip()
-        if not content:
-            return "The model returned nothing; slimming failed."
-
-        # Parse the model output and rebuild the experiences file
-        report_line = ""
-        new_entries = []
-        blocks = [b.strip() for b in content.split("===") if b.strip()]
-
-        for block in blocks:
-            if block.startswith("REPORT:"):
-                report_line = block
-                continue
-
-            lines = block.split("\n")
-            eid, tags, metrics, body_lines = None, {}, {}, []
-            current_field = None
-            for line in lines:
-                if line.startswith("ID:"):
-                    eid = line[3:].strip()
-                    current_field = None
-                elif line.startswith("TAGS:"):
-                    try:
-                        tags = yaml.safe_load(line[5:].strip()) or {}
-                    except Exception:
-                        tags = {}
-                    current_field = None
-                elif line.startswith("METRICS:"):
-                    try:
-                        metrics = yaml.safe_load(line[8:].strip()) or {}
-                    except Exception:
-                        metrics = {"helpful": 0, "harmful": 0}
-                    current_field = None
-                elif line.startswith("BODY:"):
-                    body_lines.append(line[5:].strip())
-                    current_field = "body"
-                elif current_field == "body":
-                    body_lines.append(line)
-
-            if eid and body_lines:
-                meta = {
-                    "id": eid,
-                    "tags": tags,
-                    "metrics": metrics,
-                    "updated_at": datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
-                }
-                meta_str = yaml.dump(meta, allow_unicode=True).strip()
-                body_str = "\n".join(body_lines).strip()
-                new_entries.append(f"---\n{meta_str}\n---\n{body_str}\n")
-
-        if not new_entries:
-            return "Failed to parse the model output; the experience base was not modified."
-
-        # Write back to the file
-        final = "\n".join(new_entries)
-        with open(experiences_path, "w", encoding="utf-8") as f:
-            f.write(final)
-
-        result = f"Slimming complete: {len(parsed)} → {len(new_entries)} experiences."
-        if report_line:
-            result += f"\n{report_line}"
-        print(f"[Experience slimming] {result}")
-        return result
-
-    except Exception as e:
-        logger.error(f"Experience slimming failed: {e}")
-        return f"Slimming failed: {e}"
-
-# ===== Public tool interface =====
-
-from agent.tools import tool, ToolContext
-
-@tool(description="Retrieve the most relevant historical experiences via two-stage retrieval")
-async def get_experience(query: str, k: int = 3, context: Optional[ToolContext] = None):
-    """
-    Retrieve the most relevant historical experiences via two-stage retrieval.
-    Stage 1 is semantic matching (2*k); stage 2 is quality re-ranking (k).
-    """
-    relevant_items = await _get_structured_experiences(
-        query_text=query,
-        top_k=k,
-        context=context
-    )
-
-    if not relevant_items:
-        return "No sufficiently relevant, high-quality experiences were found."
-
-    return {
-        "items": relevant_items,
-        "count": len(relevant_items)
-    }
-
-@tool()
-async def update_experiences(feedback_list: List[Dict[str, Any]], context: Optional[ToolContext] = None):
-    """
-    Batch feedback on the effectiveness of historical experiences.
-
-    Args:
-        feedback_list: list of evaluations, each containing:
-            - ex_id: (str) experience ID
-            - is_effective: (bool) whether it was effective
-            - feedback: (str, optional) improvement suggestion; if effective and present, triggers experience evolution
-    """
-    if not feedback_list:
-        return "The feedback list is empty."
-
-    # Convert the agent's input into the mapping format the underlying function needs
-    update_map = {}
-    for item in feedback_list:
-        ex_id = item.get("ex_id")
-        is_effective = item.get("is_effective")
-        comment = item.get("feedback", "")
-
-        action = "helpful" if is_effective else "harmful"
-        if is_effective and comment:
-            action = "evolve"
-
-        update_map[ex_id] = {
-            "action": action,
-            "feedback": comment
-        }
-
-    count = await _batch_update_experiences(update_map, context)
-    return f"Successfully synced feedback for {count} experiences. Thanks for the evaluation!"

+ 64 - 52
agent/tools/builtin/knowledge.py

@@ -21,7 +21,7 @@ async def knowledge_search(
     query: str,
     top_k: int = 5,
     min_score: int = 3,
-    tags_type: Optional[List[str]] = None,
+    types: Optional[List[str]] = None,
     context: Optional[ToolContext] = None,
 ) -> ToolResult:
     """
@@ -31,7 +31,7 @@ async def knowledge_search(
         query: search query (task description)
         top_k: number of results (default 5)
         min_score: minimum score filter (default 3)
-        tags_type: filter by type (tool/usecase/definition/plan/strategy)
+        types: filter by type (user_profile/strategy/tool/usecase/definition/plan)
         context: tool context
 
     Returns:
@@ -43,8 +43,8 @@ async def knowledge_search(
             "top_k": top_k,
             "min_score": min_score,
         }
-        if tags_type:
-            params["tags_type"] = ",".join(tags_type)
+        if types:
+            params["types"] = ",".join(types)
 
         async with httpx.AsyncClient(timeout=60.0) as client:
             response = await client.get(f"{KNOWHUB_API}/api/knowledge/search", params=params)
@@ -65,8 +65,10 @@ async def knowledge_search(
         output_lines = [f"Query: {query}\n", f"Found {count} relevant knowledge entries:\n"]
 
         for idx, item in enumerate(results, 1):
-            output_lines.append(f"\n### {idx}. [{item['id']}] (⭐ {item.get('score', 3)})")
-            output_lines.append(f"**Scenario**: {item['scenario'][:150]}...")
+            eval_data = item.get("eval", {})
+            score = eval_data.get("score", 3)
+            output_lines.append(f"\n### {idx}. [{item['id']}] (⭐ {score})")
+            output_lines.append(f"**Task**: {item['task'][:150]}...")
             output_lines.append(f"**Content**: {item['content'][:200]}...")
 
         return ToolResult(
@@ -91,11 +93,17 @@ async def knowledge_search(
 
 
 @tool()
 async def knowledge_save(
-    scenario: str,
+    task: str,
     content: str,
-    tags_type: List[str],
+    types: List[str],
+    tags: Optional[Dict[str, str]] = None,
+    scopes: Optional[List[str]] = None,
+    owner: Optional[str] = None,
+    source_name: str = "",
+    source_category: str = "exp",
     urls: List[str] = None,
     agent_id: str = "research_agent",
+    submitted_by: str = "",
     score: int = 3,
     message_id: str = "",
     context: Optional[ToolContext] = None,
@@ -104,11 +112,17 @@ async def knowledge_save(
     Save new knowledge
 
     Args:
-        scenario: task description (the situation + the goal to accomplish)
+        task: task description (the situation + the goal to accomplish)
         content: core content
-        tags_type: knowledge type tags, one of: tool, usecase, definition, plan, strategy
+        types: knowledge type tags, one of: user_profile, strategy, tool, usecase, definition, plan
+        tags: business tags (JSON object)
+        scopes: visibility scopes (default ["org:cybertogether"])
+        owner: owner (default agent:{agent_id})
+        source_name: source name
+        source_category: source category (paper/exp/skill/book)
         urls: list of reference source URLs
         agent_id: ID of the agent that performed this research
+        submitted_by: submitter
         score: initial score 1-5 (default 3)
         message_id: source Message ID
         context: tool context
@@ -117,14 +131,33 @@ async def knowledge_save(
         保存结果
         保存结果
     """
     """
     try:
     try:
+        # 设置默认值(在 agent 代码中,不是服务器端)
+        if scopes is None:
+            scopes = ["org:cybertogether"]
+        if owner is None:
+            owner = f"agent:{agent_id}"
+
         payload = {
         payload = {
-            "scenario": scenario,
+            "message_id": message_id,
+            "types": types,
+            "task": task,
+            "tags": tags or {},
+            "scopes": scopes,
+            "owner": owner,
             "content": content,
-            "tags_type": tags_type,
-            "urls": urls or [],
-            "agent_id": agent_id,
-            "score": score,
-            "message_id": message_id
+            "source": {
+                "name": source_name,
+                "category": source_category,
+                "urls": urls or [],
+                "agent_id": agent_id,
+                "submitted_by": submitted_by,
+            },
+            "eval": {
+                "score": score,
+                "helpful": 1,
+                "harmful": 0,
+                "confidence": 0.5,
+            }
         }

         async with httpx.AsyncClient(timeout=30.0) as client:
@@ -136,8 +169,8 @@ async def knowledge_save(

         return ToolResult(
             title="✅ 知识已保存",
-            output=f"知识 ID: {knowledge_id}\n\n场景:\n{scenario[:100]}...",
-            long_term_memory=f"保存知识: {knowledge_id} - {scenario[:50]}",
+            output=f"知识 ID: {knowledge_id}\n\n任务:\n{task[:100]}...",
+            long_term_memory=f"保存知识: {knowledge_id} - {task[:50]}",
             metadata={"knowledge_id": knowledge_id}
         )

@@ -273,7 +306,8 @@ async def knowledge_batch_update(
 @tool()
 async def knowledge_list(
     limit: int = 10,
-    tags_type: Optional[List[str]] = None,
+    types: Optional[List[str]] = None,
+    scopes: Optional[List[str]] = None,
     context: Optional[ToolContext] = None,
 ) -> ToolResult:
     """
@@ -281,7 +315,8 @@ async def knowledge_list(

     Args:
         limit: 返回数量限制(默认 10)
-        tags_type: 按类型过滤(可选)
+        types: 按类型过滤(可选)
+        scopes: 按范围过滤(可选)
         context: 工具上下文

     Returns:
@@ -289,8 +324,10 @@ async def knowledge_list(
     """
     try:
         params = {"limit": limit}
-        if tags_type:
-            params["tags_type"] = ",".join(tags_type)
+        if types:
+            params["types"] = ",".join(types)
+        if scopes:
+            params["scopes"] = ",".join(scopes)

         async with httpx.AsyncClient(timeout=30.0) as client:
             response = await client.get(f"{KNOWHUB_API}/api/knowledge", params=params)
@@ -309,8 +346,9 @@ async def knowledge_list(

         output_lines = [f"共找到 {count} 条知识:\n"]
         for item in results:
-            score = item.get("eval", {}).get("score", 3)
-            output_lines.append(f"- [{item['id']}] (⭐{score}) {item['scenario'][:60]}...")
+            eval_data = item.get("eval", {})
+            score = eval_data.get("score", 3)
+            output_lines.append(f"- [{item['id']}] (⭐{score}) {item['task'][:60]}...")

         return ToolResult(
             title="📚 知识列表",
@@ -329,14 +367,14 @@ async def knowledge_list(

 @tool()
 async def knowledge_slim(
-    model: str = "anthropic/claude-sonnet-4.5",
+    model: str = "google/gemini-2.0-flash-001",
     context: Optional[ToolContext] = None,
 ) -> ToolResult:
     """
     知识库瘦身:调用顶级大模型,将知识库中语义相似的知识合并精简

     Args:
-        model: 使用的模型(默认 claude-sonnet-4.5)
+        model: 使用的模型(默认 gemini-2.0-flash-001)
         context: 工具上下文

     Returns:
@@ -370,29 +408,3 @@ async def knowledge_slim(
             error=str(e)
         )

-
-# 兼容接口:get_experience
-@tool(description="检索历史经验(strategy 标签的知识)")
-async def get_experience(
-    query: str,
-    k: int = 3,
-    context: Optional[ToolContext] = None,
-) -> ToolResult:
-    """
-    检索历史经验(兼容接口,实际调用 knowledge_search 并过滤 strategy 标签)
-
-    Args:
-        query: 搜索查询(任务描述)
-        k: 返回数量(默认 3)
-        context: 工具上下文
-
-    Returns:
-        相关经验列表
-    """
-    return await knowledge_search(
-        query=query,
-        top_k=k,
-        min_score=1,  # 经验的评分门槛较低
-        tags_type=["strategy"],
-        context=context
-    )
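
For reference, the payload that the new `knowledge_save` tool submits to `POST /api/knowledge` can be sketched as a small standalone helper. This is a minimal sketch mirroring the diff above: the helper name `build_knowledge_payload` is illustrative only, and the field names and defaults follow the new structure defined in `agent/docs/knowledge.md`.

```python
def build_knowledge_payload(
    task: str,
    content: str,
    types: list,
    agent_id: str = "research_agent",
    tags: dict = None,
    scopes: list = None,
    owner: str = None,
    source_name: str = "",
    source_category: str = "exp",
    urls: list = None,
    submitted_by: str = "",
    score: int = 3,
    message_id: str = "",
) -> dict:
    # Defaults are resolved on the agent side, not by the server,
    # matching the comment in knowledge_save above.
    scopes = scopes or ["org:cybertogether"]
    owner = owner or f"agent:{agent_id}"
    return {
        "message_id": message_id,
        "types": types,
        "task": task,
        "tags": tags or {},
        "scopes": scopes,
        "owner": owner,
        "content": content,
        # source_* flat fields are now grouped into a nested object
        "source": {
            "name": source_name,
            "category": source_category,
            "urls": urls or [],
            "agent_id": agent_id,
            "submitted_by": submitted_by,
        },
        # eval_* / metrics_* flat fields are now a nested eval object
        "eval": {"score": score, "helpful": 1, "harmful": 0, "confidence": 0.5},
    }

payload = build_knowledge_payload(
    task="调研向量检索方案",
    content="使用两阶段检索:先语义路由,再质量精排",
    types=["strategy"],
)
# payload["owner"] == "agent:research_agent"
# payload["scopes"] == ["org:cybertogether"]
```

The nested `source` and `eval` objects are stored by the server as JSON columns, so no flattening is needed on the agent side.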

+ 0 - 1183
agent/tools/builtin/knowledge.py.backup

@@ -1,1183 +0,0 @@
-"""
-原子知识保存工具
-
-提供便捷的 API 让 Agent 快速保存结构化的原子知识
-"""
-
-import os
-import re
-import json
-import yaml
-import logging
-from datetime import datetime
-from pathlib import Path
-from typing import List, Dict, Optional, Any
-from agent.tools import tool, ToolResult, ToolContext
-from ...llm.openrouter import openrouter_llm_call
-
-logger = logging.getLogger(__name__)
-
-
-def _generate_knowledge_id() -> str:
-    """生成知识原子 ID(带微秒和随机后缀避免冲突)"""
-    import uuid
-    timestamp = datetime.now().strftime('%Y%m%d-%H%M%S')
-    random_suffix = uuid.uuid4().hex[:4]
-    return f"knowledge-{timestamp}-{random_suffix}"
-
-
-def _format_yaml_list(items: List[str], indent: int = 2) -> str:
-    """格式化 YAML 列表"""
-    if not items:
-        return "[]"
-    indent_str = " " * indent
-    return "\n" + "\n".join(f"{indent_str}- {item}" for item in items)
-
-
-@tool()
-async def save_knowledge(
-    scenario: str,
-    content: str,
-    tags_type: List[str],
-    urls: List[str] = None,
-    agent_id: str = "research_agent",
-    score: int = 3,
-    trace_id: str = "",
-) -> ToolResult:
-    """
-    保存原子知识到本地文件(JSON 格式)
-
-    Args:
-        scenario: 任务描述(在什么情景下 + 要完成什么目标 + 得到能达成一个什么结果)
-        content: 核心内容
-        tags_type: 知识类型标签,可选:tool, usercase, definition, plan, strategy
-        urls: 参考来源链接列表(论文/GitHub/博客等)
-        agent_id: 执行此调研的 agent ID
-        score: 初始评分 1-5(默认 3)
-        trace_id: 当前 trace ID(可选)
-
-    Returns:
-        保存结果
-    """
-    try:
-        # 生成 ID
-        knowledge_id = _generate_knowledge_id()
-
-        # 准备目录
-        knowledge_dir = Path(".cache/knowledge_atoms")
-        knowledge_dir.mkdir(parents=True, exist_ok=True)
-
-        # 构建文件路径(使用 .json 扩展名)
-        file_path = knowledge_dir / f"{knowledge_id}.json"
-
-        # 构建 JSON 数据结构
-        knowledge_data = {
-            "id": knowledge_id,
-            "trace_id": trace_id or "N/A",
-            "tags": {
-                "type": tags_type
-            },
-            "scenario": scenario,
-            "content": content,
-            "trace": {
-                "urls": urls or [],
-                "agent_id": agent_id,
-                "timestamp": datetime.now().isoformat()
-            },
-            "eval": {
-                "score": score,
-                "helpful": 0,
-                "harmful": 0,
-                "helpful_history": [],
-                "harmful_history": []
-            },
-            "metrics": {
-                "helpful": 1,
-                "harmful": 0
-            },
-            "created_at": datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-        }
-
-        # 保存为 JSON 文件
-        with open(file_path, "w", encoding="utf-8") as f:
-            json.dump(knowledge_data, f, ensure_ascii=False, indent=2)
-
-        return ToolResult(
-            title="✅ 原子知识已保存",
-            output=f"知识 ID: {knowledge_id}\n文件路径: {file_path}\n\n场景:\n{scenario[:100]}...",
-            long_term_memory=f"保存原子知识: {knowledge_id} - {scenario[:50]}",
-            metadata={"knowledge_id": knowledge_id, "file_path": str(file_path)}
-        )
-
-    except Exception as e:
-        return ToolResult(
-            title="❌ 保存失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-@tool()
-async def update_knowledge(
-    knowledge_id: str,
-    add_helpful_case: Optional[Dict[str, str]] = None,
-    add_harmful_case: Optional[Dict[str, str]] = None,
-    update_score: Optional[int] = None,
-    evolve_feedback: Optional[str] = None,
-) -> ToolResult:
-    """
-    更新已有的原子知识的评估反馈
-
-    Args:
-        knowledge_id: 知识 ID(如 research-20260302-001)
-        add_helpful_case: 添加好用的案例 {"case_id": "...", "scenario": "...", "result": "...", "timestamp": "..."}
-        add_harmful_case: 添加不好用的案例 {"case_id": "...", "scenario": "...", "result": "...", "timestamp": "..."}
-        update_score: 更新评分(1-5)
-        evolve_feedback: 经验进化反馈(当提供时,会使用 LLM 重写知识内容)
-
-    Returns:
-        更新结果
-    """
-    try:
-        # 查找文件(支持 JSON 和 MD 格式)
-        knowledge_dir = Path(".cache/knowledge_atoms")
-        json_path = knowledge_dir / f"{knowledge_id}.json"
-        md_path = knowledge_dir / f"{knowledge_id}.md"
-
-        file_path = None
-        if json_path.exists():
-            file_path = json_path
-            is_json = True
-        elif md_path.exists():
-            file_path = md_path
-            is_json = False
-        else:
-            return ToolResult(
-                title="❌ 文件不存在",
-                output=f"未找到知识文件: {knowledge_id}",
-                error="文件不存在"
-            )
-
-        # 读取现有内容
-        with open(file_path, "r", encoding="utf-8") as f:
-            content = f.read()
-
-        # 解析数据
-        if is_json:
-            data = json.loads(content)
-        else:
-            # 解析 YAML frontmatter
-            yaml_match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
-            if not yaml_match:
-                return ToolResult(
-                    title="❌ 格式错误",
-                    output=f"无法解析知识文件格式: {file_path}",
-                    error="格式错误"
-                )
-            data = yaml.safe_load(yaml_match.group(1))
-
-        # 更新内容
-        updated = False
-        summary = []
-
-        if add_helpful_case:
-            data["eval"]["helpful"] += 1
-            data["eval"]["helpful_history"].append(add_helpful_case)
-            data["metrics"]["helpful"] += 1
-            summary.append(f"添加 helpful 案例: {add_helpful_case.get('case_id')}")
-            updated = True
-
-        if add_harmful_case:
-            data["eval"]["harmful"] += 1
-            data["eval"]["harmful_history"].append(add_harmful_case)
-            data["metrics"]["harmful"] += 1
-            summary.append(f"添加 harmful 案例: {add_harmful_case.get('case_id')}")
-            updated = True
-
-        if update_score is not None:
-            data["eval"]["score"] = update_score
-            summary.append(f"更新评分: {update_score}")
-            updated = True
-
-        # 经验进化机制
-        if evolve_feedback:
-            old_content = data.get("content", "")
-            evolved_content = await _evolve_knowledge_with_llm(old_content, evolve_feedback)
-            data["content"] = evolved_content
-            data["metrics"]["helpful"] += 1
-            summary.append(f"知识进化: 基于反馈重写内容")
-            updated = True
-
-        if not updated:
-            return ToolResult(
-                title="⚠️ 无更新",
-                output="未指定任何更新内容",
-                long_term_memory="尝试更新原子知识但未指定更新内容"
-            )
-
-        # 更新时间戳
-        data["updated_at"] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-
-        # 保存更新
-        if is_json:
-            with open(file_path, "w", encoding="utf-8") as f:
-                json.dump(data, f, ensure_ascii=False, indent=2)
-        else:
-            # 重新生成 YAML frontmatter
-            meta_str = yaml.dump(data, allow_unicode=True).strip()
-            with open(file_path, "w", encoding="utf-8") as f:
-                f.write(f"---\n{meta_str}\n---\n")
-
-        return ToolResult(
-            title="✅ 原子知识已更新",
-            output=f"知识 ID: {knowledge_id}\n文件路径: {file_path}\n\n更新内容:\n" + "\n".join(f"- {s}" for s in summary),
-            long_term_memory=f"更新原子知识: {knowledge_id}"
-        )
-
-    except Exception as e:
-        return ToolResult(
-            title="❌ 更新失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-@tool()
-async def list_knowledge(
-    limit: int = 10,
-    tags_type: Optional[List[str]] = None,
-) -> ToolResult:
-    """
-    列出已保存的原子知识
-
-    Args:
-        limit: 返回数量限制(默认 10)
-        tags_type: 按类型过滤(可选)
-
-    Returns:
-        知识列表
-    """
-    try:
-        knowledge_dir = Path(".cache/knowledge_atoms")
-
-        if not knowledge_dir.exists():
-            return ToolResult(
-                title="📂 知识库为空",
-                output="还没有保存任何原子知识",
-                long_term_memory="知识库为空"
-            )
-
-        # 获取所有文件
-        files = sorted(knowledge_dir.glob("*.md"), key=lambda x: x.stat().st_mtime, reverse=True)
-
-        if not files:
-            return ToolResult(
-                title="📂 知识库为空",
-                output="还没有保存任何原子知识",
-                long_term_memory="知识库为空"
-            )
-
-        # 读取并过滤
-        results = []
-        for file_path in files[:limit]:
-            with open(file_path, "r", encoding="utf-8") as f:
-                content = f.read()
-
-            # 提取关键信息
-            import re
-            id_match = re.search(r"id: (.+)", content)
-            scenario_match = re.search(r"scenario: \|\n  (.+)", content)
-            score_match = re.search(r"score: (\d+)", content)
-
-            knowledge_id = id_match.group(1) if id_match else "unknown"
-            scenario = scenario_match.group(1) if scenario_match else "N/A"
-            score = score_match.group(1) if score_match else "N/A"
-
-            results.append(f"- [{knowledge_id}] (⭐{score}) {scenario[:60]}...")
-
-        output = f"共找到 {len(files)} 条原子知识,显示最近 {len(results)} 条:\n\n" + "\n".join(results)
-
-        return ToolResult(
-            title="📚 原子知识列表",
-            output=output,
-            long_term_memory=f"列出 {len(results)} 条原子知识"
-        )
-
-    except Exception as e:
-        return ToolResult(
-            title="❌ 列表失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-# ===== 语义检索功能 =====
-
-async def _route_knowledge_by_llm(query_text: str, metadata_list: List[Dict], k: int = 5) -> List[str]:
-    """
-    第一阶段:语义路由。
-    让 LLM 挑选出 2*k 个语义相关的 ID。
-    """
-    if not metadata_list:
-        return []
-
-    # 扩大筛选范围到 2*k
-    routing_k = k * 2
-
-    routing_data = [
-        {
-            "id": m["id"],
-            "tags": m["tags"],
-            "scenario": m["scenario"][:100]  # 只取前100字符
-        } for m in metadata_list
-    ]
-
-    prompt = f"""
-你是一个知识检索专家。根据用户的当前任务需求,从下列原子知识元数据中挑选出最相关的最多 {routing_k} 个知识 ID。
-任务需求:"{query_text}"
-
-可选知识列表:
-{json.dumps(routing_data, ensure_ascii=False, indent=1)}
-
-请直接输出 ID 列表,用逗号分隔(例如: knowledge-20260302-001, research-20260302-002)。若无相关项请输出 "None"。
-"""
-
-    try:
-        print(f"\n[Step 1: 知识语义路由] 任务: '{query_text}' | 候选总数: {len(metadata_list)} | 目标提取数: {routing_k}")
-
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model="google/gemini-2.0-flash-001"
-        )
-
-        content = response.get("content", "").strip()
-        selected_ids = [idx.strip() for idx in re.split(r'[,\s]+', content) if idx.strip().startswith(("knowledge-", "research-"))]
-
-        print(f"[Step 1: 知识语义路由] LLM 初选 ID ({len(selected_ids)}个): {selected_ids}")
-        return selected_ids
-    except Exception as e:
-        logger.error(f"LLM 知识路由失败: {e}")
-        return []
-
-
-async def _evolve_knowledge_with_llm(old_content: str, feedback: str) -> str:
-    """
-    使用 LLM 进行知识进化重写(类似经验进化机制)
-    """
-    prompt = f"""你是一个 AI Agent 知识库管理员。请根据反馈建议,对现有的知识内容进行重写进化。
-
-【原知识内容】:
-{old_content}
-
-【实战反馈建议】:
-{feedback}
-
-【重写要求】:
-1. 融合知识:将反馈中的避坑指南、新参数或修正后的选择逻辑融入原知识,使其更具通用性和准确性。
-2. 保持结构:如果原内容有特定格式(如 Markdown、代码示例等),请保持该格式。
-3. 语言:简洁直接,使用中文。
-4. 禁止:严禁输出任何开场白、解释语或额外的 Markdown 标题,直接返回重写后的正文。
-"""
-    try:
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model="google/gemini-2.0-flash-001"
-        )
-
-        evolved_content = response.get("content", "").strip()
-
-        # 简单安全校验:如果 LLM 返回太短或为空,回退到原内容+追加
-        if len(evolved_content) < 5:
-            raise ValueError("LLM output too short")
-
-        return evolved_content
-
-    except Exception as e:
-        logger.warning(f"知识进化失败,采用追加模式回退: {e}")
-        timestamp = datetime.now().strftime('%Y-%m-%d')
-        return f"{old_content}\n\n---\n[Update {timestamp}]: {feedback}"
-
-
-async def _get_structured_knowledge(
-    query_text: str,
-    top_k: int = 5,
-    min_score: int = 3,
-    context: Optional[Any] = None,
-    tags_filter: Optional[List[str]] = None
-) -> List[Dict]:
-    """
-    语义检索原子知识(包括经验)
-
-    1. 解析知识库文件(支持 JSON 和 YAML 格式)
-    2. 语义路由:提取 2*k 个 ID
-    3. 质量精排:基于评分筛选出最终的 k 个
-
-    Args:
-        query_text: 查询文本
-        top_k: 返回数量
-        min_score: 最低评分过滤
-        context: 上下文(兼容 experience 接口)
-        tags_filter: 标签过滤(如 ["strategy"] 只返回经验)
-    """
-    knowledge_dir = Path(".cache/knowledge_atoms")
-
-    if not knowledge_dir.exists():
-        print(f"[Knowledge System] 警告: 知识库目录不存在 ({knowledge_dir})")
-        return []
-
-    # 同时支持 .json 和 .md 文件
-    json_files = list(knowledge_dir.glob("*.json"))
-    md_files = list(knowledge_dir.glob("*.md"))
-    files = json_files + md_files
-
-    if not files:
-        print(f"[Knowledge System] 警告: 知识库为空")
-        return []
-
-    # --- 阶段 1: 解析所有知识文件 ---
-    content_map = {}
-    metadata_list = []
-
-    for file_path in files:
-        try:
-            with open(file_path, "r", encoding="utf-8") as f:
-                content = f.read()
-
-            # 根据文件扩展名选择解析方式
-            if file_path.suffix == ".json":
-                # 解析 JSON 格式
-                metadata = json.loads(content)
-            else:
-                # 解析 YAML frontmatter(兼容旧格式)
-                yaml_match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
-                if not yaml_match:
-                    logger.warning(f"跳过无效文件: {file_path}")
-                    continue
-                metadata = yaml.safe_load(yaml_match.group(1))
-
-            if not isinstance(metadata, dict):
-                logger.warning(f"跳过损坏的知识文件: {file_path}")
-                continue
-
-            kid = metadata.get("id")
-            if not kid:
-                logger.warning(f"跳过缺少 id 的知识文件: {file_path}")
-                continue
-
-            # 提取 scenario 和 content
-            scenario = metadata.get("scenario", "").strip()
-            content_text = metadata.get("content", "").strip()
-
-            # 标签过滤
-            tags = metadata.get("tags", {})
-            if tags_filter:
-                # 检查 tags.type 是否包含任何过滤标签
-                tag_types = tags.get("type", [])
-                if isinstance(tag_types, str):
-                    tag_types = [tag_types]
-                if not any(tag in tag_types for tag in tags_filter):
-                    continue  # 跳过不匹配的标签
-
-            meta_item = {
-                "id": kid,
-                "tags": tags,
-                "scenario": scenario,
-                "score": metadata.get("eval", {}).get("score", 3),
-                "helpful": metadata.get("metrics", {}).get("helpful", 0),
-                "harmful": metadata.get("metrics", {}).get("harmful", 0),
-            }
-            metadata_list.append(meta_item)
-            content_map[kid] = {
-                "scenario": scenario,
-                "content": content_text,
-                "tags": tags,
-                "score": meta_item["score"],
-                "helpful": meta_item["helpful"],
-                "harmful": meta_item["harmful"],
-            }
-        except Exception as e:
-            logger.error(f"解析知识文件失败 {file_path}: {e}")
-            continue
-
-    if not metadata_list:
-        print(f"[Knowledge System] 警告: 没有有效的知识条目")
-        return []
-
-    # --- 阶段 2: 语义路由 (取 2*k) ---
-    candidate_ids = await _route_knowledge_by_llm(query_text, metadata_list, k=top_k)
-
-    # --- 阶段 3: 质量精排 (根据评分和反馈选出最终的 k) ---
-    print(f"[Step 2: 知识质量精排] 正在根据评分和反馈进行打分...")
-    scored_items = []
-
-    for kid in candidate_ids:
-        if kid in content_map:
-            item = content_map[kid]
-            score = item["score"]
-            helpful = item["helpful"]
-            harmful = item["harmful"]
-
-            # 计算综合分:基础分 + helpful - harmful*2
-            quality_score = score + helpful - (harmful * 2.0)
-
-            # 过滤门槛:评分低于 min_score 或质量分过低
-            if score < min_score or quality_score < 0:
-                print(f"  - 剔除低质量知识: {kid} (Score: {score}, Helpful: {helpful}, Harmful: {harmful})")
-                continue
-
-            scored_items.append({
-                "id": kid,
-                "scenario": item["scenario"],
-                "content": item["content"],
-                "tags": item["tags"],
-                "score": score,
-                "quality_score": quality_score,
-                "metrics": {
-                    "helpful": helpful,
-                    "harmful": harmful
-                }
-            })
-
-    # 按照质量分排序
-    final_sorted = sorted(scored_items, key=lambda x: x["quality_score"], reverse=True)
-
-    # 截取最终的 top_k
-    result = final_sorted[:top_k]
-
-    print(f"[Step 2: 知识质量精排] 最终选定知识: {[it['id'] for it in result]}")
-    print(f"[Knowledge System] 检索结束。\n")
-    return result
-
-
-@tool()
-async def search_knowledge(
-    query: str,
-    top_k: int = 5,
-    min_score: int = 3,
-    tags_type: Optional[List[str]] = None,
-    context: Optional[ToolContext] = None,
-) -> ToolResult:
-    """
-    语义检索原子知识库
-
-    Args:
-        query: 搜索查询(任务描述)
-        top_k: 返回数量(默认 5)
-        min_score: 最低评分过滤(默认 3)
-        tags_type: 按类型过滤(tool/usercase/definition/plan)
-        context: 工具上下文
-
-    Returns:
-        相关知识列表
-    """
-    try:
-        relevant_items = await _get_structured_knowledge(
-            query_text=query,
-            top_k=top_k,
-            min_score=min_score
-        )
-
-        if not relevant_items:
-            return ToolResult(
-                title="🔍 未找到相关知识",
-                output=f"查询: {query}\n\n知识库中暂无相关的高质量知识。建议进行调研。",
-                long_term_memory=f"知识检索: 未找到相关知识 - {query[:50]}"
-            )
-
-        # 格式化输出
-        output_lines = [f"查询: {query}\n", f"找到 {len(relevant_items)} 条相关知识:\n"]
-
-        for idx, item in enumerate(relevant_items, 1):
-            output_lines.append(f"\n### {idx}. [{item['id']}] (⭐ {item['score']})")
-            output_lines.append(f"**场景**: {item['scenario'][:150]}...")
-            output_lines.append(f"**内容**: {item['content'][:200]}...")
-
-        return ToolResult(
-            title="✅ 知识检索成功",
-            output="\n".join(output_lines),
-            long_term_memory=f"知识检索: 找到 {len(relevant_items)} 条相关知识 - {query[:50]}",
-            metadata={
-                "count": len(relevant_items),
-                "knowledge_ids": [item["id"] for item in relevant_items],
-                "items": relevant_items
-            }
-        )
-
-    except Exception as e:
-        logger.error(f"知识检索失败: {e}")
-        return ToolResult(
-            title="❌ 检索失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-@tool(description="通过两阶段检索获取最相关的历史经验(strategy 标签的知识)")
-async def get_experience(
-    query: str,
-    k: int = 3,
-    context: Optional[ToolContext] = None,
-) -> ToolResult:
-    """
-    检索历史经验(兼容旧接口,实际调用 search_knowledge 并过滤 strategy 标签)
-
-    Args:
-        query: 搜索查询(任务描述)
-        k: 返回数量(默认 3)
-        context: 工具上下文
-
-    Returns:
-        相关经验列表
-    """
-    try:
-        relevant_items = await _get_structured_knowledge(
-            query_text=query,
-            top_k=k,
-            min_score=1,  # 经验的评分门槛较低
-            context=context,
-            tags_filter=["strategy"]  # 只返回经验
-        )
-
-        if not relevant_items:
-            return ToolResult(
-                title="🔍 未找到相关经验",
-                output=f"查询: {query}\n\n经验库中暂无相关的经验。",
-                long_term_memory=f"经验检索: 未找到相关经验 - {query[:50]}",
-                metadata={"items": [], "count": 0}
-            )
-
-        # 格式化输出(兼容旧格式)
-        output_lines = [f"查询: {query}\n", f"找到 {len(relevant_items)} 条相关经验:\n"]
-
-        for idx, item in enumerate(relevant_items, 1):
-            output_lines.append(f"\n### {idx}. [{item['id']}]")
-            output_lines.append(f"{item['content'][:300]}...")
-
-        return ToolResult(
-            title="✅ 经验检索成功",
-            output="\n".join(output_lines),
-            long_term_memory=f"经验检索: 找到 {len(relevant_items)} 条相关经验 - {query[:50]}",
-            metadata={
-                "items": relevant_items,
-                "count": len(relevant_items)
-            }
-        )
-
-    except Exception as e:
-        logger.error(f"经验检索失败: {e}")
-        return ToolResult(
-            title="❌ 检索失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-# ===== 批量更新功能(类似经验机制)=====
-
-async def _batch_update_knowledge(
-    update_map: Dict[str, Dict[str, Any]],
-    context: Optional[Any] = None
-) -> int:
-    """
-    内部函数:批量更新知识(兼容 experience 接口)
-
-    Args:
-        update_map: 更新映射 {knowledge_id: {"action": "helpful/harmful/evolve", "feedback": "..."}}
-        context: 上下文(兼容 experience 接口)
-
-    Returns:
-        成功更新的数量
-    """
-    if not update_map:
-        return 0
-
-    knowledge_dir = Path(".cache/knowledge_atoms")
-    if not knowledge_dir.exists():
-        return 0
-
-    success_count = 0
-    evolution_tasks = []
-    evolution_registry = {}  # task_idx -> (file_path, data)
-
-    for knowledge_id, instr in update_map.items():
-        try:
-            # 查找文件
-            json_path = knowledge_dir / f"{knowledge_id}.json"
-            md_path = knowledge_dir / f"{knowledge_id}.md"
-
-            file_path = None
-            is_json = False
-            if json_path.exists():
-                file_path = json_path
-                is_json = True
-            elif md_path.exists():
-                file_path = md_path
-                is_json = False
-            else:
-                continue
-
-            # 读取并解析
-            with open(file_path, "r", encoding="utf-8") as f:
-                content = f.read()
-
-            if is_json:
-                data = json.loads(content)
-            else:
-                yaml_match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
-                if not yaml_match:
-                    continue
-                data = yaml.safe_load(yaml_match.group(1))
-
-            # 更新 metrics
-            action = instr.get("action")
-            feedback = instr.get("feedback", "")
-
-            # 处理 mixed 中间态
-            if action == "mixed":
-                data["metrics"]["helpful"] = data.get("metrics", {}).get("helpful", 0) + 1
-                action = "evolve"
-
-            if action == "helpful":
-                data["metrics"]["helpful"] = data.get("metrics", {}).get("helpful", 0) + 1
-            elif action == "harmful":
-                data["metrics"]["harmful"] = data.get("metrics", {}).get("harmful", 0) + 1
-            elif action == "evolve" and feedback:
-                # 注册进化任务
-                old_content = data.get("content", "")
-                task = _evolve_knowledge_with_llm(old_content, feedback)
-                evolution_tasks.append(task)
-                evolution_registry[len(evolution_tasks) - 1] = (file_path, data, is_json)
-                data["metrics"]["helpful"] = data.get("metrics", {}).get("helpful", 0) + 1
-
-            data["updated_at"] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-
-            # 如果不需要进化,直接保存
-            if action != "evolve" or not feedback:
-                if is_json:
-                    with open(file_path, "w", encoding="utf-8") as f:
-                        json.dump(data, f, ensure_ascii=False, indent=2)
-                else:
-                    meta_str = yaml.dump(data, allow_unicode=True).strip()
-                    with open(file_path, "w", encoding="utf-8") as f:
-                        f.write(f"---\n{meta_str}\n---\n")
-                success_count += 1
-
-        except Exception as e:
-            logger.error(f"更新知识失败 {knowledge_id}: {e}")
-            continue
-
-    # 并发进化
-    if evolution_tasks:
-        import asyncio
-        print(f"🧬 并发处理 {len(evolution_tasks)} 条知识进化...")
-        evolved_results = await asyncio.gather(*evolution_tasks)
-
-        # 回填进化结果
-        for task_idx, (file_path, data, is_json) in evolution_registry.items():
-            data["content"] = evolved_results[task_idx].strip()
-
-            if is_json:
-                with open(file_path, "w", encoding="utf-8") as f:
-                    json.dump(data, f, ensure_ascii=False, indent=2)
-            else:
-                meta_str = yaml.dump(data, allow_unicode=True).strip()
-                with open(file_path, "w", encoding="utf-8") as f:
-                    f.write(f"---\n{meta_str}\n---\n")
-            success_count += 1
-
-    return success_count
-
-
-@tool()
-async def batch_update_knowledge(
-    feedback_list: List[Dict[str, Any]],
-    context: Optional[ToolContext] = None,
-) -> ToolResult:
-    """
-    批量反馈知识的有效性(类似经验机制)
-
-    Args:
-        feedback_list: 评价列表,每个元素包含:
-            - knowledge_id: (str) 知识 ID
-            - is_effective: (bool) 是否有效
-            - feedback: (str, optional) 改进建议,若有效且有建议则触发知识进化
-
-    Returns:
-        批量更新结果
-    """
-    try:
-        if not feedback_list:
-            return ToolResult(
-                title="⚠️ 反馈列表为空",
-                output="未提供任何反馈",
-                long_term_memory="批量更新知识: 反馈列表为空"
-            )
-
-        knowledge_dir = Path(".cache/knowledge_atoms")
-        if not knowledge_dir.exists():
-            return ToolResult(
-                title="❌ 知识库不存在",
-                output="知识库目录不存在",
-                error="知识库不存在"
-            )
-
-        success_count = 0
-        failed_items = []
-
-        for item in feedback_list:
-            knowledge_id = item.get("knowledge_id")
-            is_effective = item.get("is_effective")
-            feedback = item.get("feedback", "")
-
-            if not knowledge_id:
-                failed_items.append({"id": "unknown", "reason": "缺少 knowledge_id"})
-                continue
-
-            try:
-                # 查找文件
-                json_path = knowledge_dir / f"{knowledge_id}.json"
-                md_path = knowledge_dir / f"{knowledge_id}.md"
-
-                file_path = None
-                is_json = False
-                if json_path.exists():
-                    file_path = json_path
-                    is_json = True
-                elif md_path.exists():
-                    file_path = md_path
-                    is_json = False
-                else:
-                    failed_items.append({"id": knowledge_id, "reason": "文件不存在"})
-                    continue
-
-                # 读取并解析
-                with open(file_path, "r", encoding="utf-8") as f:
-                    content = f.read()
-
-                if is_json:
-                    data = json.loads(content)
-                else:
-                    yaml_match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
-                    if not yaml_match:
-                        failed_items.append({"id": knowledge_id, "reason": "格式错误"})
-                        continue
-                    data = yaml.safe_load(yaml_match.group(1))
-
-                # 更新 metrics
-                if is_effective:
-                    data["metrics"]["helpful"] = data.get("metrics", {}).get("helpful", 0) + 1
-                    # 如果有反馈建议,触发进化
-                    if feedback:
-                        old_content = data.get("content", "")
-                        evolved_content = await _evolve_knowledge_with_llm(old_content, feedback)
-                        data["content"] = evolved_content
-                else:
-                    data["metrics"]["harmful"] = data.get("metrics", {}).get("harmful", 0) + 1
-
-                data["updated_at"] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
-
-                # 保存
-                if is_json:
-                    with open(file_path, "w", encoding="utf-8") as f:
-                        json.dump(data, f, ensure_ascii=False, indent=2)
-                else:
-                    meta_str = yaml.dump(data, allow_unicode=True).strip()
-                    with open(file_path, "w", encoding="utf-8") as f:
-                        f.write(f"---\n{meta_str}\n---\n")
-
-                success_count += 1
-
-            except Exception as e:
-                failed_items.append({"id": knowledge_id, "reason": str(e)})
-                continue
-
-        output_lines = [f"成功更新 {success_count} 条知识"]
-        if failed_items:
-            output_lines.append(f"\n失败 {len(failed_items)} 条:")
-            for item in failed_items:
-                output_lines.append(f"  - {item['id']}: {item['reason']}")
-
-        return ToolResult(
-            title="✅ 批量更新完成",
-            output="\n".join(output_lines),
-            long_term_memory=f"批量更新知识: 成功 {success_count} 条,失败 {len(failed_items)} 条"
-        )
-
-    except Exception as e:
-        logger.error(f"批量更新知识失败: {e}")
-        return ToolResult(
-            title="❌ 批量更新失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
-
-# ===== Knowledge-base slimming (similar to the experience mechanism) =====
-
-@tool()
-async def slim_knowledge(
-    model: str = "anthropic/claude-sonnet-4.5",
-    context: Optional[ToolContext] = None,
-) -> ToolResult:
-    """
-    Knowledge-base slimming: call a top-tier LLM to merge and condense semantically similar knowledge
-
-    Args:
-        model: model to use (default claude-sonnet-4.5)
-        context: tool context
-
-    Returns:
-        Slimming result report
-    """
-    try:
-        knowledge_dir = Path(".cache/knowledge_atoms")
-
-        if not knowledge_dir.exists():
-            return ToolResult(
-                title="📂 知识库不存在",
-                output="知识库目录不存在,无需瘦身",
-                long_term_memory="知识库瘦身: 目录不存在"
-            )
-
-        # 获取所有文件
-        json_files = list(knowledge_dir.glob("*.json"))
-        md_files = list(knowledge_dir.glob("*.md"))
-        files = json_files + md_files
-
-        if len(files) < 2:
-            return ToolResult(
-                title="📂 知识库过小",
-                output=f"知识库仅有 {len(files)} 条,无需瘦身",
-                long_term_memory=f"知识库瘦身: 仅有 {len(files)} 条"
-            )
-
-        # 解析所有知识
-        parsed = []
-        for file_path in files:
-            try:
-                with open(file_path, "r", encoding="utf-8") as f:
-                    content = f.read()
-
-                if file_path.suffix == ".json":
-                    data = json.loads(content)
-                else:
-                    yaml_match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
-                    if not yaml_match:
-                        continue
-                    data = yaml.safe_load(yaml_match.group(1))
-
-                parsed.append({
-                    "file_path": file_path,
-                    "data": data,
-                    "is_json": file_path.suffix == ".json"
-                })
-            except Exception as e:
-                logger.error(f"解析文件失败 {file_path}: {e}")
-                continue
-
-        if len(parsed) < 2:
-            return ToolResult(
-                title="📂 有效知识过少",
-                output=f"有效知识仅有 {len(parsed)} 条,无需瘦身",
-                long_term_memory=f"知识库瘦身: 有效知识 {len(parsed)} 条"
-            )
-
-        # 构造发给大模型的内容
-        entries_text = ""
-        for p in parsed:
-            data = p["data"]
-            entries_text += f"[ID: {data.get('id')}] [Tags: {data.get('tags', {})}] "
-            entries_text += f"[Metrics: {data.get('metrics', {})}] [Score: {data.get('eval', {}).get('score', 3)}]\n"
-            entries_text += f"Scenario: {data.get('scenario', 'N/A')}\n"
-            entries_text += f"Content: {data.get('content', '')[:200]}...\n\n"
-
-        prompt = f"""你是一个 AI Agent 知识库管理员。以下是当前知识库的全部条目,请执行瘦身操作:
-
-【任务】:
-1. 识别语义高度相似或重复的知识,将它们合并为一条更精炼、更通用的知识。
-2. 合并时保留 helpful 最高的那条的 ID 和 metrics(metrics 中 helpful/harmful 取各条之和)。
-3. 对于独立的、无重复的知识,保持原样不动。
-4. 保持原有的知识结构和格式。
-
-【当前知识库】:
-{entries_text}
-
-【输出格式要求】:
-严格按以下格式输出每条知识,条目之间用 === 分隔:
-ID: <保留的id>
-TAGS: <yaml格式的tags>
-METRICS: <yaml格式的metrics>
-SCORE: <评分>
-SCENARIO: <场景描述>
-CONTENT: <合并后的知识内容>
-===
-
-最后一行输出合并报告,格式:
-REPORT: 原有 X 条,合并后 Y 条,精简了 Z 条。
-
-禁止输出任何开场白或解释。"""
-
-        print(f"\n[知识瘦身] 正在调用 {model} 分析 {len(parsed)} 条知识...")
-        response = await openrouter_llm_call(
-            messages=[{"role": "user", "content": prompt}],
-            model=model
-        )
-        content = response.get("content", "").strip()
-        if not content:
-            return ToolResult(
-                title="❌ 大模型返回为空",
-                output="大模型返回为空,瘦身失败",
-                error="大模型返回为空"
-            )
-
-        # 解析大模型输出
-        report_line = ""
-        new_entries = []
-        blocks = [b.strip() for b in content.split("===") if b.strip()]
-
-        for block in blocks:
-            if block.startswith("REPORT:"):
-                report_line = block
-                continue
-
-            lines = block.split("\n")
-            kid, tags, metrics, score, scenario, content_lines = None, {}, {}, 3, "", []
-            current_field = None
-
-            for line in lines:
-                if line.startswith("ID:"):
-                    kid = line[3:].strip()
-                    current_field = None
-                elif line.startswith("TAGS:"):
-                    try:
-                        tags = yaml.safe_load(line[5:].strip()) or {}
-                    except Exception:
-                        tags = {}
-                    current_field = None
-                elif line.startswith("METRICS:"):
-                    try:
-                        metrics = yaml.safe_load(line[8:].strip()) or {}
-                    except Exception:
-                        metrics = {"helpful": 0, "harmful": 0}
-                    current_field = None
-                elif line.startswith("SCORE:"):
-                    try:
-                        score = int(line[6:].strip())
-                    except Exception:
-                        score = 3
-                    current_field = None
-                elif line.startswith("SCENARIO:"):
-                    scenario = line[9:].strip()
-                    current_field = "scenario"
-                elif line.startswith("CONTENT:"):
-                    content_lines.append(line[8:].strip())
-                    current_field = "content"
-                elif current_field == "scenario":
-                    scenario += "\n" + line
-                elif current_field == "content":
-                    content_lines.append(line)
-
-            if kid and content_lines:
-                new_data = {
-                    "id": kid,
-                    "tags": tags,
-                    "scenario": scenario,
-                    "content": "\n".join(content_lines).strip(),
-                    "metrics": metrics,
-                    "eval": {
-                        "score": score,
-                        "helpful": 0,
-                        "harmful": 0,
-                        "helpful_history": [],
-                        "harmful_history": []
-                    },
-                    "updated_at": datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
-                }
-                new_entries.append(new_data)
-
-        if not new_entries:
-            return ToolResult(
-                title="❌ 解析失败",
-                output="解析大模型输出失败,知识库未修改",
-                error="解析失败"
-            )
-
-        # 删除旧文件
-        for p in parsed:
-            try:
-                p["file_path"].unlink()
-            except Exception as e:
-                logger.error(f"删除旧文件失败 {p['file_path']}: {e}")
-
-        # 写入新文件(统一使用 JSON 格式)
-        for data in new_entries:
-            file_path = knowledge_dir / f"{data['id']}.json"
-            with open(file_path, "w", encoding="utf-8") as f:
-                json.dump(data, f, ensure_ascii=False, indent=2)
-
-        result = f"瘦身完成:{len(parsed)} → {len(new_entries)} 条知识"
-        if report_line:
-            result += f"\n{report_line}"
-
-        print(f"[知识瘦身] {result}")
-        return ToolResult(
-            title="✅ 知识库瘦身完成",
-            output=result,
-            long_term_memory=f"知识库瘦身: {len(parsed)} → {len(new_entries)} 条"
-        )
-
-    except Exception as e:
-        logger.error(f"知识库瘦身失败: {e}")
-        return ToolResult(
-            title="❌ 瘦身失败",
-            output=f"错误: {str(e)}",
-            error=str(e)
-        )
-
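For reference, the file-based `batch_update_knowledge` tool removed above is superseded by the server-side `POST /api/knowledge/batch_update` endpoint. A minimal client-side sketch of assembling the new nested-`case` payload; field names follow the updated docs, and the exact server contract is an assumption, not verified here:

```python
import json
from datetime import datetime, timezone

def build_feedback_item(knowledge_id: str, is_helpful: bool,
                        task: str, outcome: str) -> dict:
    # One element of feedback_list, in the new nested-case shape
    # (is_helpful + case instead of the old is_effective + feedback).
    return {
        "knowledge_id": knowledge_id,
        "is_helpful": is_helpful,
        "case": {
            "task": task,
            "outcome": outcome,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    }

def build_batch_payload(items: list[dict]) -> bytes:
    # JSON body for POST /api/knowledge/batch_update.
    return json.dumps({"feedback_list": items}, ensure_ascii=False).encode("utf-8")

item = build_feedback_item("knowledge-20260305-a1b2", True,
                           "summarize a story", "success")
body = build_batch_payload([item])
```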

+ 0 - 1
api_server.py

@@ -73,7 +73,6 @@ from agent.llm import create_openrouter_llm_call
 runner = AgentRunner(
     trace_store=trace_store,
     llm_call=create_openrouter_llm_call(model="anthropic/claude-sonnet-4.5"),
-    experiences_path="./.cache/experiences.md",  # experiences file path
 )
 set_runner(runner)
 
 

+ 1 - 1
examples/analyze_story/run.py

@@ -153,7 +153,7 @@ async def show_interactive_menu(
 
 
             # 2. Structured parsing & saving (ACE Curator logic)
             if reflection_text:
-                experiences_path = runner.experiences_path or "./.cache/experiences.md"
+                # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences.md"
                 os.makedirs(os.path.dirname(experiences_path), exist_ok=True)
                 
                 # Regex match: - [intent:..., state:...] content

+ 1 - 2
examples/deep_research/run.py

@@ -177,7 +177,7 @@ async def show_interactive_menu(
             # append to the experiences file
             if reflection_text:
                 from datetime import datetime
-                experiences_path = runner.experiences_path or "./.cache/experiences_find.md"
+                # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences_find.md"
                 os.makedirs(os.path.dirname(experiences_path), exist_ok=True)
                 header = f"\n\n---\n\n## {trace_id} ({datetime.now().strftime('%Y-%m-%d %H:%M')})\n\n"
                 with open(experiences_path, "a", encoding="utf-8") as f:
@@ -319,7 +319,6 @@ async def main():
         trace_store=store,
         llm_call=create_openrouter_llm_call(model=f"anthropic/claude-{prompt.config.get('model', 'sonnet-4.5')}"),
         skills_dir=skills_dir,
-        experiences_path="./.cache/experiences_find.md",
         debug=True
     )
 
 

+ 1 - 2
examples/how/run.py

@@ -236,7 +236,7 @@ async def perform_reflection(runner: AgentRunner, store: FileSystemTraceStore, t
 
 
     # append to the experiences file
     if reflection_text:
-        experiences_path = runner.experiences_path or "./.cache/experiences_how.md"
+        # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences_how.md"
         os.makedirs(os.path.dirname(experiences_path), exist_ok=True)

         pattern = r"-\s*\[(?P<tags>.*?)\]\s*(?P<content>.*)"
@@ -337,7 +337,6 @@ async def main():
         trace_store=store,
         llm_call=create_openrouter_llm_call(model=f"anthropic/claude-{prompt.config.get('model', 'sonnet-4.5')}"),
         skills_dir=skills_dir,
-        experiences_path="./.cache/experiences_how.md",
         debug=True
     )
 
 

+ 1 - 2
examples/research/run.py

@@ -236,7 +236,7 @@ async def perform_reflection(runner: AgentRunner, store: FileSystemTraceStore, t
 
 
     # append to the experiences file
     if reflection_text:
-        experiences_path = runner.experiences_path or "./.cache/experiences_how.md"
+        # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences_how.md"
         os.makedirs(os.path.dirname(experiences_path), exist_ok=True)

         pattern = r"-\s*\[(?P<tags>.*?)\]\s*(?P<content>.*)"
@@ -337,7 +337,6 @@ async def main():
         trace_store=store,
         llm_call=create_openrouter_llm_call(model=f"anthropic/claude-{prompt.config.get('model', 'sonnet-4.5')}"),
         skills_dir=skills_dir,
-        experiences_path="./.cache/experiences_how.md",
         debug=True
     )
 
 

+ 1 - 2
examples/restore/run.py

@@ -248,7 +248,7 @@ async def perform_reflection(runner: AgentRunner, store: FileSystemTraceStore, t
 
 
     # append to the experiences file
     if reflection_text:
-        experiences_path = runner.experiences_path or "./.cache/experiences_restore.md"
+        # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences_restore.md"
         os.makedirs(os.path.dirname(experiences_path), exist_ok=True)

         pattern = r"-\s*\[(?P<tags>.*?)\]\s*(?P<content>.*)"
@@ -349,7 +349,6 @@ async def main():
         trace_store=store,
         llm_call=create_openrouter_llm_call(model=f"anthropic/claude-{prompt.config.get('model', 'sonnet-4.5')}"),
         skills_dir=skills_dir,
-        experiences_path="./.cache/experiences_restore.md",
         debug=True
     )
 
 

+ 1 - 2
examples/restore_old/run.py

@@ -248,7 +248,7 @@ async def perform_reflection(runner: AgentRunner, store: FileSystemTraceStore, t
 
 
     # append to the experiences file
     if reflection_text:
-        experiences_path = runner.experiences_path or "./.cache/experiences_restore.md"
+        # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences_restore.md"
         os.makedirs(os.path.dirname(experiences_path), exist_ok=True)

         pattern = r"-\s*\[(?P<tags>.*?)\]\s*(?P<content>.*)"
@@ -349,7 +349,6 @@ async def main():
         trace_store=store,
         llm_call=create_openrouter_llm_call(model=f"anthropic/claude-{prompt.config.get('model', 'sonnet-4.5')}"),
         skills_dir=skills_dir,
-        experiences_path="./.cache/experiences_restore.md",
         debug=True
     )
 
 

+ 1 - 1
examples/tool_research/run.py

@@ -154,7 +154,7 @@ async def show_interactive_menu(
             # append to the experiences file
             if reflection_text:
                 from datetime import datetime
-                experiences_path = runner.experiences_path or "./.cache/experiences.md"
+                # experiences_path = runner.experiences_path  # deprecated, use the knowledge system or "./.cache/experiences.md"
                 os.makedirs(os.path.dirname(experiences_path), exist_ok=True)
                 header = f"\n\n---\n\n## {trace_id} ({datetime.now().strftime('%Y-%m-%d %H:%M')})\n\n"
                 with open(experiences_path, "a", encoding="utf-8") as f:
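The example scripts above comment out the experiences-file path in favor of the knowledge system. A hypothetical sketch of how a reflection bullet could be turned into `knowledge_save` payloads instead of an experiences-file append; the regex is the one used in the examples, while the payload field names are taken from the refactor notes and the real tool signature may differ:

```python
import re

# Same bullet format the examples parse: "- [intent:x, state:y] lesson text"
PATTERN = re.compile(r"-\s*\[(?P<tags>.*?)\]\s*(?P<content>.*)")

def reflection_to_knowledge(reflection_text: str,
                            agent_id: str = "research_agent") -> list[dict]:
    entries = []
    for match in PATTERN.finditer(reflection_text):
        # Turn "intent:research, state:done" into a tags dict.
        tags = {}
        for pair in match.group("tags").split(","):
            if ":" in pair:
                key, value = pair.split(":", 1)
                tags[key.strip()] = value.strip()
        entries.append({
            "task": tags.get("intent", ""),
            "content": match.group("content").strip(),
            "types": ["strategy"],
            "tags": tags,
            "scopes": ["org:cybertogether"],
            "owner": f"agent:{agent_id}",
        })
    return entries

entries = reflection_to_knowledge("- [intent:research, state:done] prefer primary sources")
```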

+ 103 - 47
knowhub/docs/knowledge-management.md

@@ -24,35 +24,40 @@ Agent                           KnowHub Server
 
 
 ## Knowledge structure

-Data format of a single knowledge entry:
+Data format of a single knowledge entry (as defined in `agent/docs/knowledge.md`):

 ```json
 {
   "id": "knowledge-20260305-a1b2",
   "message_id": "msg-xxx",
+  "types": ["strategy", "tool"],
+  "task": "what goal to achieve in what scenario",
   "tags": {
-    "type": ["tool", "usecase", "definition", "plan", "strategy"]
+    "category": "preference",
+    "domain": "coding_style"
   },
-  "scenario": "what goal to achieve in what scenario",
+  "scopes": ["org:cybertogether"],
+  "owner": "agent:research_agent",
   "content": "core knowledge content",
   "source": {
+    "name": "resource name",
+    "category": "exp",
     "urls": ["https://example.com"],
     "agent_id": "research_agent",
-    "timestamp": "2026-03-05T12:00:00"
+    "submitted_by": "user@example.com",
+    "timestamp": "2026-03-05T12:00:00Z",
+    "message_id": "msg-xxx"
   },
   "eval": {
-    "score": 3,
-    "helpful": 0,
+    "score": 4,
+    "helpful": 5,
     "harmful": 0,
+    "confidence": 0.9,
     "helpful_history": [],
     "harmful_history": []
   },
-  "metrics": {
-    "helpful": 1,
-    "harmful": 0
-  },
-  "created_at": "2026-03-05 12:00:00",
-  "updated_at": "2026-03-05 12:00:00"
+  "created_at": "2026-03-05T12:00:00Z",
+  "updated_at": "2026-03-05T12:00:00Z"
 }
 ```
 
 
@@ -60,20 +65,33 @@ Agent                           KnowHub Server
 
 
 - **id**: unique identifier, format `knowledge-{timestamp}-{random}`
 - **message_id**: source Message ID (for tracing back to the exact message)
-- **tags.type**: knowledge type (multi-select)
+- **types**: array of knowledge types (multi-select)
+  - `user_profile`: user preferences, habits, background
+  - `strategy`: execution experience (gained from reflection)
   - `tool`: tool usage, pros and cons, code examples
   - `usecase`: user background, solution, steps, results
   - `definition`: concept definitions, technical principles, application scenarios
   - `plan`: process steps, decision points, methodology
-  - `strategy`: execution experience (gained from reflection)
-- **scenario**: task description, i.e. what scenario and what is being done
+- **task**: task description, i.e. what scenario and what is being done
+- **tags**: business tags (JSON object), e.g. `{"category": "preference", "domain": "coding_style"}`
+- **scopes**: array of visibility scopes, e.g. `["org:cybertogether"]`
+- **owner**: owner, format `agent:{agent_id}`
 - **content**: core knowledge content
-- **source.urls**: reference source links
-- **source.agent_id**: creator agent ID
-- **source.timestamp**: creation timestamp
-- **eval.score**: initial score 1-5
-- **eval.helpful/harmful**: helpful/unhelpful counts
-- **metrics.helpful/harmful**: cumulative feedback counts
+- **source**: provenance info (nested object)
+  - **name**: resource name
+  - **category**: source category (paper/exp/skill/book)
+  - **urls**: array of reference links
+  - **agent_id**: creator agent ID
+  - **submitted_by**: submitter
+  - **timestamp**: creation timestamp
+  - **message_id**: optional, for traceability
+- **eval**: evaluation info (nested object)
+  - **score**: score 1-5
+  - **helpful**: helpful count
+  - **harmful**: harmful count
+  - **confidence**: confidence 0-1
+  - **helpful_history**: history of helpful cases
+  - **harmful_history**: history of harmful cases

 ---
 
 
@@ -93,11 +111,11 @@ async def knowledge_search(
     query: str,
     top_k: int = 5,
     min_score: int = 3,
-    tags_type: Optional[List[str]] = None
+    types: Optional[List[str]] = None
 ) -> ToolResult
 ```

-Calls `GET /api/knowledge/search?q={query}&top_k={top_k}&min_score={min_score}`
+Calls `GET /api/knowledge/search?q={query}&top_k={top_k}&min_score={min_score}&types={types}`


 ### `knowledge_save`
 
 
@@ -106,11 +124,17 @@ async def knowledge_search(
 ```python
 @tool()
 async def knowledge_save(
-    scenario: str,
+    task: str,
     content: str,
-    tags_type: List[str],
+    types: List[str],
+    tags: Optional[Dict[str, str]] = None,
+    scopes: Optional[List[str]] = None,
+    owner: Optional[str] = None,
+    source_name: str = "",
+    source_category: str = "exp",
     urls: List[str] = None,
     agent_id: str = "research_agent",
+    submitted_by: str = "",
     score: int = 3,
     message_id: str = ""
 ) -> ToolResult
@@ -118,6 +142,10 @@ async def knowledge_save(
 
 
 Calls `POST /api/knowledge`

+**Defaults** (set in the agent code):
+- `scopes`: `["org:cybertogether"]`
+- `owner`: `f"agent:{agent_id}"`
+
 ### `knowledge_update`

 Updates the evaluation feedback of existing knowledge.
@@ -158,11 +186,12 @@ async def knowledge_batch_update(
 @tool()
 async def knowledge_list(
     limit: int = 10,
-    tags_type: Optional[List[str]] = None
+    types: Optional[List[str]] = None,
+    scopes: Optional[List[str]] = None
 ) -> ToolResult
 ```

-Calls `GET /api/knowledge?limit={limit}`
+Calls `GET /api/knowledge?limit={limit}&types={types}&scopes={scopes}`


 ### `knowledge_slim`
 
 
@@ -171,7 +200,7 @@ async def knowledge_list(
 ```python
 @tool()
 async def knowledge_slim(
-    model: str = "anthropic/claude-sonnet-4.5"
+    model: str = "google/gemini-2.0-flash-001"
 ) -> ToolResult
 ```
 
 
@@ -271,12 +300,12 @@ return ToolResult(
 - `q`: query text
 - `top_k`: number of results (default 5)
 - `min_score`: minimum score filter (default 3)
-- `tags_type`: filter by type (optional)
+- `types`: filter by type (optional, comma-separated)

 **Retrieval flow** (two stages, implemented server-side):

 1. **Semantic routing**: an LLM (gemini-2.0-flash-001) picks 2*k semantically relevant candidates from all knowledge
-   - Input: query + knowledge metadata (id, tags, first 100 chars of scenario)
+   - Input: query + knowledge metadata (id, types, first 100 chars of task)
    - Output: list of candidate knowledge IDs

 2. **Quality re-ranking**: computes a quality score from ratings and feedback, then selects the final k
@@ -292,12 +321,17 @@ return ToolResult(
   "results": [
     {
       "id": "knowledge-xxx",
-      "scenario": "...",
+      "task": "...",
       "content": "...",
-      "tags": {...},
-      "score": 4,
-      "quality_score": 5.0,
-      "metrics": {"helpful": 2, "harmful": 0}
+      "types": ["strategy", "tool"],
+      "tags": {"category": "preference"},
+      "eval": {
+        "score": 4,
+        "helpful": 2,
+        "harmful": 0,
+        "confidence": 0.9
+      },
+      "quality_score": 5.0
     }
   ],
   "count": 3
@@ -312,13 +346,21 @@ return ToolResult(
 
 
 ```json
 {
-  "scenario": "what goal to achieve in what scenario",
+  "task": "what goal to achieve in what scenario",
   "content": "core knowledge content",
-  "tags_type": ["tool", "strategy"],
-  "urls": ["https://example.com"],
-  "agent_id": "research_agent",
-  "score": 4,
-  "message_id": "msg-xxx"
+  "types": ["tool", "strategy"],
+  "tags": {"category": "preference", "domain": "coding_style"},
+  "scopes": ["org:cybertogether"],
+  "owner": "agent:research_agent",
+  "source": {
+    "name": "resource name",
+    "category": "exp",
+    "urls": ["https://example.com"],
+    "agent_id": "research_agent",
+    "submitted_by": "user@example.com",
+    "message_id": "msg-xxx"
+  },
+  "score": 4
 }
 ```
 
 
@@ -332,8 +374,17 @@ return ToolResult(
 
 
 ```json
 {
-  "add_helpful_case": {"case_id": "...", "scenario": "...", "result": "..."},
-  "add_harmful_case": {"case_id": "...", "scenario": "...", "result": "..."},
+  "add_helpful_case": {
+    "task": "task description",
+    "outcome": "success",
+    "timestamp": "2026-03-05T12:00:00Z"
+  },
+  "add_harmful_case": {
+    "task": "task description",
+    "outcome": "failure",
+    "reason": "reason",
+    "timestamp": "2026-03-05T12:00:00Z"
+  },
   "update_score": 4,
   "evolve_feedback": "improvement suggestion (triggers knowledge evolution)"
 }
@@ -354,8 +405,12 @@ return ToolResult(
   "feedback_list": [
     {
       "knowledge_id": "knowledge-xxx",
-      "is_effective": true,
-      "feedback": "improvement suggestion (optional)"
+      "is_helpful": true,
+      "case": {
+        "task": "task description",
+        "outcome": "success",
+        "timestamp": "2026-03-05T12:00:00Z"
+      }
     }
   ]
 }
@@ -369,7 +424,8 @@ return ToolResult(
 
 
 **Parameters**:
 - `limit`: number of results (default 10)
-- `tags_type`: filter by type (optional)
+- `types`: filter by type (optional, comma-separated)
+- `scopes`: filter by visibility scope (optional, comma-separated)

 Implemented in: `knowhub/server.py:list_knowledge`
 
 
@@ -381,7 +437,7 @@ return ToolResult(
 
 
 ```json
 {
-  "model": "anthropic/claude-sonnet-4.5"
+  "model": "google/gemini-2.0-flash-001"
 }
 ```
 
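The doc above does not spell out the formula behind `quality_score` in the two-stage retrieval. One plausible weighting, consistent with the sample response (score 4, helpful 2, harmful 0 giving 5.0) but purely an assumption about the server-side implementation:

```python
# Assumed quality weighting for the re-ranking stage; the real server
# formula is not shown in the documentation above.
def quality_score(eval_data: dict) -> float:
    score = eval_data.get("score", 3)
    helpful = eval_data.get("helpful", 0)
    harmful = eval_data.get("harmful", 0)
    # Reward helpful feedback at half weight, penalize harmful at full weight.
    return score + 0.5 * helpful - 1.0 * harmful

def rerank(candidates: list[dict], k: int) -> list[dict]:
    # Sort semantically routed candidates by quality and keep the top k.
    return sorted(candidates,
                  key=lambda c: quality_score(c.get("eval", {})),
                  reverse=True)[:k]
```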
 

+ 207 - 169
knowhub/server.py

@@ -77,25 +77,22 @@ def init_db():
         CREATE TABLE IF NOT EXISTS knowledge (
             id            TEXT PRIMARY KEY,
             message_id    TEXT DEFAULT '',
-            tags_type     TEXT NOT NULL,
-            scenario      TEXT NOT NULL,
+            types         TEXT NOT NULL,              -- JSON array: ["strategy", "tool"]
+            task          TEXT NOT NULL,
+            tags          TEXT DEFAULT '{}',          -- JSON object: {"category": "...", "domain": "..."}
+            scopes        TEXT DEFAULT '["org:cybertogether"]',  -- JSON array
+            owner         TEXT DEFAULT '',
             content       TEXT NOT NULL,
-            source_urls   TEXT DEFAULT '',
-            source_agent_id TEXT DEFAULT '',
-            source_timestamp TEXT NOT NULL,
-            eval_score    INTEGER DEFAULT 3 CHECK(eval_score BETWEEN 1 AND 5),
-            eval_helpful  INTEGER DEFAULT 0,
-            eval_harmful  INTEGER DEFAULT 0,
-            eval_helpful_history TEXT DEFAULT '[]',
-            eval_harmful_history TEXT DEFAULT '[]',
-            metrics_helpful INTEGER DEFAULT 1,
-            metrics_harmful INTEGER DEFAULT 0,
+            source        TEXT DEFAULT '{}',          -- JSON object: {name, category, urls, agent_id, submitted_by, timestamp}
+            eval          TEXT DEFAULT '{}',          -- JSON object: {score, helpful, harmful, confidence, histories}
             created_at    TEXT NOT NULL,
             updated_at    TEXT DEFAULT ''
         )
     """)
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_tags ON knowledge(tags_type)")
-    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_scenario ON knowledge(scenario)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_types ON knowledge(types)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_task ON knowledge(task)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_owner ON knowledge(owner)")
+    conn.execute("CREATE INDEX IF NOT EXISTS idx_knowledge_scopes ON knowledge(scopes)")

     conn.commit()
     conn.close()
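Since the migration above stores `types`, `scopes`, `source`, and `eval` as JSON text, a scope or score filter needs SQLite's JSON1 functions rather than plain column comparisons; note that the B-tree indexes created on these columns will not accelerate such lookups. A sketch under that assumption (query shape is illustrative, not the server's actual code):

```python
import json
import sqlite3

# Minimal table mirroring the relevant JSON columns from the migration above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE knowledge (
        id TEXT PRIMARY KEY,
        types TEXT NOT NULL,
        scopes TEXT DEFAULT '["org:cybertogether"]',
        eval TEXT DEFAULT '{}'
    )
""")
conn.execute(
    "INSERT INTO knowledge VALUES (?, ?, ?, ?)",
    ("k-1", json.dumps(["strategy"]),
     json.dumps(["org:cybertogether"]), json.dumps({"score": 4})),
)

# json_each expands the scopes array into rows; json_extract reads a key
# out of the eval object.
rows = conn.execute("""
    SELECT k.id, k.eval
    FROM knowledge AS k, json_each(k.scopes) AS s
    WHERE s.value = ?
      AND json_extract(k.eval, '$.score') >= 3
""", ("org:cybertogether",)).fetchall()
```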
@@ -156,24 +153,28 @@ class ContentIn(BaseModel):
 
 
 # Knowledge Models
 class KnowledgeIn(BaseModel):
-    scenario: str
+    task: str
     content: str
-    tags_type: list[str]
-    urls: list[str] = []
-    agent_id: str = "research_agent"
-    score: int = Field(default=3, ge=1, le=5)
+    types: list[str] = ["strategy"]
+    tags: dict = {}
+    scopes: list[str] = ["org:cybertogether"]
+    owner: str = ""
     message_id: str = ""
+    source: dict = {}  # {name, category, urls, agent_id, submitted_by, timestamp}
+    eval: dict = {}    # {score, helpful, harmful, confidence}


 class KnowledgeOut(BaseModel):
     id: str
     message_id: str
+    types: list[str]
+    task: str
     tags: dict
-    scenario: str
+    scopes: list[str]
+    owner: str
     content: str
     source: dict
     eval: dict
-    metrics: dict
     created_at: str
     updated_at: str
 
 
@@ -448,8 +449,8 @@ async def _route_knowledge_by_llm(query_text: str, metadata_list: list[dict], k:
     routing_data = [
         {
             "id": m["id"],
-            "tags": m["tags"],
-            "scenario": m["scenario"][:100]
+            "types": m["types"],
+            "task": m["task"][:100]
         } for m in metadata_list
     ]
 
@@ -485,7 +486,7 @@ async def _search_knowledge_two_stage(
     query_text: str,
     top_k: int = 5,
     min_score: int = 3,
-    tags_filter: Optional[list[str]] = None,
+    types_filter: Optional[list[str]] = None,
     conn: sqlite3.Connection = None
 ) -> list[dict]:
     """
@@ -510,38 +511,40 @@ async def _search_knowledge_two_stage(
 
         for row in rows:
             kid = row["id"]
-            tags_type = row["tags_type"].split(",") if row["tags_type"] else []
+            types = json.loads(row["types"])
 
             # filter by knowledge type
-            if tags_filter:
-                if not any(tag in tags_type for tag in tags_filter):
+            if types_filter:
+                if not any(t in types for t in types_filter):
                     continue
 
-            scenario = row["scenario"]
+            task = row["task"]
             content_text = row["content"]
+            eval_data = json.loads(row["eval"])
+            source = json.loads(row["source"])
 
             meta_item = {
                 "id": kid,
-                "tags": {"type": tags_type},
-                "scenario": scenario,
-                "score": row["eval_score"],
-                "helpful": row["metrics_helpful"],
-                "harmful": row["metrics_harmful"],
+                "types": types,
+                "task": task,
+                "score": eval_data.get("score", 3),
+                "helpful": eval_data.get("helpful", 0),
+                "harmful": eval_data.get("harmful", 0),
             }
             metadata_list.append(meta_item)
             content_map[kid] = {
-                "scenario": scenario,
+                "task": task,
                 "content": content_text,
-                "tags": {"type": tags_type},
+                "types": types,
+                "tags": json.loads(row["tags"]),
+                "scopes": json.loads(row["scopes"]),
+                "owner": row["owner"],
                 "score": meta_item["score"],
                 "helpful": meta_item["helpful"],
                 "harmful": meta_item["harmful"],
                 "message_id": row["message_id"],
-                "source": {
-                    "urls": row["source_urls"].split(",") if row["source_urls"] else [],
-                    "agent_id": row["source_agent_id"],
-                    "timestamp": row["source_timestamp"]
-                },
+                "source": source,
+                "eval": eval_data,
                 "created_at": row["created_at"],
                 "updated_at": row["updated_at"]
             }
@@ -574,16 +577,15 @@ async def _search_knowledge_two_stage(
                 scored_items.append({
                     "id": kid,
                     "message_id": item["message_id"],
-                    "scenario": item["scenario"],
-                    "content": item["content"],
+                    "types": item["types"],
+                    "task": item["task"],
                     "tags": item["tags"],
-                    "score": score,
-                    "quality_score": quality_score,
-                    "metrics": {
-                        "helpful": helpful,
-                        "harmful": harmful
-                    },
+                    "scopes": item["scopes"],
+                    "owner": item["owner"],
+                    "content": item["content"],
                     "source": item["source"],
+                    "eval": item["eval"],
+                    "quality_score": quality_score,
                     "created_at": item["created_at"],
                     "updated_at": item["updated_at"]
                 })
@@ -608,18 +610,18 @@ async def search_knowledge_api(
     q: str = Query(..., description="query text"),
     top_k: int = Query(default=5, ge=1, le=20),
     min_score: int = Query(default=3, ge=1, le=5),
-    tags_type: Optional[str] = None
+    types: Optional[str] = None
 ):
     """Search knowledge (two-stage: semantic routing + quality re-ranking)"""
     conn = get_db()
     try:
-        tags_filter = tags_type.split(",") if tags_type else None
+        types_filter = types.split(",") if types else None
 
         results = await _search_knowledge_two_stage(
             query_text=q,
             top_k=top_k,
             min_score=min_score,
-            tags_filter=tags_filter,
+            types_filter=types_filter,
             conn=conn
         )
 
@@ -641,30 +643,46 @@ def save_knowledge(knowledge: KnowledgeIn):
 
         now = datetime.now(timezone.utc).isoformat()
 
+        # set defaults
+        owner = knowledge.owner or f"agent:{knowledge.source.get('agent_id', 'unknown')}"
+
+        # build the nested source object
+        source = {
+            "name": knowledge.source.get("name", ""),
+            "category": knowledge.source.get("category", ""),
+            "urls": knowledge.source.get("urls", []),
+            "agent_id": knowledge.source.get("agent_id", "unknown"),
+            "submitted_by": knowledge.source.get("submitted_by", ""),
+            "timestamp": now,
+            "message_id": knowledge.message_id
+        }
+
+        # build the nested eval object
+        eval_data = {
+            "score": knowledge.eval.get("score", 3),
+            "helpful": knowledge.eval.get("helpful", 1),
+            "harmful": knowledge.eval.get("harmful", 0),
+            "confidence": knowledge.eval.get("confidence", 0.5),
+            "helpful_history": [],
+            "harmful_history": []
+        }
+
         conn.execute(
             """INSERT INTO knowledge
-            (id, message_id, tags_type, scenario, content,
-             source_urls, source_agent_id, source_timestamp,
-             eval_score, eval_helpful, eval_harmful,
-             eval_helpful_history, eval_harmful_history,
-             metrics_helpful, metrics_harmful, created_at, updated_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
+            (id, message_id, types, task, tags, scopes, owner, content,
+             source, eval, created_at, updated_at)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
             (
                 knowledge_id,
                 knowledge.message_id,
-                ",".join(knowledge.tags_type),
-                knowledge.scenario,
+                json.dumps(knowledge.types),
+                knowledge.task,
+                json.dumps(knowledge.tags),
+                json.dumps(knowledge.scopes),
+                owner,
                 knowledge.content,
-                ",".join(knowledge.urls),
-                knowledge.agent_id,
-                now,
-                knowledge.score,
-                0,  # eval_helpful
-                0,  # eval_harmful
-                "[]",  # eval_helpful_history
-                "[]",  # eval_harmful_history
-                1,  # metrics_helpful
-                0,  # metrics_harmful
+                json.dumps(source),
+                json.dumps(eval_data),
                 now,
                 now,
             ),
@@ -678,17 +696,26 @@ def save_knowledge(knowledge: KnowledgeIn):
 @app.get("/api/knowledge")
 def list_knowledge(
     limit: int = Query(default=10, ge=1, le=100),
-    tags_type: Optional[str] = None
+    types: Optional[str] = None,
+    scopes: Optional[str] = None
 ):
     """List knowledge entries"""
     conn = get_db()
     try:
         query = "SELECT * FROM knowledge"
         params = []
+        conditions = []
+
+        if types:
+            conditions.append("types LIKE ?")
+            params.append(f"%{types}%")
 
-        if tags_type:
-            query += " WHERE tags_type LIKE ?"
-            params.append(f"%{tags_type}%")
+        if scopes:
+            conditions.append("scopes LIKE ?")
+            params.append(f"%{scopes}%")
+
+        if conditions:
+            query += " WHERE " + " AND ".join(conditions)
 
         query += " ORDER BY created_at DESC LIMIT ?"
         params.append(limit)
@@ -700,23 +727,14 @@ def list_knowledge(
             results.append({
                 "id": row["id"],
                 "message_id": row["message_id"],
-                "tags": {"type": row["tags_type"].split(",") if row["tags_type"] else []},
-                "scenario": row["scenario"],
+                "types": json.loads(row["types"]),
+                "task": row["task"],
+                "tags": json.loads(row["tags"]),
+                "scopes": json.loads(row["scopes"]),
+                "owner": row["owner"],
                 "content": row["content"],
-                "source": {
-                    "urls": row["source_urls"].split(",") if row["source_urls"] else [],
-                    "agent_id": row["source_agent_id"],
-                    "timestamp": row["source_timestamp"]
-                },
-                "eval": {
-                    "score": row["eval_score"],
-                    "helpful": row["eval_helpful"],
-                    "harmful": row["eval_harmful"]
-                },
-                "metrics": {
-                    "helpful": row["metrics_helpful"],
-                    "harmful": row["metrics_harmful"]
-                },
+                "source": json.loads(row["source"]),
+                "eval": json.loads(row["eval"]),
                 "created_at": row["created_at"],
                 "updated_at": row["updated_at"]
             })
@@ -742,25 +760,14 @@ def get_knowledge(knowledge_id: str):
         return {
             "id": row["id"],
             "message_id": row["message_id"],
-            "tags": {"type": row["tags_type"].split(",") if row["tags_type"] else []},
-            "scenario": row["scenario"],
+            "types": json.loads(row["types"]),
+            "task": row["task"],
+            "tags": json.loads(row["tags"]),
+            "scopes": json.loads(row["scopes"]),
+            "owner": row["owner"],
             "content": row["content"],
-            "source": {
-                "urls": row["source_urls"].split(",") if row["source_urls"] else [],
-                "agent_id": row["source_agent_id"],
-                "timestamp": row["source_timestamp"]
-            },
-            "eval": {
-                "score": row["eval_score"],
-                "helpful": row["eval_helpful"],
-                "harmful": row["eval_harmful"],
-                "helpful_history": [],
-                "harmful_history": []
-            },
-            "metrics": {
-                "helpful": row["metrics_helpful"],
-                "harmful": row["metrics_harmful"]
-            },
+            "source": json.loads(row["source"]),
+            "eval": json.loads(row["eval"]),
             "created_at": row["created_at"],
             "updated_at": row["updated_at"]
         }
@@ -808,33 +815,37 @@ async def update_knowledge(knowledge_id: str, update: KnowledgeUpdateIn):
             raise HTTPException(status_code=404, detail=f"Knowledge not found: {knowledge_id}")
 
         now = datetime.now(timezone.utc).isoformat()
-        updates = {"updated_at": now}
+        eval_data = json.loads(row["eval"])
 
+        # update the score
         if update.update_score is not None:
-            updates["eval_score"] = update.update_score
+            eval_data["score"] = update.update_score
 
+        # record a helpful case
         if update.add_helpful_case:
-            helpful_history = json.loads(row["eval_helpful_history"] or "[]")
-            helpful_history.append(update.add_helpful_case)
-            updates["eval_helpful"] = row["eval_helpful"] + 1
-            updates["eval_helpful_history"] = json.dumps(helpful_history, ensure_ascii=False)
-            updates["metrics_helpful"] = row["metrics_helpful"] + 1
+            eval_data["helpful"] = eval_data.get("helpful", 0) + 1
+            if "helpful_history" not in eval_data:
+                eval_data["helpful_history"] = []
+            eval_data["helpful_history"].append(update.add_helpful_case)
 
+        # record a harmful case
         if update.add_harmful_case:
-            harmful_history = json.loads(row["eval_harmful_history"] or "[]")
-            harmful_history.append(update.add_harmful_case)
-            updates["eval_harmful"] = row["eval_harmful"] + 1
-            updates["eval_harmful_history"] = json.dumps(harmful_history, ensure_ascii=False)
-            updates["metrics_harmful"] = row["metrics_harmful"] + 1
+            eval_data["harmful"] = eval_data.get("harmful", 0) + 1
+            if "harmful_history" not in eval_data:
+                eval_data["harmful_history"] = []
+            eval_data["harmful_history"].append(update.add_harmful_case)
 
+        # knowledge evolution
+        content = row["content"]
         if update.evolve_feedback:
-            evolved_content = await _evolve_knowledge_with_llm(row["content"], update.evolve_feedback)
-            updates["content"] = evolved_content
-            updates["metrics_helpful"] = updates.get("metrics_helpful", row["metrics_helpful"]) + 1
+            content = await _evolve_knowledge_with_llm(content, update.evolve_feedback)
+            eval_data["helpful"] = eval_data.get("helpful", 0) + 1
 
-        set_clause = ", ".join(f"{k} = ?" for k in updates)
-        values = list(updates.values()) + [knowledge_id]
-        conn.execute(f"UPDATE knowledge SET {set_clause} WHERE id = ?", values)
+        # persist to the database
+        conn.execute(
+            "UPDATE knowledge SET content = ?, eval = ?, updated_at = ? WHERE id = ?",
+            (content, json.dumps(eval_data, ensure_ascii=False), now, knowledge_id)
+        )
         conn.commit()
 
         return {"status": "ok", "knowledge_id": knowledge_id}
@@ -851,8 +862,8 @@ async def batch_update_knowledge(batch: KnowledgeBatchUpdateIn):
     conn = get_db()
     try:
         # handle entries that need no evolution first; collect the ones that do
-        evolution_tasks = []   # [(knowledge_id, old_content, feedback)]
-        simple_updates = []    # [(knowledge_id, is_effective)]
+        evolution_tasks = []   # [(knowledge_id, old_content, feedback, eval_data)]
+        simple_updates = []    # [(knowledge_id, is_effective, eval_data)]
 
         for item in batch.feedback_list:
             knowledge_id = item.get("knowledge_id")
@@ -866,24 +877,25 @@ async def batch_update_knowledge(batch: KnowledgeBatchUpdateIn):
             if not row:
                 continue
 
+            eval_data = json.loads(row["eval"])
+
             if is_effective and feedback:
-                evolution_tasks.append((knowledge_id, row["content"], feedback, row["metrics_helpful"]))
+                evolution_tasks.append((knowledge_id, row["content"], feedback, eval_data))
             else:
-                simple_updates.append((knowledge_id, is_effective, row["metrics_helpful"], row["metrics_harmful"]))
+                simple_updates.append((knowledge_id, is_effective, eval_data))
 
         # apply the simple updates
         now = datetime.now(timezone.utc).isoformat()
-        for knowledge_id, is_effective, cur_helpful, cur_harmful in simple_updates:
+        for knowledge_id, is_effective, eval_data in simple_updates:
             if is_effective:
-                conn.execute(
-                    "UPDATE knowledge SET metrics_helpful = ?, updated_at = ? WHERE id = ?",
-                    (cur_helpful + 1, now, knowledge_id)
-                )
+                eval_data["helpful"] = eval_data.get("helpful", 0) + 1
             else:
-                conn.execute(
-                    "UPDATE knowledge SET metrics_harmful = ?, updated_at = ? WHERE id = ?",
-                    (cur_harmful + 1, now, knowledge_id)
-                )
+                eval_data["harmful"] = eval_data.get("harmful", 0) + 1
+
+            conn.execute(
+                "UPDATE knowledge SET eval = ?, updated_at = ? WHERE id = ?",
+                (json.dumps(eval_data, ensure_ascii=False), now, knowledge_id)
+            )
 
         # run knowledge evolution concurrently
         if evolution_tasks:
@@ -891,10 +903,11 @@ async def batch_update_knowledge(batch: KnowledgeBatchUpdateIn):
             evolved_results = await asyncio.gather(
                 *[_evolve_knowledge_with_llm(old, fb) for _, old, fb, _ in evolution_tasks]
             )
-            for (knowledge_id, _, _, cur_helpful), evolved_content in zip(evolution_tasks, evolved_results):
+            for (knowledge_id, _, _, eval_data), evolved_content in zip(evolution_tasks, evolved_results):
+                eval_data["helpful"] = eval_data.get("helpful", 0) + 1
                 conn.execute(
-                    "UPDATE knowledge SET content = ?, metrics_helpful = ?, updated_at = ? WHERE id = ?",
-                    (evolved_content, cur_helpful + 1, now, knowledge_id)
+                    "UPDATE knowledge SET content = ?, eval = ?, updated_at = ? WHERE id = ?",
+                    (evolved_content, json.dumps(eval_data, ensure_ascii=False), now, knowledge_id)
                 )
 
         conn.commit()
@@ -904,27 +917,29 @@ async def batch_update_knowledge(batch: KnowledgeBatchUpdateIn):
 
 
 @app.post("/api/knowledge/slim")
-async def slim_knowledge(model: str = "anthropic/claude-sonnet-4-5"):
+async def slim_knowledge(model: str = "google/gemini-2.0-flash-001"):
     """Slim the knowledge base: merge semantically similar entries"""
     conn = get_db()
     try:
-        rows = conn.execute("SELECT * FROM knowledge ORDER BY metrics_helpful DESC").fetchall()
+        rows = conn.execute("SELECT * FROM knowledge").fetchall()
         if len(rows) < 2:
             return {"status": "ok", "message": f"知识库仅有 {len(rows)} 条,无需瘦身"}
 
         # build the entry list sent to the LLM
         entries_text = ""
         for row in rows:
-            entries_text += f"[ID: {row['id']}] [Tags: {row['tags_type']}] "
-            entries_text += f"[Helpful: {row['metrics_helpful']}, Harmful: {row['metrics_harmful']}] [Score: {row['eval_score']}]\n"
-            entries_text += f"Scenario: {row['scenario']}\n"
+            eval_data = json.loads(row["eval"])
+            types = json.loads(row["types"])
+            entries_text += f"[ID: {row['id']}] [Types: {','.join(types)}] "
+            entries_text += f"[Helpful: {eval_data.get('helpful', 0)}, Harmful: {eval_data.get('harmful', 0)}] [Score: {eval_data.get('score', 3)}]\n"
+            entries_text += f"Task: {row['task']}\n"
             entries_text += f"Content: {row['content'][:200]}...\n\n"
 
         prompt = f"""你是一个 AI Agent 知识库管理员。以下是当前知识库的全部条目,请执行瘦身操作:
 
 【任务】:
 1. 识别语义高度相似或重复的知识,将它们合并为一条更精炼、更通用的知识。
-2. 合并时保留 helpful 最高的那条的 ID(metrics_helpful 取各条之和)。
+2. 合并时保留 helpful 最高的那条的 ID(helpful 取各条之和)。
 3. 对于独立的、无重复的知识,保持原样不动。
 
 【当前知识库】:
@@ -933,11 +948,11 @@ async def slim_knowledge(model: str = "anthropic/claude-sonnet-4-5"):
 【输出格式要求】:
 严格按以下格式输出每条知识,条目之间用 === 分隔:
 ID: <保留的id>
-TAGS: <逗号分隔的type列表>
+TYPES: <逗号分隔的type列表>
 HELPFUL: <合并后的helpful计数>
 HARMFUL: <合并后的harmful计数>
 SCORE: <评分>
-SCENARIO: <场景描述>
+TASK: <任务描述>
 CONTENT: <合并后的知识内容>
 ===
 
@@ -966,15 +981,16 @@ REPORT: 原有 X 条,合并后 Y 条,精简了 Z 条。
                 continue
 
             lines = block.split("\n")
-            kid, tags, helpful, harmful, score, scenario, content_lines = None, "", 0, 0, 3, "", []
+            kid, types, helpful, harmful, score, task, content_lines = None, [], 0, 0, 3, "", []
             current_field = None
 
             for line in lines:
                 if line.startswith("ID:"):
                     kid = line[3:].strip()
                     current_field = None
-                elif line.startswith("TAGS:"):
-                    tags = line[5:].strip()
+                elif line.startswith("TYPES:"):
+                    types_str = line[6:].strip()
+                    types = [t.strip() for t in types_str.split(",") if t.strip()]
                     current_field = None
                 elif line.startswith("HELPFUL:"):
                     try:
@@ -994,25 +1010,25 @@ REPORT: 原有 X 条,合并后 Y 条,精简了 Z 条。
                     except Exception:
                         score = 3
                     current_field = None
-                elif line.startswith("SCENARIO:"):
-                    scenario = line[9:].strip()
-                    current_field = "scenario"
+                elif line.startswith("TASK:"):
+                    task = line[5:].strip()
+                    current_field = "task"
                 elif line.startswith("CONTENT:"):
                     content_lines.append(line[8:].strip())
                     current_field = "content"
-                elif current_field == "scenario":
-                    scenario += "\n" + line
+                elif current_field == "task":
+                    task += "\n" + line
                 elif current_field == "content":
                     content_lines.append(line)
 
             if kid and content_lines:
                 new_entries.append({
                     "id": kid,
-                    "tags": tags,
+                    "types": types if types else ["strategy"],
                     "helpful": helpful,
                     "harmful": harmful,
                     "score": score,
-                    "scenario": scenario.strip(),
+                    "task": task.strip(),
                     "content": "\n".join(content_lines).strip()
                 })
 
@@ -1023,18 +1039,40 @@ REPORT: 原有 X 条,合并后 Y 条,精简了 Z 条。
         now = datetime.now(timezone.utc).isoformat()
         conn.execute("DELETE FROM knowledge")
         for e in new_entries:
+            eval_data = {
+                "score": e["score"],
+                "helpful": e["helpful"],
+                "harmful": e["harmful"],
+                "confidence": 0.9,
+                "helpful_history": [],
+                "harmful_history": []
+            }
+            source = {
+                "name": "slim",
+                "category": "exp",
+                "urls": [],
+                "agent_id": "slim",
+                "submitted_by": "system",
+                "timestamp": now
+            }
             conn.execute(
                 """INSERT INTO knowledge
-                (id, message_id, tags_type, scenario, content,
-                 source_urls, source_agent_id, source_timestamp,
-                 eval_score, eval_helpful, eval_harmful,
-                 eval_helpful_history, eval_harmful_history,
-                 metrics_helpful, metrics_harmful, created_at, updated_at)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
-                (e["id"], "", e["tags"], e["scenario"], e["content"],
-                 "", "slim", now,
-                 e["score"], 0, 0, "[]", "[]",
-                 e["helpful"], e["harmful"], now, now)
+                (id, message_id, types, task, tags, scopes, owner, content, source, eval, created_at, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
+                (
+                    e["id"],
+                    "",
+                    json.dumps(e["types"]),
+                    e["task"],
+                    json.dumps({}),
+                    json.dumps(["org:cybertogether"]),
+                    "agent:slim",
+                    e["content"],
+                    json.dumps(source, ensure_ascii=False),
+                    json.dumps(eval_data, ensure_ascii=False),
+                    now,
+                    now
+                )
             )
         conn.commit()
 

+ 288 - 0
knowhub/skill/SKILL.md

@@ -0,0 +1,288 @@
+---
+name: knowhub
+description: When a problem or complex task exceeds current capabilities, look up usage experience for tools and resources, and report back your own experience
+category: tool-discovery
+scope: agent:*
+---
+
+# KnowHub Knowledge Management
+
+KnowHub provides a unified knowledge base for sharing and reusing experience across agents.
+
+## When to Use
+
+**Query knowledge**:
+- You need an external tool/library but are unsure which to pick
+- You face a complex task and want to know what tools others used and how they did it
+- Repeated attempts have failed; someone may have hit the same pitfall
+- You may need external resources (papers, docs, blog posts, etc.)
+
+**Save knowledge** (important!):
+- ✅ **Submit right after using a resource** — submit as soon as the task is done and you have results; don't wait until the context grows too long to remember
+- ✅ **Follow up after user feedback** — if the user points out problems or makes suggestions, update or extend the entry as appropriate
+- ✅ **Findings from the search process** — which platform you found it on, which keywords you used, which strategies worked: all worth submitting
+
+## Setup
+
+```bash
+pip install httpx python-dotenv
+export KNOWHUB_API=http://localhost:8000
+```
+
+## Quick Start
+
+```bash
+# search knowledge
+python -m knowhub.cli search "Python async programming"
+
+# save knowledge
+python -m knowhub.cli save \
+  --scenario "FastAPI parameter validation" \
+  --content "Use Query(...) to define query parameters with validation" \
+  --tags solution \
+  --score 4
+
+# list knowledge
+python -m knowhub.cli list --limit 10
+
+# feedback
+python -m knowhub.cli batch-update --id knowledge-xxx --effective
+```
+
+## Workflow
+
+### 1. Query knowledge
+
+```bash
+python -m knowhub.cli search "task description" [--top-k 5] [--min-score 3] [--tags solution]
+```
+
+Results are ranked by quality score: `quality_score = score + helpful - (harmful * 2.0)`
+
+**Usable results found**: use them directly as reference
+
+**No usable results**: move to step 2 and search on your own
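The ranking rule above can be sketched in Python. This is a minimal illustration with made-up entries; it assumes each search result carries an `eval` object with `score`/`helpful`/`harmful` fields, matching the response shape described in this document:

```python
def quality_score(entry: dict) -> float:
    # quality_score = score + helpful - (harmful * 2.0)
    ev = entry.get("eval", {})
    return ev.get("score", 3) + ev.get("helpful", 0) - ev.get("harmful", 0) * 2.0

# hypothetical search results
results = [
    {"id": "knowledge-a", "eval": {"score": 4, "helpful": 5, "harmful": 0}},
    {"id": "knowledge-b", "eval": {"score": 5, "helpful": 1, "harmful": 2}},
]

# higher quality first; harmful feedback is penalized twice as hard
ranked = sorted(results, key=quality_score, reverse=True)
print([r["id"] for r in ranked])  # → ['knowledge-a', 'knowledge-b']
```

Note the asymmetry: a single harmful report outweighs two helpful ones, so entries that misled agents sink quickly.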
+
+### 2. Search on your own (only when no experience exists)
+
+Pick an entry point based on what you already know:
+
+- **You know the tool name**: go straight to the matching platform (PyPI / npmjs / GitHub) for docs and community feedback
+  - Search: `<tool> review`, `<tool> vs`, `site:reddit.com <tool>`
+- **You only know the task, not the tool**: first search communities with the task description (Google, Reddit, GitHub Discussions) to see how others solved it, then confirm candidates on the matching platform
+- **Looking for MCP servers / Skills**: Smithery (`npx @smithery/cli search <keyword>`), Glama.ai, awesome-mcp-servers
+- **Looking for libraries/packages**: PyPI, npmjs.com
+- **Looking for one-stop integrations**: Composio (850+ tools), LangChain Tools
+
+When evaluating, look at community activity, engineering quality, sustainability, and license compatibility.
+
+If writing simple logic yourself is faster than pulling in a dependency, just write it.
+
+### 3. Save knowledge
+
+⚠️ **Important: submit right after finishing the task — don't put it off until the context grows too long to remember!**
+
+```bash
+python -m knowhub.cli save \
+  --scenario "specific scenario description" \
+  --content "detailed knowledge content" \
+  --tags "solution,best-practice" \
+  --urls "https://docs.example.com" \
+  --score 4 \
+  --agent-id "my_agent" \
+  --message-id "msg-001"
+```
+
+**Tag types**:
+- `solution` - solution to a problem
+- `best-practice` - best practice
+- `pitfall` - common pitfall / caveat
+- `comparison` - tool/approach comparison
+- `strategy` - strategy / methodology
+- `resource` - resource recommendation
+
+**Scoring scale** (1-5):
+- 5: extremely useful, solved a key problem
+- 4: useful, provided valuable information
+- 3: average, possibly worth referencing
+- 2: limited value
+- 1: almost useless
+
+**Two kinds of experience to distinguish**:
+
+1. **Experience with the resource itself** — submit as knowledge about that resource
+   - e.g. experience extracting PDF tables with pymupdf → scenario: "pymupdf: PDF table extraction"
+
+2. **Experience with a search platform/strategy** — submit as knowledge about the platform or the search strategy
+   - an assessment of the platform itself → scenario: "Smithery: searching for MCP servers"
+   - strategies/methodology for finding tools or resources → scenario: "tool discovery strategy: searching on Reddit"
+
+## Command reference
+
+### Update knowledge
+```bash
+python -m knowhub.skill.cli update knowledge-xxx \
+  --score 5 \
+  --helpful-case "the approach worked in production" \
+  --evolve-feedback "verified after a second use"
+```
+
+### Batch feedback
+```bash
+# Batch feedback (from a file)
+python -m knowhub.skill.cli batch-update --file feedback.json
+```
+
+`feedback.json` format:
+```json
+[
+  {
+    "knowledge_id": "knowledge-xxx",
+    "is_effective": true,
+    "feedback": "very useful"
+  }
+]
+```
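The feedback file can also be generated programmatically; a minimal sketch (the second entry, `knowledge-yyy`, is a hypothetical example):

```python
import json

# Build the feedback list in code instead of writing JSON by hand.
feedback_list = [
    {"knowledge_id": "knowledge-xxx", "is_effective": True, "feedback": "very useful"},
    {"knowledge_id": "knowledge-yyy", "is_effective": False, "feedback": "did not apply"},
]

# ensure_ascii=False keeps any non-ASCII feedback text readable in the file.
with open("feedback.json", "w", encoding="utf-8") as f:
    json.dump(feedback_list, f, ensure_ascii=False, indent=2)
```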
+
+## HTTP API
+
+For custom integrations, call the HTTP API directly:
+
+**Search**:
+```bash
+curl "http://localhost:8000/api/knowledge/search?q=query&top_k=5&min_score=3"
+```
+
+**Save**:
+```bash
+curl -X POST http://localhost:8000/api/knowledge \
+  -H "Content-Type: application/json" \
+  -d '{
+    "message_id": "msg-001",
+    "types": ["solution"],
+    "task": "scenario description",
+    "content": "knowledge content",
+    "source": {
+      "urls": ["https://example.com"],
+      "agent_id": "my_agent"
+    },
+    "eval": {"score": 4}
+  }'
+```
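For scripting, the payload can be assembled and sanity-checked before POSTing. A minimal sketch (the helper name and the required-field check are illustrative, not part of the API):

```python
def build_knowledge_payload(task: str, content: str, types=("strategy",),
                            urls=(), agent_id="my_agent", message_id="msg-001"):
    # Mirror the nested structure used by POST /api/knowledge.
    payload = {
        "message_id": message_id,
        "types": list(types),
        "task": task,
        "content": content,
        "source": {"urls": list(urls), "agent_id": agent_id},
        "eval": {"score": 4},
    }
    # Reject obviously incomplete submissions before hitting the API.
    missing = [k for k in ("task", "content") if not payload[k]]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return payload
```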
+
+**Update**:
+```bash
+curl -X PUT http://localhost:8000/api/knowledge/knowledge-xxx \
+  -H "Content-Type: application/json" \
+  -d '{"content": "updated content", "update_score": 5}'
+```
+
+**Batch feedback**:
+```bash
+curl -X POST http://localhost:8000/api/knowledge/batch_update \
+  -H "Content-Type: application/json" \
+  -d '{
+    "feedback_list": [{
+      "knowledge_id": "knowledge-xxx",
+      "is_effective": true,
+      "feedback": "very useful"
+    }]
+  }'
+```
+
+## Best practices
+
+### When to save
+
+✅ **Do save**:
+- You solved a concrete problem
+- You discovered the best way to use a tool
+- You hit a pitfall and found the fix
+- You compared multiple approaches and reached a conclusion
+- You found a trick not covered in the docs
+- You found a useful resource or platform
+- You distilled an effective search strategy
+
+❌ **Don't save**:
+- Pure documentation excerpts (no hands-on experience)
+- Overly generic statements ("this tool works well")
+- Unverified guesses
+- Duplicates of knowledge that already exists
+
+### How to write good knowledge
+
+**task**:
+- ✅ "FastAPI parameter validation: custom error messages"
+- ✅ "Python async programming: running 100+ concurrent HTTP requests"
+- ✅ "Tool discovery: searching Reddit for Python libraries"
+- ❌ "Using FastAPI" (too broad)
+- ❌ "Parameter validation" (missing context)
+
+**content**:
+- ✅ Concrete and actionable: "Use Query(..., ge=0) to enforce a minimum value"
+- ✅ Includes key details: "Manage the session with async with, or connections will leak"
+- ✅ States applicability: "Applies to FastAPI 0.100+; older versions work differently"
+- ✅ Search strategy: "Searching 'python pdf library' on Reddit surfaces hands-on experience more easily than Google"
+- ❌ Vague: "this method works well"
+- ❌ Overly long: copy-pasting large chunks of documentation
+
+**types**:
+- Pick the 1-3 most relevant types
+- `solution` - a solution to a problem
+- `best-practice` - a recommended approach
+- `pitfall` - a trap to avoid
+- `comparison` - an approach comparison
+- `strategy` - a strategy/methodology (including search strategies)
+- `resource` - a resource recommendation
+
+## Examples
+
+### Saving a solution
+```bash
+python -m knowhub.skill.cli save \
+  --task "asyncio.gather() error handling" \
+  --content "By default, one failing coroutine makes gather() raise. Use return_exceptions=True to let the other coroutines keep running." \
+  --types pitfall \
+  --score 5
+```
+
+### Saving a comparison
+```bash
+python -m knowhub.skill.cli save \
+  --task "PDF table extraction: pymupdf vs pdfplumber" \
+  --content "pymupdf is fast but mediocre at table recognition; pdfplumber recognizes tables accurately but is slow. Use pdfplumber for complex tables." \
+  --types comparison \
+  --score 4
+```
+
+### Saving a search strategy
+```bash
+python -m knowhub.skill.cli save \
+  --task "Tool discovery strategy: searching Reddit for hands-on experience" \
+  --content "Searching 'site:reddit.com python <task description>' surfaces real-world experience and pitfalls more easily than plain Google. Watch subreddits like r/Python and r/learnpython." \
+  --types strategy \
+  --score 4
+```
+
+### Saving a platform review
+```bash
+python -m knowhub.skill.cli save \
+  --task "Smithery: searching for MCP servers" \
+  --content "Smithery is good for quickly finding MCP servers, but descriptions are terse; check the GitHub repo to confirm features. Results are ranked by relevance and quality is decent." \
+  --types resource \
+  --urls "https://smithery.ai" \
+  --score 4
+```
+
+## Important reminders
+
+1. **Search proactively** - search for relevant knowledge before starting a new task to avoid repeating known pitfalls
+2. **Save promptly** - save right after finishing a task, before the details fade
+3. **Quality first** - save less, but keep quality high; one high-quality entry beats ten low-quality ones
+4. **Keep giving feedback** - give feedback after using knowledge to improve the knowledge base
+5. **Avoid duplicates** - search before saving to avoid duplicating existing knowledge
+6. **Record strategies** - record not only tool usage experience but also the strategies used to find the tools

+ 280 - 0
knowhub/skill/cli.py

@@ -0,0 +1,280 @@
+#!/usr/bin/env python3
+"""
+KnowHub CLI - knowledge management command-line tool
+
+Usage:
+    python -m knowhub.skill.cli search "query text"
+    python -m knowhub.skill.cli save --task "task" --content "content" --types strategy
+    python -m knowhub.skill.cli list --limit 10
+"""
+
+import os
+import sys
+import json
+import argparse
+
+try:
+    import httpx
+except ImportError:
+    print("Error: the httpx library is required")
+    print("Run: pip install httpx")
+    sys.exit(1)
+
+
+def get_api_base() -> str:
+    """Return the API base URL."""
+    return os.getenv("KNOWHUB_API", "http://localhost:8000")
+
+
+def search_knowledge(args):
+    """Search knowledge."""
+    url = f"{get_api_base()}/api/knowledge/search"
+    params = {
+        "q": args.query,
+        "top_k": args.top_k,
+        "min_score": args.min_score,
+    }
+    if args.types:
+        params["types"] = args.types
+
+    try:
+        response = httpx.get(url, params=params, timeout=30.0)
+        response.raise_for_status()
+        data = response.json()
+
+        if data["count"] == 0:
+            print("No matching knowledge found")
+            return
+
+        print(f"Found {data['count']} knowledge entries:\n")
+        for i, item in enumerate(data["results"], 1):
+            print(f"[{i}] {item['task']}")
+            print(f"    ID: {item['id']}")
+            eval_data = item.get("eval", {})
+            print(f"    Score: {eval_data.get('score', 3)} | Quality: {item.get('quality_score', 'N/A')}")
+            print(f"    Types: {', '.join(item.get('types', []))}")
+            print(f"    Content: {item['content'][:100]}...")
+            print()
+
+    except httpx.HTTPError as e:
+        print(f"Request failed: {e}")
+        sys.exit(1)
+
+
+def save_knowledge(args):
+    """Save a knowledge entry."""
+    url = f"{get_api_base()}/api/knowledge"
+
+    data = {
+        "message_id": args.message_id or f"cli-{os.getpid()}",
+        "types": args.types.split(",") if args.types else ["strategy"],
+        "task": args.task,
+        "tags": json.loads(args.tags) if args.tags else {},
+        "scopes": args.scopes.split(",") if args.scopes else ["org:cybertogether"],
+        "owner": args.owner or "agent:cli",
+        "content": args.content,
+        "source": {
+            "name": args.source_name or "cli",
+            "category": args.source_category or "exp",
+            "urls": args.urls.split(",") if args.urls else [],
+            "agent_id": args.agent_id or "cli",
+            "submitted_by": args.submitted_by or "cli-user",
+        },
+        "eval": {
+            "score": args.score,
+            "helpful": 1,
+            "harmful": 0,
+            "confidence": 0.5,
+        }
+    }
+
+    try:
+        response = httpx.post(url, json=data, timeout=30.0)
+        response.raise_for_status()
+        result = response.json()
+        print(f"✅ Knowledge saved: {result['knowledge_id']}")
+
+    except httpx.HTTPError as e:
+        print(f"Save failed: {e}")
+        sys.exit(1)
+
+
+def update_knowledge(args):
+    """Update a knowledge entry."""
+    url = f"{get_api_base()}/api/knowledge/{args.id}"
+
+    data = {}
+    if args.score:
+        data["update_score"] = args.score
+    if args.helpful_case:
+        data["add_helpful_case"] = args.helpful_case
+    if args.harmful_case:
+        data["add_harmful_case"] = args.harmful_case
+    if args.evolve_feedback:
+        data["evolve_feedback"] = args.evolve_feedback
+
+    if not data:
+        print("Error: at least one update parameter is required")
+        sys.exit(1)
+
+    try:
+        response = httpx.put(url, json=data, timeout=30.0)
+        response.raise_for_status()
+        print(f"✅ Knowledge updated: {args.id}")
+
+    except httpx.HTTPError as e:
+        print(f"Update failed: {e}")
+        sys.exit(1)
+
+
+def batch_update_knowledge(args):
+    """Batch-update knowledge entries."""
+    url = f"{get_api_base()}/api/knowledge/batch_update"
+
+    # Read the feedback list from a file
+    if args.file:
+        with open(args.file, 'r') as f:
+            feedback_list = json.load(f)
+    else:
+        print("Error: the --file parameter is required")
+        sys.exit(1)
+
+    data = {"feedback_list": feedback_list}
+
+    try:
+        response = httpx.post(url, json=data, timeout=60.0)
+        response.raise_for_status()
+        result = response.json()
+        print(f"✅ Batch update complete: {result['updated']} entries")
+
+    except httpx.HTTPError as e:
+        print(f"Batch update failed: {e}")
+        sys.exit(1)
+
+
+def list_knowledge(args):
+    """List knowledge entries."""
+    url = f"{get_api_base()}/api/knowledge"
+    params = {"limit": args.limit}
+    if args.types:
+        params["types"] = args.types
+    if args.scopes:
+        params["scopes"] = args.scopes
+
+    try:
+        response = httpx.get(url, params=params, timeout=30.0)
+        response.raise_for_status()
+        data = response.json()
+
+        if data["count"] == 0:
+            print("The knowledge base is empty")
+            return
+
+        print(f"{data['count']} knowledge entries total:\n")
+        for i, item in enumerate(data["results"], 1):
+            print(f"[{i}] {item['task']}")
+            print(f"    ID: {item['id']}")
+            eval_data = item.get("eval", {})
+            print(f"    Score: {eval_data.get('score', 3)} | Helpful: {eval_data.get('helpful', 0)} | Harmful: {eval_data.get('harmful', 0)}")
+            print(f"    Types: {', '.join(item.get('types', []))}")
+            print(f"    Owner: {item.get('owner', 'N/A')}")
+            print()
+
+    except httpx.HTTPError as e:
+        print(f"Request failed: {e}")
+        sys.exit(1)
+
+
+def slim_knowledge(args):
+    """Slim down the knowledge base."""
+    url = f"{get_api_base()}/api/knowledge/slim"
+    params = {"model": args.model}
+
+    try:
+        print("Running knowledge slimming; this may take a while...")
+        response = httpx.post(url, params=params, timeout=120.0)
+        response.raise_for_status()
+        result = response.json()
+        print(f"✅ Slimming complete: {result['before']} → {result['after']} entries")
+        if result.get("report"):
+            print(f"   {result['report']}")
+
+    except httpx.HTTPError as e:
+        print(f"Slimming failed: {e}")
+        sys.exit(1)
+
+
+def main():
+    parser = argparse.ArgumentParser(description="KnowHub CLI - knowledge management tool")
+    subparsers = parser.add_subparsers(dest="command", help="available commands")
+
+    # search command
+    search_parser = subparsers.add_parser("search", help="search knowledge")
+    search_parser.add_argument("query", help="query text")
+    search_parser.add_argument("--top-k", type=int, default=5, help="number of results to return")
+    search_parser.add_argument("--min-score", type=int, default=3, help="minimum score")
+    search_parser.add_argument("--types", help="type filter (comma-separated)")
+
+    # save command
+    save_parser = subparsers.add_parser("save", help="save knowledge")
+    save_parser.add_argument("--task", required=True, help="task description")
+    save_parser.add_argument("--content", required=True, help="knowledge content")
+    save_parser.add_argument("--types", default="strategy", help="types (comma-separated)")
+    save_parser.add_argument("--tags", help="tags (JSON object)")
+    save_parser.add_argument("--scopes", help="visibility scopes (comma-separated)")
+    save_parser.add_argument("--owner", help="owner")
+    save_parser.add_argument("--source-name", help="source name")
+    save_parser.add_argument("--source-category", help="source category")
+    save_parser.add_argument("--urls", help="related URLs (comma-separated)")
+    save_parser.add_argument("--agent-id", help="agent ID")
+    save_parser.add_argument("--submitted-by", help="submitter")
+    save_parser.add_argument("--message-id", help="message ID")
+    save_parser.add_argument("--score", type=int, default=3, help="score (1-5)")
+
+    # update command
+    update_parser = subparsers.add_parser("update", help="update knowledge")
+    update_parser.add_argument("id", help="knowledge ID")
+    update_parser.add_argument("--score", type=int, help="new score")
+    update_parser.add_argument("--helpful-case", help="add a helpful case")
+    update_parser.add_argument("--harmful-case", help="add a harmful case")
+    update_parser.add_argument("--evolve-feedback", help="knowledge evolution feedback")
+
+    # batch-update command
+    batch_parser = subparsers.add_parser("batch-update", help="batch-update knowledge")
+    batch_parser.add_argument("--file", required=True, help="JSON file containing the feedback list")
+
+    # list command
+    list_parser = subparsers.add_parser("list", help="list knowledge")
+    list_parser.add_argument("--limit", type=int, default=10, help="number to return")
+    list_parser.add_argument("--types", help="type filter")
+    list_parser.add_argument("--scopes", help="scope filter")
+
+    # slim command
+    slim_parser = subparsers.add_parser("slim", help="slim the knowledge base")
+    slim_parser.add_argument("--model", default="google/gemini-2.0-flash-001", help="model to use")
+
+    args = parser.parse_args()
+
+    if not args.command:
+        parser.print_help()
+        sys.exit(1)
+
+    # Dispatch to the selected command
+    if args.command == "search":
+        search_knowledge(args)
+    elif args.command == "save":
+        save_knowledge(args)
+    elif args.command == "update":
+        update_knowledge(args)
+    elif args.command == "batch-update":
+        batch_update_knowledge(args)
+    elif args.command == "list":
+        list_knowledge(args)
+    elif args.command == "slim":
+        slim_knowledge(args)
+
+
+if __name__ == "__main__":
+    main()
+

+ 0 - 163
knowhub/skill/knowhub.md

@@ -1,163 +0,0 @@
----
-name: knowhub
-description: When facing problems or complex tasks beyond current capabilities, look up usage experience for tools and resources, and report back your own experience
-category: tool-discovery
-scope: agent:*
----
-
-## When to use
-
-**Querying experience**:
-- You need an external tool/library but aren't sure which to pick
-- You face a complex task and want to know what tools others used and how
-- You have failed repeatedly; someone may have hit the same pitfall
-- You may need external resources (papers, docs, blog posts, etc.)
-
-**Submitting experience** (important!):
-- ✅ **Submit right after using a resource**: submit as soon as the task is done and you have results; don't wait until the context is too long to remember
-- ✅ **Follow up after user feedback**: if the user points out problems or makes suggestions, update or supplement the experience as appropriate
-- ✅ **Findings made while searching**: which platform you found it on, what keywords you used, which strategies worked; all of it can be submitted
-
-## Workflow
-
-### 1. Query experience
-
-```bash
-curl -s "http://localhost:8000/api/search?q=keyword"
-```
-
-**Usable results found**: use them directly. Results include the resource name, URL, score, and concrete usage advice.
-
-**Need to go deeper**: if a result includes a `content_id`, fetch the full content:
-
-```bash
-curl -s "http://localhost:8000/api/content/{content_id}"
-```
-
-Returns the content body plus navigation (children / prev / next), so you can keep browsing related sections.
-
-**No usable results**: proceed to step 2 and search on your own.
-
-### 2. Search on your own (only when no experience exists)
-
-Choose an entry point based on what you already know:
-
-- **Tool name known**: go straight to the relevant platform (PyPI / npmjs / GitHub) for docs and community feedback; search `<tool name> review`, `<tool name> vs`, `site:reddit.com <tool name>`
-- **Task known, tool unknown**: first search communities (Google, Reddit, GitHub Discussions) with the task description to see how others solved it, then confirm candidates on the relevant platform
-- **Finding MCP / Skills**: Smithery (`npx @smithery/cli search <keyword>`), Glama.ai, awesome-mcp-servers
-- **Finding a library/package**: PyPI, npmjs.com
-- **Finding one-stop integrations**: Composio (850+ tools), LangChain Tools
-
-When evaluating, look at community activity, engineering quality, sustainability, and license compatibility.
-If simple logic is faster to write yourself than to pull in a dependency, just write it.
-
-### 3. Report experience
-
-**⚠️ Important: submit right after finishing a task; don't wait until the context is too long to remember!**
-
-After using a resource, submit your real experience:
-
-```bash
-curl -s -X POST "http://localhost:8000/api/experience" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "name": "resource name",
-    "url": "source URL",
-    "category": "mcp|skill|library|api|paper|blog|book|course",
-    "task": "what you were doing (the specific scenario)",
-    "score": 4,
-    "outcome": "how it went, pros and cons",
-    "tips": "the single most important piece of advice",
-    "submitted_by": "'$(git config user.email)'"
-  }'
-```
-
-**When to submit**:
-- ✅ After using a resource to finish a task, submit as soon as you have results
-- ✅ If you later receive user feedback (problems pointed out, suggestions made), submit a supplementary experience as appropriate
-
-**Two categories of experience**:
-
-1. **Experience using a resource itself**: submit as an experience for that resource
-   - Example: experience extracting PDF tables with pymupdf → `name: "pymupdf"`
-
-2. **Experience with a search platform/strategy**: submit as an experience for the platform, or for knowhub
-   - An evaluation of the platform itself → submit as that platform's experience (`name: "smithery"` / `"pypi"` / `"github-search"`)
-   - A strategy or methodology for finding tools/resources → submit as knowhub's experience (`name: "knowhub"`)
-
-**Search-strategy experience examples**:
-
-```bash
-# Review of a search platform
-curl -s -X POST "http://localhost:8000/api/experience" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "name": "smithery",
-    "url": "https://smithery.ai",
-    "category": "search-platform",
-    "task": "Finding an MCP server for filesystem operations",
-    "score": 4,
-    "outcome": "Searching filesystem found 3 servers, but descriptions were terse; had to check the GitHub repos",
-    "tips": "Smithery is good for finding MCP servers, but confirm features against the GitHub docs",
-    "submitted_by": "'$(git config user.email)'"
-  }'
-
-# Experience about a search strategy
-curl -s -X POST "http://localhost:8000/api/experience" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "name": "knowhub",
-    "url": "http://localhost:8000",
-    "category": "search-platform",
-    "task": "Finding a Python library for PDF table extraction",
-    "score": 5,
-    "outcome": "Searching pdf extract table python went straight to a high-scoring pymupdf experience, saving a lot of research time",
-    "tips": "Keywords should include a concrete verb (extract/parse) + the target object (table) + the tech stack (python)",
-    "submitted_by": "'$(git config user.email)'"
-  }'
-```
-
-**Field requirements**:
-- **name**: the resource's common name. For tools use the package name (`pymupdf`), for papers the title (`Attention Is All You Need`), for blog posts the article title
-- **url**: the resource's canonical source (GitHub repo / arXiv / official docs)
-- **category**: the resource type; suggested values: `mcp | skill | library | api | paper | blog | book | course`
-- **task**: describe the specific scenario, not a generic "processing PDFs"
-- **tips**: concrete and actionable: "use page.get_text(sort=True) for two-column papers", not "works fine"
-- **submitted_by**: optional; `git config user.email` is recommended
-
-**Multi-resource collaboration**:
-If a task used several tools/resources (e.g. pymupdf + langchain + openai), submit a separate experience for each key resource, mentioning the other resources used alongside it in task/tips/outcome. That way each resource stays searchable while the collaboration context is preserved.
-
-### 4. Submit content (optional)
-
-If you gathered detailed information about a resource (table of contents, section content, etc.), you can submit it for later agents to use:
-
-```bash
-# Submit a resource overview / table of contents
-curl -s -X POST "http://localhost:8000/api/content" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "id": "resource-name",
-    "title": "resource title",
-    "body": "overview or TOC content in Markdown",
-    "submitted_by": "'$(git config user.email)'"
-  }'
-
-# Submit a specific section
-curl -s -X POST "http://localhost:8000/api/content" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "id": "resource-name/section-key",
-    "title": "section title",
-    "body": "section content in Markdown",
-    "sort_order": 1,
-    "submitted_by": "'$(git config user.email)'"
-  }'
-```
-
-- body is Markdown; reference images by URL
-- Root node IDs contain no `/` (e.g. `pymupdf`); child node IDs do (e.g. `pymupdf/find-tables`)
-- Only submit content valuable to later agents; don't dump full texts
-
-## Important reminder
-Always submit reliable experience to the platform! Never skip submitting experience to knowhub just because other experience mechanisms exist.