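@@ -1,678 +0,0 @@
-# Proposal: Optimizing Knowledge Base Feedback and Management
-
-> This document records optimization proposals for the knowledge base feedback mechanism and for managing its size.
->
-> Discussion date: 2026-03-17
-> Status: proposal stage; to be implemented after review
-
----
-
-## 1. Background and Problems
-
-### 1.1 Current Feedback Mechanism
-
-**Existing structure**:
-- `eval` field: score (1-5), helpful/harmful counts, confidence, history
-- Tools: `knowledge_update`, `knowledge_batch_update`
-- Uses: `min_score` filtering, knowledge evolution (`evolve_feedback`)
-
-**Problems**:
-1. Feedback sources are not distinguished (human, Agent, and task-outcome feedback are mixed together)
-2. Score updates are simplistic (set manually, not adjusted automatically from feedback history)
-3. No implicit feedback (usage frequency, retrieval rank, etc.)
-4. No time decay mechanism (old knowledge may be outdated)
-
-### 1.2 Scale Control Problems
-
-**Problems with the existing slim mechanism**:
-- Loads 10,000 knowledge entries into memory at once
-- Processes everything in a single LLM call (high cost at $1-5 per run, poor quality)
-- Truncates each entry to its first 200 characters, so information is incomplete
-
-**Causes of knowledge base bloat**:
-1. Repeated extraction: similar tasks run many times, and each run extracts "new" knowledge
-2. Inconsistent granularity: the same experience is split into several entries or merged into one coarse entry
-3. Version evolution: updates create a new version instead of overwriting the old one
-4. Low-quality sediment: "mediocre" knowledge with score=3 accumulates in large quantities
-
----
-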
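-## 2. Core Optimizations
-
-### 2.1 Relation Check at Save Time (P0, core mechanism)
-
-#### Knowledge relation types
-
-| Relation type | Description | Handling |
-|---------|------|---------|
-| `duplicate` | Identical content with only minor wording differences | Skip saving |
-| `subset` | The new knowledge is a special case or part of existing knowledge | Skip saving, or attach it as an example |
-| `superset` | The new knowledge is more complete and contains the existing knowledge | Save the new knowledge, deprecate the old |
-| `conflict` | The two entries give contradictory advice | Save but flag the conflict for human review |
-| `complement` | Related but not duplicated; the entries complement each other | Save and create a link between them |
-| `independent` | The two entries are unrelated | Save directly |
-
-#### Layered judgment strategy (to reduce cost)
-
-```
-Layer 1: vector similarity search (fast filter)
- ↓ no similar knowledge → save directly
- ↓ similar knowledge found
-Layer 2: rule-based check (free)
- - identical task + content overlap > 90% → skip
- - identical content → skip
- ↓ rules cannot decide
-Layer 3: LLM judgment (boundary cases only)
- - called only when similarity > 0.85
- - uses gemini-2.5-flash-lite
-```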
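The three layers above can be sketched as a small routing function. This is a minimal illustration, not the proposed implementation: `route_save` and its return values are hypothetical names, content overlap is approximated with `difflib.SequenceMatcher`, and saving directly when the rules are inconclusive and similarity is at most 0.85 is an assumption, since the text only says the LLM is called above that threshold.

```python
from difflib import SequenceMatcher


def route_save(new, similar, similarity):
    """Return 'save', 'skip', or 'llm' for a new entry and its nearest neighbor.

    `similar` is None when the vector search found nothing close enough.
    """
    # Layer 1: nothing similar was retrieved, so save directly.
    if similar is None:
        return "save"

    # Layer 2: free rule checks.
    if new["content"] == similar["content"]:
        return "skip"
    overlap = SequenceMatcher(None, new["content"], similar["content"]).ratio()
    if new["task"] == similar["task"] and overlap > 0.90:
        return "skip"

    # Layer 3: only high-similarity boundary cases go to the LLM.
    if similarity > 0.85:
        return "llm"

    # Assumption: inconclusive rules plus moderate similarity means "save".
    return "save"


entry = {"task": "log in with selenium", "content": "use retry with backoff"}
other = {"task": "other task", "content": "completely different text here"}
decisions = [
    route_save(entry, None, 0.0),
    route_save(entry, dict(entry), 0.99),
    route_save(entry, other, 0.90),
    route_save(entry, other, 0.60),
]
```

Keeping the rule layer ahead of the LLM layer means exact and near-exact repeats never reach a paid call.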
-
-**Cost estimate**:
-- Assume 50 knowledge entries are saved per day
-- Layer 1 filters out 70%, Layer 2 filters out 20%, and Layer 3 handles the remaining 10%
-- Each LLM call: 1100 tokens
-- Annual cost: 50 × 10% × 1100 tokens × 365 days ≈ 2.0M tokens ≈ **$0.15/year**
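The arithmetic behind this estimate can be checked in a few lines. The per-token price is not stated in the proposal, so `annual_cost` takes it as a parameter; the value of roughly $0.075 per million tokens used below is simply what the $0.15 figure implies and should be read as a derived assumption.

```python
SAVES_PER_DAY = 50
LLM_SHARE = 0.10          # fraction of saves that reach Layer 3
TOKENS_PER_CALL = 1100
DAYS_PER_YEAR = 365

calls_per_year = SAVES_PER_DAY * LLM_SHARE * DAYS_PER_YEAR
tokens_per_year = calls_per_year * TOKENS_PER_CALL


def annual_cost(price_per_million_tokens: float) -> float:
    """Yearly LLM spend in dollars for a given blended token price."""
    return tokens_per_year / 1_000_000 * price_per_million_tokens


implied_cost = annual_cost(0.075)  # hypothetical price implied by the estimate
```

That works out to about 1,825 LLM calls and about 2.0 million tokens per year, so the total stays well under a dollar even if the real price is several times higher.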
-
-#### Implementation locations
-
-- `agent/tools/builtin/knowledge.py:knowledge_save` - check before saving
-- `knowhub/server.py:analyze_knowledge_relation` - relation analysis
-- `knowhub/server.py:handle_knowledge_relation` - relation handling
-
----
-
-### 2.2 Feedback Source Separation and Weighted Scoring (P0)
-
-#### Data structure changes
-
-```python
-{
-    "eval": {
-        "score": 4.2,  # weighted composite score (computed automatically)
-        "confidence": 0.9,
-
-        # per-source statistics
-        "feedback_by_source": {
-            "human": {
-                "helpful": 3,
-                "harmful": 0,
-                "weight": 1.0,  # highest weight
-                "last_feedback": "2026-03-17"
-            },
-            "agent_explicit": {
-                "helpful": 12,
-                "harmful": 2,
-                "weight": 0.6,  # medium weight
-                "last_feedback": "2026-03-17"
-            },
-            "task_outcome": {
-                "success": 45,
-                "failure": 5,
-                "weight": 0.3,  # lowest weight (attribution is unclear)
-                "last_feedback": "2026-03-17"
-            }
-        },
-
-        # detailed history (keeps the source label)
-        "feedback_history": [
-            {
-                "source": "human",
-                "type": "helpful",
-                "comment": "very accurate",
-                "timestamp": "2026-03-17",
-                "user_id": "user@example.com"
-            }
-        ]
-    }
-}
-```
-
-#### Weighted scoring algorithm
-
-```python
-def calculate_weighted_score(feedback_by_source):
-    """Compute a composite score weighted by feedback source."""
-
-    total_weight = 0
-    weighted_sum = 0
-
-    for source, data in feedback_by_source.items():
-        # task_outcome stores success/failure instead of helpful/harmful
-        helpful = data.get("helpful", data.get("success", 0))
-        harmful = data.get("harmful", data.get("failure", 0))
-        weight = data["weight"]
-
-        if helpful + harmful == 0:
-            continue
-
-        # positive ratio
-        positive_ratio = helpful / (helpful + harmful)
-
-        # confidence: more feedback is more credible (saturates at 10 items)
-        confidence = min(1.0, (helpful + harmful) / 10)
-
-        # score for this source: 3 + 2 * (positive ratio - 0.5), i.e. between 2 and 4
-        source_score = 3 + 2 * (positive_ratio - 0.5)
-
-        # weighted accumulation
-        weighted_sum += source_score * weight * confidence
-        total_weight += weight * confidence
-
-    return max(1.0, min(5.0, weighted_sum / total_weight)) if total_weight > 0 else 3.0
-```
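A worked example helps to see what this formula produces on the sample data from the structure above. The sketch below restates the algorithm in self-contained form; reading `success`/`failure` as the positive and negative counts for `task_outcome` is an assumption about the intended behavior, and `weighted_score` is a hypothetical name.

```python
def weighted_score(feedback_by_source):
    """Source-weighted score using the 3 + 2 * (ratio - 0.5) mapping."""
    total_weight = 0.0
    weighted_sum = 0.0
    for data in feedback_by_source.values():
        # Assumption: task_outcome counts map onto helpful/harmful.
        pos = data.get("helpful", data.get("success", 0))
        neg = data.get("harmful", data.get("failure", 0))
        n = pos + neg
        if n == 0:
            continue
        confidence = min(1.0, n / 10)
        source_score = 3 + 2 * (pos / n - 0.5)
        weighted_sum += source_score * data["weight"] * confidence
        total_weight += data["weight"] * confidence
    if total_weight == 0:
        return 3.0
    return max(1.0, min(5.0, weighted_sum / total_weight))


sample = {
    "human": {"helpful": 3, "harmful": 0, "weight": 1.0},
    "agent_explicit": {"helpful": 12, "harmful": 2, "weight": 0.6},
    "task_outcome": {"success": 45, "failure": 5, "weight": 0.3},
}
score = weighted_score(sample)
lowest = weighted_score({"human": {"helpful": 0, "harmful": 10, "weight": 1.0}})
highest = weighted_score({"human": {"helpful": 10, "harmful": 0, "weight": 1.0}})
```

With these inputs the composite score is about 3.81. The extremes are also informative: unanimous harmful feedback gives exactly 2.0 and unanimous helpful feedback gives exactly 4.0, so under this mapping the score never leaves the range 2 to 4. Any threshold outside that range, such as `score < 2`, would need a wider mapping (for example a multiplier of 4 instead of 2) before it could trigger; that is an observation about the formula, not a stated design decision.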
-
-#### Implementation locations
-
-- `knowhub/server.py:update_knowledge` - score update logic
-- `knowhub/server.py:calculate_weighted_score` - score calculation
-- `agent/tools/builtin/knowledge.py:knowledge_feedback` - new tool for human feedback
-
----
-
-### 2.3 Tiered Storage (P0, required)
-
-#### Knowledge state machine
-
-```
-active → stable → cold → archived
- ↓
- deprecated
-```
-
-#### State transition rules
-
-```python
-from datetime import datetime, timezone
-
-def calculate_state(knowledge, now=None):
-    # knowledge["last_used"] is assumed to be a timezone-aware datetime
-    now = now or datetime.now(timezone.utc)
-    days_since_last_use = (now - knowledge["last_used"]).days
-    usage_count = knowledge["implicit_feedback"]["search_count"]
-
-    if days_since_last_use > 180 and usage_count < 5:
-        return "archived"  # unused for six months and rarely used → archive
-    elif days_since_last_use > 90:
-        return "cold"  # unused for three months → cold storage
-    elif usage_count > 20:
-        return "active"  # used frequently → active
-    else:
-        return "stable"  # default: stable
-```
-
-#### Retrieval strategy
-
-- By default, search only `active` + `stable`
-- The optional parameter `include_cold=true` extends the search to cold knowledge
-- `archived` and `deprecated` are excluded from search but remain accessible by ID
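The retrieval rules above amount to building a state filter for the vector store query. The sketch below assumes `state` is stored as a scalar string field and that the store accepts a Milvus-style boolean expression of the form `field in [...]`; `build_state_filter` is a hypothetical helper name.

```python
def build_state_filter(include_cold: bool = False) -> str:
    """Build a filter expression restricting search to visible states."""
    states = ["active", "stable"]
    if include_cold:
        states.append("cold")
    # archived and deprecated are never searchable; they are fetched by ID only.
    quoted = ", ".join(f'"{s}"' for s in states)
    return f"state in [{quoted}]"


default_filter = build_state_filter()
extended_filter = build_state_filter(include_cold=True)
```

Building the expression in one place keeps the default visibility rule consistent between the search API and any background jobs.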
-
-#### Data structure
-
-```python
-{
-    "state": "active",  # active/stable/cold/archived/deprecated
-    "state_reason": "",  # reason for the state change
-    "state_updated_at": "2026-03-17T12:00:00Z"
-}
-```
-
-#### Implementation locations
-
-- `knowhub/server.py:update_knowledge_states` - background task, runs daily
-- `knowhub/server.py:search_knowledge_api` - filter by state at search time
-- `knowhub/vector_store.py` - add state filtering to Milvus queries
-
----
-
-### 2.4 Quality-Based Pruning (P0, required)
-
-#### Pruning conditions (all must hold)
-
-- `score < 2`
-- `harmful > helpful`
-- The entry has existed for more than 30 days
-
-#### Action
-
-Mark the entry as `deprecated` instead of deleting it (so it can be restored)
-
-#### Implementation
-
-```python
-from datetime import datetime, timezone
-
-async def prune_low_quality():
-    """Periodically clean up low-quality knowledge."""
-
-    now = datetime.now(timezone.utc)
-    low_quality = milvus_store.query(
-        filter_expr='eval["score"] < 2 and eval["harmful"] > eval["helpful"]'
-    )
-
-    for k in low_quality:
-        # created_at is assumed to be a timezone-aware datetime
-        age_days = (now - k["created_at"]).days
-        if age_days > 30:
-            await knowledge_update(
-                knowledge_id=k["id"],
-                metadata={
-                    "state": "deprecated",
-                    "state_reason": "low_quality",
-                    "deprecated_at": now
-                }
-            )
-```
-
-#### Implementation locations
-
-- `knowhub/server.py:prune_low_quality` - background task, runs daily
-
----
-
-### 2.5 Knowledge Relation Network (P0)
-
-#### Data structure
-
-```python
-{
-    "relations": [
-        {
-            "target_id": "knowledge-20260310-c3d4",
-            "relation_type": "complement",  # duplicate/subset/superset/conflict/complement
-            "direction": "bidirectional",  # bidirectional/outgoing/incoming
-            "confidence": 0.95,
-            "reason": "the two entries complement each other and cover different scenarios",
-            "created_at": "2026-03-17T12:00:00Z",
-            "created_by": "system",  # system/human/agent
-            "action_taken": ""  # optional: deprecated_target/merged/etc
-        }
-    ]
-}
-```
-
-#### Relation direction
-
-| Relation type | Direction | Description |
-|---------|--------|------|
-| `complement` | Bidirectional | Complementary relation; link both ways |
-| `duplicate` | Bidirectional | Exact duplicate |
-| `subset` | One-way | This entry is a subset of the target |
-| `superset` | One-way | This entry is a superset of the target |
-| `conflict` | Bidirectional | Conflicting relation |
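The direction rules in this table can be made concrete with a helper that produces the relation record for each side of a link. This is a sketch under stated assumptions: `make_links` is a hypothetical name, and mirroring a one-way relation on the target as the inverse type with direction `incoming` is one possible convention that the proposal does not spell out.

```python
# subset and superset are inverses of each other; the rest are symmetric.
INVERSE = {
    "complement": "complement",
    "duplicate": "duplicate",
    "conflict": "conflict",
    "subset": "superset",
    "superset": "subset",
}
BIDIRECTIONAL = {"complement", "duplicate", "conflict"}


def make_links(source_id, target_id, relation_type):
    """Return the relation record to store on each entry, keyed by entry ID."""
    symmetric = relation_type in BIDIRECTIONAL
    forward = {
        "target_id": target_id,
        "relation_type": relation_type,
        "direction": "bidirectional" if symmetric else "outgoing",
    }
    # Assumption: the target keeps a mirrored record with the inverse type.
    reverse = {
        "target_id": source_id,
        "relation_type": INVERSE[relation_type],
        "direction": "bidirectional" if symmetric else "incoming",
    }
    return {source_id: forward, target_id: reverse}


subset_links = make_links("knowledge-a", "knowledge-b", "subset")
conflict_links = make_links("knowledge-a", "knowledge-b", "conflict")
```

Storing the mirrored record lets a related-knowledge lookup start from either entry without scanning the whole collection.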
-
-#### Implementation locations
-
-- `knowhub/server.py:create_knowledge_link` - create relation links
-- `knowhub/server.py:get_related_knowledge` - query related knowledge
-
----
-
-### 2.6 Lightweight Health Check (P1, recommended)
-
-#### Purpose
-
-Catch duplicates that the save-time check missed (a safety net)
-
-#### Strategy
-
-```python
-from datetime import datetime, timedelta, timezone
-
-async def weekly_health_check():
-    """Weekly check for duplicates among newly added knowledge."""
-
-    # only check knowledge added in the last 7 days
-    seven_days_ago = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
-    recent = query(filter=f'created_at > "{seven_days_ago}"')
-
-    if len(recent) < 10:
-        return  # too few new entries to be worth checking
-
-    # use vector clustering to detect obvious duplicates (threshold 0.90)
-    clusters = await cluster_similar_knowledge(
-        knowledge_list=recent,
-        threshold=0.90
-    )
-
-    # report only; do not act automatically
-    if clusters:
-        send_alert(f"Found {len(clusters)} groups of suspected duplicates; please review manually")
-```
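The proposal does not show how `cluster_similar_knowledge` works, so the following is only one plausible sketch of vector-only clustering: pairwise cosine similarity combined with union-find, which groups entries connected by at least one link above the threshold (single linkage). `cluster_by_similarity` and the toy two-dimensional embeddings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_similarity(embeddings, threshold=0.90):
    """Group IDs whose embeddings are linked by similarity >= threshold."""
    ids = list(embeddings)
    parent = {i: i for i in ids}

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(n^2) pairwise comparison; acceptable for one week of new entries.
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    # Singletons are not duplicates, so only report groups of two or more.
    return [sorted(g) for g in groups.values() if len(g) > 1]


clusters = cluster_by_similarity({
    "k1": [1.0, 0.0],
    "k2": [0.99, 0.05],
    "k3": [0.0, 1.0],
})
```

Because only the recent entries are compared with each other, the quadratic loop stays small, which is consistent with the near-zero cost claimed for this check.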
-
-#### Cost
-
-Close to zero (vector clustering only, no LLM calls)
-
-#### Implementation locations
-
-- `knowhub/server.py:weekly_health_check` - background task, runs weekly
-
----
-
-## 3. Optional Optimizations (P2)
-
-### 3.1 Implicit Feedback Collection
-
-```python
-{
-    "implicit_feedback": {
-        "search_count": 156,  # number of times retrieved
-        "click_count": 89,  # number of times selected and used
-        "last_used": "2026-03-17",
-        "avg_rank": 2.3  # average retrieval rank
-    }
-}
-```
-
-**Implementation location**: `knowhub/server.py:search_knowledge_api` - record when returning results
-
-### 3.2 Time Decay Mechanism
-
-```python
-def apply_time_decay(knowledge, current_time):
-    age_days = (current_time - knowledge["created_at"]).days
-
-    # decay starts after 6 months and reaches the 50% floor at about 1 year
-    if age_days > 180:
-        decay_factor = max(0.5, 1 - (age_days - 180) / 365)
-        knowledge["_search_score"] *= decay_factor
-
-    return knowledge
-```
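To see the shape of this decay curve, the factor can be evaluated at a few ages. The sketch below isolates the same formula in a standalone function; `decay_factor` as a function name is introduced here for illustration.

```python
def decay_factor(age_days: int) -> float:
    """Multiplier applied to the search score for an entry of the given age."""
    if age_days <= 180:
        return 1.0
    # Linear decay after 180 days, floored at 0.5.
    return max(0.5, 1 - (age_days - 180) / 365)


factors = {age: round(decay_factor(age), 3) for age in (90, 180, 270, 365, 730)}
```

The factor stays at 1.0 through day 180, is about 0.753 at day 270, and sits at the 0.5 floor from roughly day 363 onward, so old entries are demoted in ranking but never suppressed entirely.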
-
-**Implementation location**: `knowhub/server.py:_llm_rerank` - apply decay before reranking
-
-### 3.3 Multi-Dimensional Feedback
-
-```python
-{
-    "eval": {
-        "dimensions": {
-            "accuracy": 5,  # accuracy
-            "completeness": 4,  # completeness
-            "clarity": 4,  # clarity
-            "timeliness": 3  # timeliness
-        }
-    }
-}
-```
-
-### 3.4 Attribution Confidence
-
-For task success/failure feedback, estimate how much of the outcome can be attributed to a given knowledge entry:
-
-```python
-async def calculate_attribution_confidence(
-    knowledge_id: str,
-    task_result: dict
-) -> float:
-    """Compute the attribution confidence."""
-
-    # Factor 1: how heavily the knowledge was used in the task
-    usage_ratio = task_result["knowledge_usage"][knowledge_id] / task_result["total_steps"]
-
-    # Factor 2: whether it was the only knowledge used
-    is_only_knowledge = len(task_result["used_knowledge_ids"]) == 1
-
-    # Factor 3: the error type on failure
-    if task_result["status"] == "failed":
-        error_type = task_result["error_type"]
-        if error_type in ["network", "timeout", "rate_limit"]:
-            return 0.2  # environment problem, low attribution confidence
-        elif error_type in ["logic_error", "wrong_output"]:
-            return 0.9  # logic problem, high attribution confidence
-
-    # combined result
-    if is_only_knowledge:
-        return 0.9
-    else:
-        return 0.3 + 0.6 * usage_ratio
-```
-
-**Implementation location**: `agent/core/runner.py` - task completion callback
-
-### 3.5 Quality Dashboard
-
-```python
-@app.get("/api/knowledge/stats")
-async def knowledge_stats():
-    """Knowledge base quality statistics."""
-    return {
-        "total": 1234,
-        "by_score": {5: 234, 4: 567, 3: 345, 2: 67, 1: 21},
-        "by_state": {"active": 800, "stable": 300, "cold": 100, "archived": 34},
-        "low_quality": [...],  # entries with score < 3
-        "stale": [...],  # entries unused for 6 months
-        "top_helpful": [...],  # entries with the most helpful feedback
-        "needs_review": [...],  # entries with harmful > helpful
-        "conflicts": [...]  # pairs of entries flagged as conflicting
-    }
-```
-
-### 3.6 Improved slim v2 (run on demand)
-
-Use clustering plus batch processing to replace the existing load-everything-at-once approach:
-
-```python
-@app.post("/api/knowledge/slim")
-async def slim_knowledge_v2(
-    batch_size: int = 100,
-    similarity_threshold: float = 0.85,
-    model: str = "google/gemini-2.5-flash-lite"
-):
-    """Knowledge base slimming v2: batched cluster merging."""
-
-    # 1. cluster similar knowledge (vectors only, no LLM)
-    clusters = await cluster_similar_knowledge(
-        similarity_threshold=similarity_threshold
-    )
-
-    # 2. call the LLM once per cluster (batched processing)
-    merged_count = 0
-    for cluster in clusters:
-        knowledge_list = [milvus_store.get_by_id(kid) for kid in cluster]
-
-        # only the 2-5 entries in this cluster are processed
-        decision = await llm_merge_cluster(knowledge_list, model)
-
-        if decision["should_merge"]:
-            await execute_merge(decision)
-            merged_count += 1
-
-    return {"clusters_found": len(clusters), "merged": merged_count}
-```
-
-**Cost**: ~$0.5 per run (on demand)
-
----
-
-## 4. Implementation Priorities and Costs
-
-### P0 (implement immediately)
-
-| Mechanism | Cost | Implementation location |
-|------|------|---------|
-| Relation check at save time | $0.15/year | `agent/tools/builtin/knowledge.py:knowledge_save` |
-| Feedback source separation | $0 | `knowhub/server.py:update_knowledge` |
-| Tiered storage | $0 | `knowhub/server.py` + `knowhub/vector_store.py` |
-| Quality-based pruning | $0 | `knowhub/server.py:prune_low_quality` |
-| Knowledge relation network | $0 | `knowhub/server.py` |
-
-**Total P0 cost**: ~$0.15/year
-
-### P1 (short term)
-
-| Mechanism | Cost | Implementation location |
-|------|------|---------|
-| Lightweight health check | ~$0 | `knowhub/server.py:weekly_health_check` |
-| Attribution confidence | $0 | `agent/core/runner.py` |
-
-### P2 (as needed)
-
-| Mechanism | Cost | Notes |
-|------|------|------|
-| Implicit feedback collection | $0 | Optional |
-| Time decay mechanism | $0 | Optional |
-| Multi-dimensional feedback | $0 | Optional |
-| Quality dashboard | $0 | Optional |
-| Improved slim v2 | $0.5 per run | Run on demand |
-| Periodic full deduplication | $10-20 per run | Needed only if the save-time deduplication error rate exceeds 5% |
-
----
-
-## 5. Key Design Principles
-
-1. **Real-time defense over after-the-fact cleanup**: deduplicating at save time is more effective than periodic deduplication
-2. **Layered judgment to reduce cost**: vector → rules → LLM, using the LLM only when necessary
-3. **Feedback source weighting**: human (1.0) > Agent (0.6) > task outcome (0.3)
-4. **Knowledge relation network**: build a knowledge graph through the `relations` field
-5. **Lifecycle management**: control knowledge visibility through the `state` field
-6. **Quality-driven pruning**: automatically clean up low-quality knowledge based on feedback
-
----
-
-## 6. Complete Data Structure
-
-```python
-{
-    # existing fields
-    "id": "knowledge-20260317-a1b2",
-    "message_id": "msg-xxx",
-    "types": ["strategy", "tool"],
-    "task": "what goal to achieve in which scenario",
-    "content": "core knowledge content",
-    "tags": {"category": "preference"},
-    "scopes": ["org:cybertogether"],
-    "owner": "agent:research_agent",
-    "resource_ids": ["code/selenium/login"],
-    "source": {
-        "name": "resource name",
-        "category": "exp",
-        "urls": ["https://example.com"],
-        "agent_id": "research_agent",
-        "submitted_by": "user@example.com",
-        "timestamp": "2026-03-17T12:00:00Z",
-        "message_id": "msg-xxx"
-    },
-
-    # improved evaluation field
-    "eval": {
-        "score": 4.2,  # weighted composite score (computed automatically)
-        "confidence": 0.9,
-        "feedback_by_source": {
-            "human": {"helpful": 3, "harmful": 0, "weight": 1.0, "last_feedback": "2026-03-17"},
-            "agent_explicit": {"helpful": 12, "harmful": 2, "weight": 0.6, "last_feedback": "2026-03-17"},
-            "task_outcome": {"success": 45, "failure": 5, "weight": 0.3, "last_feedback": "2026-03-17"}
-        },
-        "feedback_history": [
-            {
-                "source": "human",
-                "type": "helpful",
-                "comment": "very accurate",
-                "timestamp": "2026-03-17T12:00:00Z",
-                "user_id": "user@example.com"
-            }
-        ]
-    },
-
-    # new: implicit feedback (P2, optional)
-    "implicit_feedback": {
-        "search_count": 156,
-        "click_count": 89,
-        "last_used": "2026-03-17",
-        "avg_rank": 2.3
-    },
-
-    # new: knowledge relations (P0)
-    "relations": [
-        {
-            "target_id": "knowledge-xxx",
-            "relation_type": "complement",
-            "direction": "bidirectional",
-            "confidence": 0.95,
-            "reason": "the two entries complement each other and cover different scenarios",
-            "created_at": "2026-03-17T12:00:00Z",
-            "created_by": "system",
-            "action_taken": ""
-        }
-    ],
-
-    # new: knowledge state (P0)
-    "state": "active",  # active/stable/cold/archived/deprecated
-    "state_reason": "",
-    "state_updated_at": "2026-03-17T12:00:00Z",
-
-    "created_at": "2026-03-17T12:00:00Z",
-    "updated_at": "2026-03-17T12:00:00Z"
-}
-```
-
----
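-
-## 7. Implementation Roadmap
-
-### Phase 1: Core mechanisms (1-2 weeks)
-
-1. Modify the knowledge data structure (add `relations`, `state`, `feedback_by_source`)
-2. Implement the relation check at save time
-3. Implement feedback source separation and weighted scoring
-4. Implement tiered storage
-5. Implement quality-based pruning
-
-### Phase 2: Monitoring and tuning (1 week)
-
-6. Implement the lightweight health check
-7. Implement attribution confidence
-8. Observe behavior in production and tune parameters
-
-### Phase 3: Enhancements (as needed)
-
-9. Implicit feedback collection
-10. Time decay mechanism
-11. Multi-dimensional feedback
-12. Quality dashboard
-13. Improved slim v2
-
----
-
-## 8. Risks and Mitigations
-
-### Risk 1: LLM misjudgment
-
-**Impact**: an entry may be misclassified as a duplicate, causing useful knowledge to be lost
-
-**Mitigation**:
-- Use layered judgment so the LLM handles only boundary cases
-- Set a confidence threshold; below 0.8, escalate to a stronger model
-- Use the lightweight health check as a safety net
-
-### Risk 2: Relation network complexity
-
-**Impact**: knowledge relations may form a complex network that is hard to maintain
-
-**Mitigation**:
-- Initially create only the necessary relations (complement, conflict)
-- Provide a visualization tool for viewing the relation graph
-- Periodically clean up invalid relations
-
-### Risk 3: Overly aggressive state transitions
-
-**Impact**: useful knowledge may be archived too early
-
-**Mitigation**:
-- Conservative thresholds (archive only after 180 days)
-- Archived knowledge remains accessible by ID
-- Provide a restore interface
-
----
-
-## 9. Success Metrics
-
-### Quantitative metrics
-
-- Knowledge base growth rate: reduced from the current X entries/month to Y entries/month
-- Duplicate rate: less than 5% of newly added knowledge is duplicated
-- Share of low-quality knowledge: entries with score < 3 make up less than 10%
-- Share of archived knowledge: entries in the archived state make up less than 20%
-
-### Qualitative metrics
-
-- Knowledge retrieved by Agents is more relevant
-- Knowledge quality feedback is more accurate
-- Knowledge base maintenance cost is lower
-
----
-
-## 10. References
-
-- Existing knowledge management documentation: `knowhub/docs/knowledge-management.md`
-- Decision records: `knowhub/docs/decisions.md`
-- Resource storage documentation: `knowhub/docs/resource-storage.md`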