@@ -0,0 +1,364 @@
+# Part B: Search → Unified Evaluation → Tag-Routed Dual-Table Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
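+**Goal:** Evaluate each search exactly once, then route every post into `search_process` and/or `search_tools` according to the `知识类型` (knowledge type) tags the evaluation produces; a post tagged for both is written to both tables with the same evaluation blob. Evaluation deduplication becomes cross-table, so no post is scored again per direction.
+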
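+**Architecture:** `stages/search_eval.py` currently uses `--mode-type` to pick a single table up front. This plan moves table selection from "specified by the caller in advance" to "routed automatically from the evaluation tags". Two functions are added to `db.py`: `route_tables` (pure routing) and `fetch_existing_eval_any` (cross-table deduplication, which reads the database). `search_eval.run()` changes to evaluate once, group posts by target table, and upsert into each table. `db.upsert_search_posts` already accepts a `table=` parameter, so it is reused with one call per group.
+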
+**Tech Stack:** Python 3 + pymysql (same as today), no pytest (this project verifies with standalone scripts, `python -c`-style assertions, and `--dry-run`; see CLAUDE.md).
+
+Reference spec: `docs/superpowers/specs/2026-06-18-query-builder-design.md` (Part B).
+
+---
+
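+## File Structure
+
+- `examples/mode_workflow/db.py`: add `route_tables()` (pure function, `知识类型` tags → list of table names) and `fetch_existing_eval_any()` (cross-table evaluation deduplication). Both are data/routing responsibilities, so they belong in `db.py`.
+- `examples/mode_workflow/stages/search_eval.py`: change `run()` to drop table selection via `--mode-type`, deduplicate evaluations with the cross-table version, group posts by `route_tables` after evaluation and upsert once per target table, and write the `runs/` output to a neutral directory.
+- `examples/mode_workflow/README.md` / `流程执行手册.md`: update the description to "search is direction-agnostic; tables are chosen by tag".
+
+---
+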
+### Task 1: `route_tables` pure routing function
+
+**Files:**
+- Modify: `examples/mode_workflow/db.py` (insert after `fetch_adopted_tools_cases` and before the `# ── 评估去重` comment section)
+
+- [ ] **Step 1: Write the assertions (expected to fail first)**
+
+Create a temporary verification snippet and run it (this project uses `python -c`-style assertions instead of pytest):
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent/examples/mode_workflow
+python3 - <<'PY'
+import db
+assert db.route_tables(["工序"]) == ["search_process"]
+assert db.route_tables(["工具"]) == ["search_tools"]
+assert db.route_tables(["工序","工具"]) == ["search_process","search_tools"]
+assert db.route_tables(["能力"]) == ["search_process"]   # 能力 is routed to process
+assert db.route_tables(["能力","工具"]) == ["search_process","search_tools"]
+assert db.route_tables([]) == ["search_process"]          # fallback
+assert db.route_tables(None) == ["search_process"]
+print("✔ route_tables OK")
+PY
+```
+
+- [ ] **Step 2: Run it and confirm it fails**
+
+Expected: `AttributeError: module 'db' has no attribute 'route_tables'`
+
+- [ ] **Step 3: Implement `route_tables`**
+
+Insert into `db.py`:
+
+```python
+def route_tables(knowledge_types):
+    """Knowledge-type tags → list of target tables (ordered, no duplicates).
+    工序/能力 → search_process; 工具 → search_tools; both present → both tables;
+    empty/None falls back to search_process.
+    There is one unified evaluation (a single llm_evaluation blob), so writing a post
+    to several tables does not re-score it; it only writes one more row."""
+    kt = set(knowledge_types or [])
+    tables = []
+    if (kt & {"工序", "能力"}) or not kt:
+        tables.append("search_process")
+    if kt & {"工具"}:
+        tables.append("search_tools")
+    return tables
+```
+
+- [ ] **Step 4: Run it and confirm it passes**
+
+Run: rerun the snippet from Step 1
+Expected: `✔ route_tables OK`
+
+- [ ] **Step 5: Commit**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent
+git add examples/mode_workflow/db.py
+git commit -m "feat(mode_workflow): add route_tables knowledge-type → table routing (工序/能力 → process, 工具 → tools)"
+```
+
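The Step 1 assertions for `route_tables` cover the three known labels plus empty/None input. A minimal sketch, assuming the Step 3 implementation is used unchanged (it is copied below so the snippet runs without the project's `db` module), probes two inputs the plan does not mention: a non-empty list with no known label, and a bare string instead of a list. The label `其他` is hypothetical and used only for illustration.

```python
def route_tables(knowledge_types):
    """Copy of the Step 3 implementation, reproduced so this sketch is self-contained."""
    kt = set(knowledge_types or [])
    tables = []
    if (kt & {"工序", "能力"}) or not kt:
        tables.append("search_process")
    if kt & {"工具"}:
        tables.append("search_tools")
    return tables


# The order of the input labels does not affect the order of the output tables.
reordered = route_tables(["工具", "工序"])
print(reordered)

# Hypothetical unknown label: the set is non-empty but matches nothing,
# so the fallback branch does not fire and no table is selected.
unknown = route_tables(["其他"])
print(unknown)

# A bare string is split into characters by set(), so "工具" matches nothing either.
as_string = route_tables("工具")
print(as_string)
```

Under these assumptions both probes return an empty list, which would leave such a post out of both tables. Whether that is intended is not stated in this plan; if it is not, the fallback condition in `route_tables` is the place to revisit.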
+---
+
+### Task 2: `fetch_existing_eval_any` cross-table evaluation deduplication
+
+**Files:**
+- Modify: `examples/mode_workflow/db.py` (right after the existing `fetch_existing_eval` function, around `db.py:945`)
+
+- [ ] **Step 1: Write the assertions (expected to fail first)**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent/examples/mode_workflow
+python3 - <<'PY'
+import db
+# Verify with a case already evaluated in the DB: if either table has an evaluation,
+# the blob is returned; a nonexistent case returns None.
+c = db._conn()
+with c.cursor() as cur:
+    cur.execute("SELECT case_id FROM search_process WHERE llm_evaluation IS NOT NULL LIMIT 1")
+    row = cur.fetchone()
+c.close()
+if row:
+    e = db.fetch_existing_eval_any(row["case_id"])
+    assert isinstance(e, dict) and isinstance(e.get("相关性"), dict), "should return a valid evaluation blob"
+assert db.fetch_existing_eval_any("__不存在的case__") is None
+print("✔ fetch_existing_eval_any OK")
+PY
+```
+
+- [ ] **Step 2: Run it and confirm it fails**
+
+Expected: `AttributeError: module 'db' has no attribute 'fetch_existing_eval_any'`
+
+- [ ] **Step 3: Implement**
+
+Insert after the definition of `fetch_existing_eval` in `db.py`:
+
+```python
+def fetch_existing_eval_any(case_id):
+    """Look up the case's latest valid evaluation blob in each search table,
+    checking search_process first, then search_tools.
+    The evaluation is table-independent (one unified evaluation), so a blob found in
+    either table can be reused instead of evaluating the same post once per table.
+    Returns None if neither table has one."""
+    for table in ("search_process", "search_tools"):
+        e = fetch_existing_eval(case_id, table)
+        if e:
+            return e
+    return None
+```
+
+- [ ] **Step 4: Run it and confirm it passes**
+
+Run: rerun the snippet from Step 1
+Expected: `✔ fetch_existing_eval_any OK`
+
+- [ ] **Step 5: Commit**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent
+git add examples/mode_workflow/db.py
+git commit -m "feat(mode_workflow): add fetch_existing_eval_any cross-table evaluation dedup (evaluation is table-independent)"
+```
+
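The Step 1 check for `fetch_existing_eval_any` needs a database that already holds an evaluated case. The lookup order can also be exercised without a database. This is a sketch only: `fetch_existing_eval_any_with` is a hypothetical variant that takes the per-table lookup as a parameter (the plan's version calls `fetch_existing_eval` directly), and the in-memory `fake_rows`, including the `score` key, are made-up stand-ins for the two tables.

```python
def fetch_existing_eval_any_with(fetch, case_id):
    """Hypothetical, injectable variant of fetch_existing_eval_any for offline checks."""
    for table in ("search_process", "search_tools"):
        e = fetch(case_id, table)
        if e:
            return e
    return None


# Fake storage: (case_id, table) -> evaluation blob. All values are made up.
fake_rows = {
    ("c1", "search_tools"): {"相关性": {"score": 1}},
    ("c2", "search_process"): {"相关性": {"score": 2}},
    ("c2", "search_tools"): {"相关性": {"score": 3}},
}
calls = []


def fake_fetch(case_id, table):
    """Record each lookup and return the fake blob, or None when absent."""
    calls.append((case_id, table))
    return fake_rows.get((case_id, table))


only_tools = fetch_existing_eval_any_with(fake_fetch, "c1")
both = fetch_existing_eval_any_with(fake_fetch, "c2")
missing = fetch_existing_eval_any_with(fake_fetch, "nope")
print(only_tools, both, missing)
print(calls)
```

For `c2` the loop returns as soon as `search_process` yields a blob, so `search_tools` is never queried for that case. That matches the plan's premise that both tables hold the same blob for the same post.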
+---
+
+### Task 3: Change `search_eval.run()` to unified evaluation + tag-routed dual tables
+
+**Files:**
+- Modify: `examples/mode_workflow/stages/search_eval.py` (inside `run()`: delete the table selection on line 98; change the deduplication on line 110; change the DB write and run-file output on lines 148–159)
+
+- [ ] **Step 1: Remove the up-front table selection**
+
+Delete this line inside `run()` (currently `search_eval.py:98`):
+
+```python
+    table = "search_tools" if args.mode_type == "工具" else "search_process"
+```
+
+- [ ] **Step 2: Switch evaluation deduplication to the cross-table version**
+
+In the deduplication loop (currently `search_eval.py:109-110`), change:
+
+```python
+    for s in sources:
+        e = db.fetch_existing_eval(s["case_id"], table)
+```
+
+to:
+
+```python
+    for s in sources:
+        e = db.fetch_existing_eval_any(s["case_id"])
+```
+
+- [ ] **Step 3: Route DB writes to both tables by tag and write the run file to a neutral directory**
+
+Replace the whole block at the current `search_eval.py:148-159` (from `for s in sources: s.pop(...)` through writing the runs JSON):
+
+```python
+    for s in sources:
+        s.pop("_image_data_urls", None)
+
+    n = db.upsert_search_posts(args.query_id, args.query, sources, table=table)
+    print(f"🗄️ {table} 入库 {n} 行 · 方向 {args.mode_type or '工序'} · 评估成本 ${cost:.4f}")
+
+    out_dir = MW / "runs" / table
+    out_dir.mkdir(parents=True, exist_ok=True)
+    (out_dir / f"{args.query_id}.json").write_text(json.dumps({
+        "query_id": args.query_id, "query": args.query, "phrasings": phrasings,
+        "platforms": platforms, "total": len(sources), "results": sources,
+    }, ensure_ascii=False, indent=2), encoding="utf-8")
+    return 0
+```
+
+with:
+
+```python
+    for s in sources:
+        s.pop("_image_data_urls", None)
+
+    # Route by evaluation tags: 工序/能力 → search_process, 工具 → search_tools,
+    # both present → the same blob is written to both tables.
+    # The evaluation runs once (one unified evaluation); dual-table routing only
+    # writes an extra row and never re-scores.
+    routed = {"search_process": [], "search_tools": []}
+    for s in sources:
+        kt = (s.get("llm_evaluation") or {}).get("知识类型") or []
+        for t in db.route_tables(kt):
+            routed[t].append(s)
+    total = 0
+    for t, group in routed.items():
+        if group:
+            n = db.upsert_search_posts(args.query_id, args.query, group, table=t)
+            total += n
+            print(f"🗄️ {t} 入库 {n} 行")
+    print(f"📊 评估成本 ${cost:.4f} · 共写 {total} 行(双表含同帖重复)")
+
+    out_dir = MW / "runs" / "search"
+    out_dir.mkdir(parents=True, exist_ok=True)
+    (out_dir / f"{args.query_id}.json").write_text(json.dumps({
+        "query_id": args.query_id, "query": args.query, "phrasings": phrasings,
+        "platforms": platforms, "total": len(sources), "results": sources,
+    }, ensure_ascii=False, indent=2), encoding="utf-8")
+    return 0
+```
+
+- [ ] **Step 4: Update the `--mode-type` help text to "kept for compatibility, no longer decides routing"**
+
+In `main()` (currently `search_eval.py:168-169`), change:
+
+```python
+    p.add_argument("--mode-type", default="", choices=["", "工序", "工具"],
+                   help="解构方向,决定写哪张表(工具 → search_tools;其余 → search_process)")
+```
+
+to:
+
+```python
+    p.add_argument("--mode-type", default="", choices=["", "工序", "工具"],
+                   help="(兼容保留,已不决定路由)落表现由评估的 知识类型 标签自动路由")
+```
+
+- [ ] **Step 5: Compile to confirm there are no syntax errors**
+
+Run:
+```bash
+cd /Users/max_liu/max_liu/company/Agent/examples/mode_workflow
+python3 -m py_compile stages/search_eval.py db.py && echo OK
+```
+Expected: `OK`
+
+- [ ] **Step 6: Commit**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent
+git add examples/mode_workflow/stages/search_eval.py
+git commit -m "feat(mode_workflow): search_eval unified evaluation + knowledge-type routing to both tables, cross-table eval dedup"
+```
+
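The grouping that Step 3 introduces can be tried in isolation. A minimal sketch, assuming `route_tables` behaves as in Task 1 (copied here so the snippet is self-contained) and replacing `db.upsert_search_posts` with a hypothetical `fake_upsert` that only counts rows; `route_and_write` and the three posts are illustrative names and made-up data, not part of `search_eval.py`.

```python
def route_tables(knowledge_types):
    """Copy of the Task 1 implementation."""
    kt = set(knowledge_types or [])
    tables = []
    if (kt & {"工序", "能力"}) or not kt:
        tables.append("search_process")
    if kt & {"工具"}:
        tables.append("search_tools")
    return tables


def fake_upsert(query_id, query_text, results, table):
    """Hypothetical stand-in for db.upsert_search_posts: returns the row count only."""
    return len(results)


def route_and_write(query_id, query_text, sources, upsert=fake_upsert):
    """Group already-evaluated posts by target table, then write each non-empty group once."""
    routed = {"search_process": [], "search_tools": []}
    for s in sources:
        kt = (s.get("llm_evaluation") or {}).get("知识类型") or []
        for t in route_tables(kt):
            routed[t].append(s)
    written = {
        t: upsert(query_id, query_text, group, table=t)
        for t, group in routed.items()
        if group
    }
    return routed, written


sources = [
    {"case_id": "a", "llm_evaluation": {"知识类型": ["工序"]}},
    {"case_id": "b", "llm_evaluation": {"知识类型": ["工序", "工具"]}},
    {"case_id": "c", "llm_evaluation": None},  # no evaluation: falls back to search_process
]
routed, written = route_and_write("q-demo", "demo query", sources)
print(written, sum(written.values()))
```

Three posts produce four rows here because the dual-tagged post is written to both tables. Both groups hold the same dict object for that post, so the blob is shared rather than recomputed.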
+---
+
+### Task 4: End-to-end verification against real data (small search batch)
+
+**Files:** none (verification run only)
+
+- [ ] **Step 1: Run a small search (default xhs+gzh, small volume to keep cost low)**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent/examples/mode_workflow
+python3 stages/search_eval.py --query-id qtestB --query "AI 写实人像 怎么做" \
+  --platforms xhs,gzh --max-count 5
+```
+Expected: the log contains `🗄️ search_process 入库 N 行`; if any tool posts were found there is also `🗄️ search_tools 入库 M 行`; the last line is `📊 评估成本 $... · 共写 ... 行`.
+
+- [ ] **Step 2: Check that dual-tagged posts are in both tables, with identical blobs, and were evaluated only once**
+
+```bash
+python3 - <<'PY'
+import db, json
+c = db._conn()
+with c.cursor() as cur:
+    # A case under this query that landed in both tables carries both tags.
+    cur.execute("""SELECT p.case_id FROM search_process p
+                   JOIN search_tools t ON p.case_id=t.case_id AND p.query_id=t.query_id
+                   WHERE p.query_id='qtestB' LIMIT 1""")
+    row = cur.fetchone()
+    if row:
+        cid = row["case_id"]
+        cur.execute("SELECT llm_evaluation FROM search_process WHERE query_id='qtestB' AND case_id=%s", (cid,))
+        a = cur.fetchone()["llm_evaluation"]
+        cur.execute("SELECT llm_evaluation FROM search_tools WHERE query_id='qtestB' AND case_id=%s", (cid,))
+        b = cur.fetchone()["llm_evaluation"]
+        assert a == b, "the same post should carry the same llm_evaluation in both tables (evaluated once)"
+        print(f"✔ 双标签帖 {cid} 两表 blob 一致")
+    else:
+        print("ℹ 本批无双标签帖(正常),仅验证单表路由")
+c.close()
+PY
+```
+Expected: `✔ 双标签帖 ... 两表 blob 一致` or `ℹ 本批无双标签帖`.
+
+- [ ] **Step 3: Clean up the test data**
+
+```bash
+python3 - <<'PY'
+import db
+c = db._conn()
+with c.cursor() as cur:
+    for t in ("search_process","search_tools"):
+        cur.execute(f"DELETE FROM {t} WHERE query_id='qtestB'")
+c.commit()  # pymysql does not autocommit by default; harmless if db._conn() already enables it
+c.close(); print("✔ 已清理 qtestB")
+PY
+```
+
+- [ ] **Step 4: Commit (skip if there are no code changes)**
+
+This task is verification only; there is nothing to commit.
+
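The `a == b` comparison in Step 2 operates on whatever the driver returns for `llm_evaluation`. If the column comes back as JSON text, two blobs with the same content could still differ in key order or whitespace. A sketch of a more tolerant comparison, assuming the value is either JSON text or an already-decoded object; `same_blob` is a hypothetical helper, not part of `db.py`, and the sample blobs (including the `score` key) are made up.

```python
import json


def _decode(value):
    """Return the parsed object if value is JSON text (str/bytes), else the value itself."""
    if isinstance(value, (str, bytes, bytearray)):
        return json.loads(value)
    return value


def same_blob(a, b):
    """Compare two evaluation blobs by content rather than by raw text."""
    return _decode(a) == _decode(b)


left = '{"知识类型": ["工序", "工具"], "相关性": {"score": 3}}'
right = '{"相关性": {"score": 3},   "知识类型": ["工序", "工具"]}'
print(left == right)            # raw text differs in key order and spacing
print(same_blob(left, right))   # content is the same
print(same_blob(left, {"知识类型": ["工具"]}))
```

If both rows were written from the same in-memory blob in a single run, the raw comparison is likely sufficient; the content comparison only matters when the stored text may have been normalized differently.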
+---
+
+### Task 5: Documentation updates
+
+**Files:**
+- Modify: `examples/mode_workflow/README.md` (data-flow section)
+- Modify: `examples/mode_workflow/流程执行手册.md` (key points of steps 1+2)
+
+- [ ] **Step 1: Change the README data flow to "direction-agnostic + tag routing"**
+
+In the code block of the README「数据流」section, change:
+
+```
+新建搜索(UI) → server 子进程 stages/search_eval.py → search_process / search_tools(方向分表)
+```
+
+to:
+
+```
+新建搜索(UI) → server 子进程 stages/search_eval.py → 统一评估 → 按 知识类型 标签路由
+ (工序/能力→search_process,工具→search_tools,两者都含写两表)
+```
+
+- [ ] **Step 2: Add a routing note to the manual's steps 1 + 2**
+
+In `流程执行手册.md`, append one line to the end of the "要点" list under「步骤 1 + 2」:
+
+```
+- 入库方向由评估的 `知识类型` 标签自动决定(工序/能力→search_process,工具→search_tools,两者都含两表都写);
+ `--mode-type` 已不决定落表。
+```
+
+- [ ] **Step 3: Commit**
+
+```bash
+cd /Users/max_liu/max_liu/company/Agent
+git add examples/mode_workflow/README.md examples/mode_workflow/流程执行手册.md
+git commit -m "docs(mode_workflow): search results are now routed to both tables by knowledge-type tag"
+```
+
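+---
+
+## Self-Review
+
+- **Spec coverage (Part B)**: req#2 tag routing (工序/能力 → process, 工具 → tools, both → both tables with the same blob, empty → fallback to process) → Tasks 1+3 ✓; cross-table evaluation deduplication → Tasks 2+3 ✓; zero duplicate scoring (evaluate once) → guaranteed by Task 3's "evaluate, then route" structure ✓; `--mode-type` demoted → Task 3 Step 4 ✓. The "default xhs,gzh×20 / trigger mechanism" part of req#1 belongs to the frontend + run_search and is covered by the Part A plan, not this one.
+- **Placeholder scan**: no TBD/TODO; every step contains runnable commands and complete code.
+- **Type consistency**: `route_tables` returns `list[str]`, and Task 3 consumes it consistently via `for t in db.route_tables(kt)`; `fetch_existing_eval_any(case_id)` takes a single argument, matching the call in Task 3 Step 2; the `upsert_search_posts(query_id, query_text, results, table=)` signature matches the existing one.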