sug_v6_1_2_8.py 流程分析文档

📋 概述

sug_v6_1_2_8.py 是一个基于 LLM Agent 的智能搜索查询优化工具，主要用于小红书平台的搜索优化。通过多轮迭代的方式，从原始查询出发，逐步扩展和优化搜索词，最终获取高质量的搜索结果。

版本: v6.1.2.8 核心模型: google/gemini-2.5-flash 主要特性:

🔄 多轮迭代优化
🤖 多 Agent 协作
📊 相关度评分系统
🔍 小红书搜索集成
📈 可视化支持

🏗️ 整体架构

架构图

原始问题(o)
    ↓
[初始化阶段]
    ├─ 分词 → seg_list
    ├─ 评估分词相关度
    ├─ 构建 word_list_1
    ├─ 构建 q_list_1
    └─ 构建 seed_list
    ↓
[第1轮迭代]
    ├─ 请求 sug (建议词)
    ├─ 评估 sug 相关度
    ├─ 构建 search_list (高分sug搜索)
    ├─ 为 seed 加词 → q_list_next
    ├─ 更新 seed_list
    └─ 保存搜索结果
    ↓
[第2轮迭代] ...
    ↓
[第N轮迭代] ...
    ↓
[输出结果 + 可视化]

核心组件

数据模型层 - 定义所有数据结构（Seg, Word, Q, Sug, Seed, Post, Search）
Agent 层 - 三个专家 Agent（分词、相关度评估、加词选择）
流程控制层 - 初始化、轮次迭代、主循环
外部服务层 - 小红书 API 集成（搜索推荐、搜索）

📦 数据模型

核心数据结构

1. Seg (分词)

class Seg(BaseModel):
    text: str                    # 分词文本
    score_with_o: float = 0.0    # 与原始问题的评分
    reason: str = ""             # 评分理由
    from_o: str = ""             # 原始问题

用途: 存储原始问题分词后的每个词单元

2. Word (词)

class Word(BaseModel):
    text: str                    # 词文本
    score_with_o: float = 0.0    # 与原始问题的评分
    from_o: str = ""             # 原始问题

用途: 词库，用于后续组合新的查询词

3. Q (查询)

class Q(BaseModel):
    text: str                    # 查询文本
    score_with_o: float = 0.0    # 与原始问题的评分
    reason: str = ""             # 评分理由
    from_source: str = ""        # 来源: seg/sug/add

用途: 待处理的查询队列，每轮从 q_list 中取 query 进行处理

4. Sug (建议词)

class Sug(BaseModel):
    text: str                    # 建议词文本
    score_with_o: float = 0.0    # 与原始问题的评分
    reason: str = ""             # 评分理由
    from_q: QFromQ | None        # 来自哪个 q

用途: 存储从小红书 API 获取的建议词

5. Seed (种子)

class Seed(BaseModel):
    text: str                    # 种子文本
    added_words: list[str]       # 已添加的词
    from_type: str = ""          # 来源: seg/sug
    score_with_o: float = 0.0    # 与原始问题的评分

用途: 用于加词扩展的基础词，记录已经添加过的词以避免重复

6. Post (帖子)

class Post(BaseModel):
    title: str                   # 标题
    body_text: str               # 正文
    type: str = "normal"         # 类型: video/normal
    images: list[str]            # 图片URL列表
    video: str = ""              # 视频URL
    interact_info: dict          # 互动信息(点赞/收藏/评论/分享)
    note_id: str                 # 笔记ID
    note_url: str                # 笔记URL

用途: 存储小红书搜索结果的帖子详情

7. Search (搜索结果)

class Search(Sug):
    post_list: list[Post]        # 搜索到的帖子列表

用途: 继承 Sug，附加实际搜索到的帖子数据

8. RunContext (运行上下文)

class RunContext(BaseModel):
    version: str                 # 版本号
    input_files: dict            # 输入文件路径
    c: str                       # 原始需求
    o: str                       # 原始问题
    log_url: str                 # 日志URL
    log_dir: str                 # 日志目录
    rounds: list[dict]           # 每轮的详细数据
    final_output: str | None     # 最终结果

用途: 记录整个运行过程的上下文信息和中间结果

🤖 Agent 系统

Agent 1: 分词专家 (word_segmenter)

功能: 将原始问题拆分成有意义的最小单元

输入: 原始查询文本输出:

class WordSegmentation:
    words: list[str]        # 分词结果列表
    reasoning: str          # 分词理由

分词原则:

保留有搜索意义的词汇
拆分成独立的概念
保留专业术语的完整性
去除虚词（的、吗、呢等）

示例:

输入: "如何获取能体现川西秋季特色的高质量风光摄影素材？"
输出: ["川西", "秋季", "风光摄影", "素材"]

Agent 2: 相关度评估专家 (relevance_evaluator)

功能: 评估文本与原始问题的匹配程度

输入: 原始问题 + 待评估文本输出:

class RelevanceEvaluation:
    relevance_score: float  # 0-1的相关性分数
    reason: str            # 评估理由

评估标准:

主题相关性
要素覆盖度
意图匹配度

示例:

原始问题: "川西秋季摄影"
待评估: "川西旅游攻略"
输出: score=0.75, reason="与川西相关但缺少秋季和摄影要素"

Agent 3: 加词选择专家 (word_selector)

功能: 从候选词中选择最合适的词与 seed 组合

输入: 原始问题 + 当前 seed + 候选词列表输出:

class WordSelection:
    selected_word: str       # 选择的词
    combined_query: str      # 组合后的新query
    reasoning: str           # 选择理由

选择原则:

选择与当前 seed 最相关的词
组合后的 query 语义通顺
符合搜索习惯
优先选择能扩展搜索范围的词

示例:

seed: "川西"
候选词: ["秋季", "摄影", "旅游"]
输出: selected_word="秋季", combined_query="川西秋季"

🔄 核心流程

阶段 0: 初始化 (initialize)

目标: 从原始问题创建初始数据结构

流程:

步骤1: 分词
o → [word_segmenter] → WordSegmentation → seg_list

步骤2: 评估分词
for each seg in seg_list:
    seg + o → [relevance_evaluator] → score + reason
    更新 seg.score_with_o, seg.reason

步骤3: 构建 word_list_1
seg_list → word_list_1 (直接转换)

步骤4: 构建 q_list_1
seg_list → q_list_1 (from_source="seg")

步骤5: 构建 seed_list
seg_list → seed_list (from_type="seg")

输入:

o: 原始问题（例如: "如何获取川西秋季风光摄影素材？"）

输出:

seg_list: 分词结果列表
word_list_1: 初始词库
q_list_1: 第一轮待处理查询列表
seed_list: 初始种子列表

示例数据流:

o = "川西秋季摄影素材"
    ↓
seg_list = [
    Seg(text="川西", score_with_o=0.85),
    Seg(text="秋季", score_with_o=0.90),
    Seg(text="摄影", score_with_o=0.88),
    Seg(text="素材", score_with_o=0.75)
]
    ↓
word_list_1 = [Word("川西"), Word("秋季"), ...]
q_list_1 = [Q("川西"), Q("秋季"), ...]
seed_list = [Seed("川西"), Seed("秋季"), ...]

阶段 N: 轮次迭代 (run_round)

目标: 基于当前 q_list 扩展搜索，生成下一轮的数据

输入:

round_num: 轮次编号
q_list: 当前轮的查询列表
word_list: 当前词库
seed_list: 当前种子列表
sug_threshold: 建议词阈值（默认 0.7）

输出:

word_list_next: 下一轮词库
q_list_next: 下一轮查询列表
seed_list_next: 下一轮种子列表
search_list: 本轮搜索结果

步骤1: 请求建议词

for each q in q_list:
    sug_texts = xiaohongshu_api.get_recommendations(q.text)
    for sug_text in sug_texts:
        sug_list.append(Sug(
            text=sug_text,
            from_q=QFromQ(text=q.text, score=q.score_with_o)
        ))

并发处理: 所有 q 的请求可以并发执行

数据流:

q_list = [Q("川西"), Q("秋季")]
    ↓ [小红书API]
sug_list_list = [
    [Sug("川西旅游"), Sug("川西攻略"), ...],  # 来自 "川西"
    [Sug("秋季景色"), Sug("秋季摄影"), ...]   # 来自 "秋季"
]

步骤2: 评估建议词

async def evaluate_sug(sug: Sug) -> Sug:
    sug.score_with_o, sug.reason = await evaluate_with_o(sug.text, o)
    return sug

# 并发评估所有 sug
await asyncio.gather(*[evaluate_sug(sug) for sug in all_sugs])

评估标准: 使用 relevance_evaluator Agent

数据流:

Sug("川西旅游") + o → score=0.75, reason="..."
Sug("秋季摄影") + o → score=0.92, reason="..."

步骤3: 构建 search_list（搜索高分建议词）

high_score_sugs = [sug for sug in all_sugs if sug.score_with_o > sug_threshold]

async def search_for_sug(sug: Sug) -> Search:
    result = xiaohongshu_search.search(sug.text)
    posts = process_notes(result)
    return Search(text=sug.text, post_list=posts, ...)

search_list = await asyncio.gather(*[search_for_sug(sug) for sug in high_score_sugs])

阈值过滤: 只搜索评分 > sug_threshold 的建议词

并发搜索: 所有高分 sug 并发搜索

数据流:

high_score_sugs = [Sug("秋季摄影", score=0.92), ...]
    ↓ [小红书搜索API]
search_list = [
    Search(text="秋季摄影", post_list=[Post(...), ...])
]

步骤4: 构建 word_list_next

word_list_next = word_list.copy()  # 暂时直接复制

说明: 当前版本词库保持不变，未来可扩展从 sug 中提取新词

步骤5: 构建 q_list_next

5.1 为每个 seed 加词

for each seed in seed_list:
    # 过滤候选词
    candidate_words = [w for w in word_list_next
                       if w.text not in seed.text
                       and w.text not in seed.added_words]

    # Agent 选词
    selection_input = f"""
    原始问题: {o}
    当前Seed: {seed.text}
    候选词: {candidate_words}
    """
    result = await Runner.run(word_selector, selection_input)

    # 创建新 query
    new_q = Q(
        text=result.combined_query,
        score_with_o=...,
        from_source="add"
    )
    q_list_next.append(new_q)

    # 更新 seed
    seed.added_words.append(result.selected_word)

关键逻辑:

避免重复: 词不在 seed.text 中且未被添加过
Agent 智能选择: 使用 word_selector 选择最佳组合
评估新 query: 评估组合后的 query 与原始问题的相关度

示例:

seed = Seed("川西", added_words=[])
candidate_words = ["秋季", "摄影"]
    ↓ [word_selector]
selected_word = "秋季"
combined_query = "川西秋季"
    ↓ [relevance_evaluator]
new_q = Q("川西秋季", score=0.88, from_source="add")

5.2 高分 sug 加入 q_list_next

for sug in all_sugs:
    if sug.score_with_o > sug.from_q.score_with_o:
        new_q = Q(
            text=sug.text,
            score_with_o=sug.score_with_o,
            from_source="sug"
        )
        q_list_next.append(new_q)

条件: sug 分数 > 来源 query 分数

示例:

sug = Sug("秋季摄影技巧", score=0.92, from_q=Q("秋季", score=0.85))
    ↓ (0.92 > 0.85)
q_list_next.append(Q("秋季摄影技巧", score=0.92, from_source="sug"))

步骤6: 更新 seed_list

seed_list_next = seed_list.copy()  # 保留原有 seed

for sug in all_sugs:
    if (sug.score_with_o > sug.from_q.score_with_o
        and sug.text not in existing_seed_texts):
        new_seed = Seed(
            text=sug.text,
            from_type="sug",
            score_with_o=sug.score_with_o
        )
        seed_list_next.append(new_seed)

条件:

sug 分数 > 来源 query 分数
sug 未在 seed_list 中出现过

示例:

sug = Sug("川西秋季攻略", score=0.90, from_q=Q("川西", score=0.85))
    ↓ (0.90 > 0.85 且未重复)
seed_list_next.append(Seed("川西秋季攻略", from_type="sug"))

主循环 (iterative_loop)

流程控制:

# 初始化
seg_list, word_list, q_list, seed_list = await initialize(o, context)

# 迭代
round_num = 1
while q_list and round_num <= max_rounds:
    word_list, q_list, seed_list, search_list = await run_round(
        round_num, q_list, word_list, seed_list, ...
    )
    all_search_list.extend(search_list)
    round_num += 1

return all_search_list

终止条件:

q_list 为空（没有更多查询需要处理）
达到 max_rounds 限制

数据累积: 所有轮次的 search_list 合并到 all_search_list

📊 数据流图

完整数据流

输入:
├─ input_dir/context.md  (原始需求 c)
└─ input_dir/q.md        (原始问题 o)
    ↓
[初始化]
o → seg_list → word_list_1, q_list_1, seed_list
    ↓
[第1轮]
q_list_1 → sug_list_1 → search_list_1
         → q_list_2, seed_list_2 (通过加词+高分sug)
    ↓
[第2轮]
q_list_2 → sug_list_2 → search_list_2
         → q_list_3, seed_list_3
    ↓
[第N轮] ...
    ↓
输出:
├─ all_search_list (所有搜索结果)
├─ log_dir/run_context.json (运行上下文)
├─ log_dir/search_results.json (详细搜索结果)
└─ log_dir/visualization.html (可视化HTML)

每轮数据变化

轮次输入                          轮次输出
┌─────────────────┐             ┌─────────────────┐
│ q_list          │──┐          │ q_list_next     │
│ word_list       │  │          │ word_list_next  │
│ seed_list       │  │          │ seed_list_next  │
└─────────────────┘  │          │ search_list     │
                     │          └─────────────────┘
                     ↓
            ┌──────────────────┐
            │   run_round()    │
            │                  │
            │ 1. 请求sug       │
            │ 2. 评估sug       │
            │ 3. 搜索高分sug   │
            │ 4. 为seed加词    │
            │ 5. 构建q_next    │
            │ 6. 更新seed_list │
            └──────────────────┘

🎯 关键算法

1. 相关度评分机制

评分函数: evaluate_with_o(text, o)

输入:

text: 待评估文本
o: 原始问题

输出: (score, reason)

实现:

async def evaluate_with_o(text: str, o: str) -> tuple[float, str]:
    eval_input = f"""
    <原始问题>{o}</原始问题>
    <当前文本>{text}</当前文本>
    请评估当前文本与原始问题的相关度。
    """
    result = await Runner.run(relevance_evaluator, eval_input)
    return result.final_output.relevance_score, result.final_output.reason

应用场景:

评估分词与原始问题的相关度
评估 sug 与原始问题的相关度
评估新组合 query 与原始问题的相关度

2. 加词策略

目标: 从词库中为 seed 选择最佳词进行组合

候选词过滤:

candidate_words = [
    w for w in word_list
    if w.text not in seed.text           # 词不在seed中
    and w.text not in seed.added_words   # 词未被添加过
]

智能选择:

selection_input = f"""
<原始问题>{o}</原始问题>
<当前Seed>{seed.text}</当前Seed>
<候选词列表>{', '.join([w.text for w in candidate_words])}</候选词列表>
请从候选词中选择一个最合适的词，与当前seed组合成新的query。
"""
result = await Runner.run(word_selector, selection_input)

验证和评估:

# 验证选择的词在候选列表中
if selection.selected_word not in [w.text for w in candidate_words]:
    continue

# 评估组合后的query
new_q_score, new_q_reason = await evaluate_with_o(
    selection.combined_query, o
)

3. Sug 晋升机制

晋升到 q_list 的条件:

if sug.score_with_o > sug.from_q.score_with_o:
    q_list_next.append(Q(
        text=sug.text,
        score_with_o=sug.score_with_o,
        from_source="sug"
    ))

晋升到 seed_list 的条件:

if (sug.score_with_o > sug.from_q.score_with_o
    and sug.text not in existing_seed_texts):
    seed_list_next.append(Seed(
        text=sug.text,
        from_type="sug",
        score_with_o=sug.score_with_o
    ))

逻辑: 只有当 sug 的评分超过其来源 query 时，才认为 sug 是更优的查询词

4. 搜索阈值过滤

目标: 只搜索高质量的建议词

实现:

high_score_sugs = [
    sug for sug in all_sugs
    if sug.score_with_o > sug_threshold
]

# 并发搜索
search_list = await asyncio.gather(*[
    search_for_sug(sug) for sug in high_score_sugs
])

默认阈值: 0.7（可通过 --sug-threshold 参数调整）

🔧 外部服务集成

1. 小红书搜索推荐 API

类: XiaohongshuSearchRecommendations

方法: get_recommendations(keyword: str) -> list[str]

功能: 获取指定关键词的搜索建议词

使用场景: 在每轮中为 q_list 中的每个 query 请求建议词

2. 小红书搜索 API

类: XiaohongshuSearch

方法: search(keyword: str) -> dict

功能: 搜索指定关键词，返回帖子列表

返回数据处理:

def process_note_data(note: dict) -> Post:
    note_card = note.get("note_card", {})
    return Post(
        note_id=note.get("id"),
        title=note_card.get("display_title"),
        body_text=note_card.get("desc"),
        type=note_card.get("type", "normal"),
        images=[img.get("image_url") for img in note_card.get("image_list", [])],
        interact_info={
            "liked_count": ...,
            "collected_count": ...,
            "comment_count": ...,
            "shared_count": ...
        },
        note_url=f"https://www.xiaohongshu.com/explore/{note.get('id')}"
    )

📝 日志和输出

运行上下文 (run_context.json)

保存内容:

{
  "version": "sug_v6_1_2_8.py",
  "input_files": {...},
  "c": "原始需求",
  "o": "原始问题",
  "log_dir": "...",
  "log_url": "...",
  "rounds": [
    {
      "round_num": 0,
      "type": "initialization",
      "seg_list": [...],
      "word_list_1": [...],
      "q_list_1": [...],
      "seed_list": [...]
    },
    {
      "round_num": 1,
      "input_q_list": [...],
      "sug_count": 20,
      "high_score_sug_count": 5,
      "search_count": 5,
      "total_posts": 50,
      "sug_details": {...},
      "add_word_details": {...},
      "search_results": [...]
    },
    ...
  ],
  "final_output": "..."
}

搜索结果 (search_results.json)

保存内容:

[
  {
    "text": "秋季摄影",
    "score_with_o": 0.92,
    "reason": "...",
    "from_q": {
      "text": "秋季",
      "score_with_o": 0.85
    },
    "post_list": [
      {
        "note_id": "...",
        "note_url": "...",
        "title": "...",
        "body_text": "...",
        "images": [...],
        "interact_info": {...}
      },
      ...
    ]
  },
  ...
]

可视化 HTML

生成方式:

subprocess.run([
    "node",
    "visualization/sug_v6_1_2_8/index.js",
    abs_context_file,
    abs_output_html
])

依赖: Node.js + React + esbuild

生成的文件: log_dir/visualization.html

🚀 使用方法

命令行参数

python3 sug_v6_1_2_8.py \
  --input-dir "input/旅游/如何获取川西秋季风光摄影素材？" \
  --max-rounds 4 \
  --sug-threshold 0.7 \
  --visualize

参数说明:

参数	类型	默认值	说明
`--input-dir`	str	`input/旅游-逸趣玩旅行/...`	输入目录路径
`--max-rounds`	int	4	最大迭代轮数
`--sug-threshold`	float	0.7	建议词评分阈值
`--visualize`	flag	True	是否生成可视化

输入文件结构

input_dir/
├── context.md   # 原始需求描述
└── q.md         # 原始问题

输出文件结构

input_dir/output/sug_v6_1_2_8/{timestamp}/
├── run_context.json      # 运行上下文
├── search_results.json   # 详细搜索结果
└── visualization.html    # 可视化页面

🎨 并发优化

并发点

分词评估: 所有 seg 并发评估
```
await asyncio.gather(*[evaluate_seg(seg) for seg in seg_list])
```
1. Sug 评估: 所有 sug 并发评估 python await asyncio.gather(*[evaluate_sug(sug) for sug in all_sugs])
搜索: 所有高分 sug 并发搜索
```
await asyncio.gather(*[search_for_sug(sug) for sug in high_score_sugs])
```
串行点
1. 分词: 必须先完成分词才能评估
2. 轮次迭代: 必须按顺序执行各轮
3. 加词选择: 每个 seed 的加词必须等待 Agent 返回
🔍 核心特点

1. 迭代扩展
- 从原始问题出发，逐轮扩展搜索词
- 通过 seed + word 组合生成新查询
- 通过 sug 晋升机制引入新的高质量查询
2. 智能评分
- 所有文本与原始问题的相关度都通过 LLM 评估
- 评分结果用于过滤、排序、晋升决策
3. 多 Agent 协作
- 分词专家: 拆分原始问题
- 相关度评估专家: 统一评分标准
- 加词选择专家: 智能组合词汇
4. 数据驱动
- 完整记录每轮的输入输出
- 可追溯每个 query/sug 的来源
- 支持可视化分析
5. 高并发
- 利用 asyncio 实现高并发
- 评估、搜索等操作并行执行
- 提升整体执行效率
🐛 潜在问题和改进方向

1. 词库静态

问题: word_list 在初始化后不再更新，可能错过新的有价值的词

改进方向:
- 从高分 sug 中提取新词加入 word_list
- 从搜索结果的标题/正文中提取关键词
2. 加词盲目性

问题: 每个 seed 每轮必须加一个词，即使候选词质量不高

改进方向:
- 增加加词的评分阈值
- 允许 seed 在某轮跳过加词
3. Sug 重复

问题: 不同 query 可能返回相同的 sug，导致重复搜索

改进方向:
- 全局去重 sug
- 记录已搜索的 query，避免重复搜索
4. 搜索结果未利用

问题: 搜索到的帖子内容没有被进一步分析和利用

改进方向:
- 分析帖子标题/内容提取新的关键词
- 评估帖子质量，作为 query 质量的反馈
5. 阈值固定

问题: sug_threshold 固定，可能导致某些轮次没有搜索结果

改进方向:
- 动态调整阈值
- 保证每轮至少有一定数量的搜索
📈 性能分析

时间复杂度

假设:
- 每轮 q_list 大小: Q
- 每个 q 的 sug 数量: S
- 每轮 seed 数量: K
- 最大轮数: R
每轮时间复杂度:
- 请求 sug: O(Q)（并发）
- 评估 sug: O(Q * S)（并发）
- 搜索: O(高分sug数量)（并发）
- 加词: O(K * word_list大小)（串行，但每个加词操作并发评估）
总时间复杂度: O(R * (Q + Q*S + K*W))

空间复杂度
- seg_list: O(分词数)
- word_list: O(分词数)（当前版本）
- q_list: O(Q) 每轮
- seed_list: O(K) 每轮
- sug_list: O(Q * S) 每轮
- search_list: O(高分sug数) * O(每个搜索的帖子数)
总空间复杂度: O(R * (Q*S + 高分sug数*帖子数))

🎯 总结

sug_v6_1_2_8.py 是一个设计精良的搜索查询优化系统，具有以下特点:

优势
1. ✅ 模块化设计: 数据模型、Agent、流程控制分离清晰
2. ✅ 智能化: 利用多个 LLM Agent 实现分词、评估、选词
3. ✅ 可扩展: 通过迭代机制不断扩展搜索范围
4. ✅ 高性能: 大量使用并发优化执行效率
5. ✅ 可追溯: 完整记录每轮数据，支持可视化分析
核心流程

```

原始问题 → 分词 → 评估 → 迭代(请求sug → 评估 → 搜索 → 加词 → 更新) → 输出结果 ```

关键机制

评分机制: 统一的相关度评估标准
晋升机制: 高分 sug 晋升为 query 和 seed
扩展机制: seed + word 组合生成新 query
过滤机制: 阈值过滤低质量 sug

适用场景

搜索查询扩展和优化
关键词发现和探索
内容检索和推荐
搜索效果分析

文档生成时间: 2025-11-03 代码版本: sug_v6_1_2_8.py 作者: Knowledge Agent Team

sug_v6_1_2_8_流程分析.md 24 KB História Raw

sug_v6_1_2_8.py 流程分析文档

📋 概述

🏗️ 整体架构

架构图

核心组件

📦 数据模型

核心数据结构

1. Seg (分词)

2. Word (词)

3. Q (查询)

4. Sug (建议词)

5. Seed (种子)

6. Post (帖子)

7. Search (搜索结果)

8. RunContext (运行上下文)

🤖 Agent 系统

Agent 1: 分词专家 (word_segmenter)

Agent 2: 相关度评估专家 (relevance_evaluator)

Agent 3: 加词选择专家 (word_selector)

🔄 核心流程

阶段 0: 初始化 (initialize)

阶段 N: 轮次迭代 (run_round)

步骤1: 请求建议词

步骤2: 评估建议词

步骤3: 构建 search_list（搜索高分建议词）

步骤4: 构建 word_list_next

步骤5: 构建 q_list_next

步骤6: 更新 seed_list

主循环 (iterative_loop)

📊 数据流图

完整数据流

每轮数据变化

🎯 关键算法

1. 相关度评分机制

2. 加词策略

3. Sug 晋升机制

4. 搜索阈值过滤

🔧 外部服务集成

1. 小红书搜索推荐 API

2. 小红书搜索 API

📝 日志和输出

运行上下文 (run_context.json)

搜索结果 (search_results.json)

可视化 HTML

🚀 使用方法

命令行参数

输入文件结构

输出文件结构

🎨 并发优化

并发点

串行点

🔍 核心特点

1. 迭代扩展

2. 智能评分

3. 多 Agent 协作

4. 数据驱动

5. 高并发

🐛 潜在问题和改进方向

1. 词库静态

2. 加词盲目性

3. Sug 重复

4. 搜索结果未利用

5. 阈值固定

📈 性能分析

时间复杂度

空间复杂度

🎯 总结

优势

核心流程

关键机制

适用场景

sug_v6_1_2_8_流程分析.md 24 KB

História Raw