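- """
- Extract the list of inspiration points (灵感点) from "what" deconstruction results.
- Reads every JSON file in the given folder, extracts the inspiration points, and saves them alongside that folder (in its parent directory).
- """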
- import json
- import os
- from pathlib import Path
- from typing import List
- from lib.utils import read_json
- def extract_inspirations_from_file(file_path: str) -> List[dict]:
-     """Extract all inspiration points from a single "what" deconstruction file.
-     Args:
-         file_path: Path to the JSON file.
-     Returns:
-         A list of inspiration points; each element has a "灵感点" (inspiration point) field and a "meta" field.
-     """
-     # Derive note_id from the file name (the part before the first underscore)
-     filename = os.path.basename(file_path)
-     note_id = filename.split('_')[0]
-     try:
-         data = read_json(file_path)
-     except Exception as e:
-         print(f"⚠️ Failed to read file: {file_path} - {e}")
-         return []
-     inspirations = []
-     # Locate the inspiration points; `or {}` also covers keys that are present but null
-     san_dian = data.get("三点解构") or {}
-     ling_gan_dian = san_dian.get("灵感点") or {}
-     # Three categories: 全新内容 (entirely new content), 共性差异 (differences within commonalities), 共性内容 (common content)
-     for category in ["全新内容", "共性差异", "共性内容"]:
-         items = ling_gan_dian.get(category) or []
-         for item in items:
-             inspiration_text = item.get("灵感点", "")
-             if inspiration_text:
-                 # Build meta: the item's original fields plus note_id, category, and the what-file path, excluding "灵感点"
-                 meta = {k: v for k, v in item.items() if k != "灵感点"}
-                 meta["note_id"] = note_id
-                 meta["category"] = category
-                 meta["what_file"] = file_path
-                 inspirations.append({
-                     "灵感点": inspiration_text,
-                     "meta": meta
-                 })
-     return inspirations
- def extract_inspirations_from_folder(folder_path: str) -> List[dict]:
-     """Extract all inspiration points from a folder.
-     Args:
-         folder_path: Path to the folder of "what" deconstruction results.
-     Returns:
-         A list of inspiration points (all are kept; no deduplication).
-     """
-     folder = Path(folder_path)
-     if not folder.exists():
-         raise FileNotFoundError(f"Folder does not exist: {folder_path}")
-     # Collect all JSON files in a stable, sorted order
-     json_files = sorted(folder.glob("*.json"))
-     print(f"\nFound {len(json_files)} JSON files")
-     # Extract the inspiration points from every file
-     all_inspirations = []
-     for json_file in json_files:
-         inspirations = extract_inspirations_from_file(str(json_file))
-         all_inspirations.extend(inspirations)
-         if inspirations:
-             print(f"  ✓ {json_file.name}: {len(inspirations)} inspiration points")
-     print(f"\nTotal extracted: {len(all_inspirations)} inspiration points")
-     return all_inspirations
- def save_inspirations(inspirations: List[dict], output_dir: str) -> None:
-     """Save the list of inspiration points (writes two files).
-     Args:
-         inspirations: List of inspiration points (each with "灵感点" and "meta" fields).
-         output_dir: Output directory.
-     """
-     # Make sure the output directory exists before writing
-     os.makedirs(output_dir, exist_ok=True)
-     # 1. Save the detailed version (including meta information)
-     detailed_file = os.path.join(output_dir, "灵感点_详细.json")
-     with open(detailed_file, 'w', encoding='utf-8') as f:
-         json.dump(inspirations, f, ensure_ascii=False, indent=2)
-     print(f"\n✓ Detailed inspiration point list saved to: {detailed_file}")
-     # 2. Save the simplified version (only the inspiration point texts)
-     simple_list = [item["灵感点"] for item in inspirations]
-     simple_file = os.path.join(output_dir, "灵感点.json")
-     with open(simple_file, 'w', encoding='utf-8') as f:
-         json.dump(simple_list, f, ensure_ascii=False, indent=2)
-     print(f"✓ Simplified inspiration point list saved to: {simple_file}")
- def main():
-     """Entry point."""
-     import sys
-     # Command-line argument: path to the folder of "what" deconstruction results
-     if len(sys.argv) > 1:
-         what_folder = sys.argv[1]
-     else:
-         what_folder = "data/阿里多多酱/out/人设_1110/what解构结果"
-     print("=" * 80)
-     print("Extracting inspiration points from what deconstruction results")
-     print("=" * 80)
-     print(f"Input folder: {what_folder}")
-     try:
-         # Extract the inspiration points
-         inspirations = extract_inspirations_from_folder(what_folder)
-         # Output goes to the input folder's parent directory, i.e. alongside the input folder
-         what_folder_path = Path(what_folder)
-         output_dir = what_folder_path.parent  # e.g. data/阿里多多酱/out/人设_1110
-         # Save the results
-         save_inspirations(inspirations, str(output_dir))
-         # Show the first 10 inspiration points
-         print(f"\n{'=' * 80}")
-         print("Inspiration point preview (first 10):")
-         print("=" * 80)
-         for i, item in enumerate(inspirations[:10], 1):
-             print(f"{i}. {item['灵感点']}")
-         if len(inspirations) > 10:
-             print(f"... and {len(inspirations) - 10} more")
-     except Exception as e:
-         print(f"\n❌ Error: {e}")
-         import traceback
-         traceback.print_exc()
- if __name__ == "__main__":
-     main()
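
The script never shows the input shape it expects, so here is a minimal, self-contained sketch of it, inferred only from the keys the code reads ("三点解构" → "灵感点" → the three category lists). The helper `extract_from_dict`, the sample values, the "描述" field, and the file name `note123_what.json` are all invented for illustration. It uses `json.load` in place of `lib.utils.read_json`, whose behavior is not shown here.

```python
import json
import os
import tempfile
from typing import List

CATEGORIES = ["全新内容", "共性差异", "共性内容"]


def extract_from_dict(data: dict, note_id: str, what_file: str) -> List[dict]:
    """Hypothetical helper mirroring extract_inspirations_from_file, minus the file read."""
    points = (data.get("三点解构") or {}).get("灵感点") or {}
    results = []
    for category in CATEGORIES:
        for item in points.get(category) or []:
            text = item.get("灵感点", "")
            if not text:
                continue  # entries with an empty inspiration text are skipped
            # meta keeps every original field except the inspiration text itself
            meta = {k: v for k, v in item.items() if k != "灵感点"}
            meta.update(note_id=note_id, category=category, what_file=what_file)
            results.append({"灵感点": text, "meta": meta})
    return results


# Invented sample input; real files may carry other fields per item.
sample = {
    "三点解构": {
        "灵感点": {
            "全新内容": [{"灵感点": "example point A", "描述": "invented description"}],
            "共性差异": [{"灵感点": ""}],
            "共性内容": [{"灵感点": "example point B"}],
        }
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note123_what.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f, ensure_ascii=False)
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)
    # note_id is the part of the file name before the first underscore
    note_id = os.path.basename(path).split("_")[0]
    extracted = extract_from_dict(loaded, note_id, path)

print(json.dumps(extracted, ensure_ascii=False, indent=2))
```

Each extracted entry keeps the inspiration text under "灵感点" and moves everything else, plus `note_id`, `category`, and `what_file`, into `meta`. That is the structure the detailed output file stores.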