Data engineering tracking for each business domain of the AIGC module

SamLee e6d52b9bab Add data engineering methodology structure 4 days ago
archive e6d52b9bab Add data engineering methodology structure 4 days ago
domains e6d52b9bab Add data engineering methodology structure 4 days ago
integrated e6d52b9bab Add data engineering methodology structure 4 days ago
schemas e6d52b9bab Add data engineering methodology structure 4 days ago
skills e6d52b9bab Add data engineering methodology structure 4 days ago
templates e6d52b9bab Add data engineering methodology structure 4 days ago
.~data_clustering_statistic.xlsx b886a87341 Add data engineering analysis artifacts 4 days ago
.~demand_engineering_statistic.xlsx b886a87341 Add data engineering analysis artifacts 4 days ago
AGENTS.md e6d52b9bab Add data engineering methodology structure 4 days ago
BuildingSystemAgent业务与技术逻辑全链路分析.md b886a87341 Add data engineering analysis artifacts 4 days ago
README.md e6d52b9bab Add data engineering methodology structure 4 days ago
data-clustering业务理解.md b886a87341 Add data engineering analysis artifacts 4 days ago
data_clustering_statistic.xlsx b886a87341 Add data engineering analysis artifacts 4 days ago
data_full_statistic.xlsx b886a87341 Add data engineering analysis artifacts 4 days ago
demand_engineering_statistic.xlsx b886a87341 Add data engineering analysis artifacts 4 days ago
demandagent业务理解.md b886a87341 Add data engineering analysis artifacts 4 days ago
img-article-compreh-dataflow.md b886a87341 Add data engineering analysis artifacts 4 days ago
pattern-global-v2-full-data-algorithm-report.md b886a87341 Add data engineering analysis artifacts 4 days ago

README.md

data-engineering

Data engineering tracking for each business domain of the AIGC module.

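This repository collects data engineering analyses across multiple business domains, data lineage tracing, a unified Excel/JSON format, and a visualization methodology. The current focus domains are the clustering pipeline, Pattern V2, Demand Agent, external data sources, and BuildingSystemAgent, with Search Agent to follow.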

Directory Map

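  • AGENTS.md: project-wide rules and the default workflow for Codex / sub-agents.
  • skills/: 7 reusable analysis Skills.
  • templates/: Markdown / JSON templates to copy when analyzing a new business domain.
  • schemas/: the unified lineage JSON / Excel field specifications.
  • domains/: one folder per business domain, holding the business breakdown, technical breakdown, lineage, verification, and that domain's data files.
  • integrated/: cross-domain overview diagrams, the consolidated Excel, and the consolidated JSON.
  • archive/: historical versions, superseded metric definitions, and one-off intermediate materials.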
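
The layout above can be checked mechanically before starting a new analysis. The following is a minimal sketch, not part of the repository: the function name `missing_entries` is hypothetical, and it only tests that the top-level entries named in the Directory Map exist.

```python
from pathlib import Path

# Top-level entries named in the Directory Map above.
EXPECTED_DIRS = ["skills", "templates", "schemas", "domains", "integrated", "archive"]
EXPECTED_FILES = ["AGENTS.md"]


def missing_entries(repo_root: Path) -> list[str]:
    """Return the expected top-level entries that are absent under repo_root."""
    missing = [d + "/" for d in EXPECTED_DIRS if not (repo_root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (repo_root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing when the layout is complete.
    for name in missing_entries(Path(".")):
        print(f"missing: {name}")
```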

Domain Workflow

For each business domain, the recommended order for building up the documentation is:

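  1. 01-recon-notes.md: reconnaissance memo, recording entry points, suspected tables, suspected outputs, and open uncertainties.
  2. 02-business-logic.md: business logic breakdown, explaining how business objects become business results.
  3. 03-technical-implementation.md: technical implementation breakdown, covering entry points, code, components, algorithms, and state control.
  4. 04-data-lineage.md: data lineage breakdown, explaining where data comes from, how it is processed, and where it is written.
  5. 05-verification.md: verification record, covering code, DB, samples, risks, and drift.
  6. lineage.snapshot.json: the unified structured artifact intended for future consumption by visualization and automation.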
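
The file sequence above can be scaffolded for a new domain with a short script. This is a minimal sketch under stated assumptions: `scaffold_domain` is a hypothetical helper, and it writes placeholder files instead of copying from templates/, because the template file names are not listed here. The empty JSON object is also only a placeholder; the actual field layout is defined under schemas/.

```python
from pathlib import Path

# File names in the recommended order from the Domain Workflow list.
DOMAIN_FILES = [
    "01-recon-notes.md",
    "02-business-logic.md",
    "03-technical-implementation.md",
    "04-data-lineage.md",
    "05-verification.md",
    "lineage.snapshot.json",
]


def scaffold_domain(repo_root: Path, domain: str) -> list[Path]:
    """Create domains/<domain>/ plus any missing workflow files.

    Returns the paths that were newly created, in workflow order.
    """
    domain_dir = repo_root / "domains" / domain
    domain_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in DOMAIN_FILES:
        path = domain_dir / name
        if path.exists():
            continue  # never overwrite existing analysis
        # Assumption: "{}" and a one-line heading are harmless placeholders.
        content = "{}\n" if name.endswith(".json") else f"# {name}\n"
        path.write_text(content, encoding="utf-8")
        created.append(path)
    return created


if __name__ == "__main__":
    # "search-agent" is an illustrative folder name, not one taken from the repo.
    for p in scaffold_domain(Path("."), "search-agent"):
        print(f"created {p}")
```

Running it a second time creates nothing, so it is safe to use on a domain folder that is already partly filled in.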