howard
pushed main to howard/Agent
bc4ddf6f5c chore: untrack runtime artifacts committed by mistake
Remove 28 files that were committed but should never have been in git.
All are runtime-generated state, not source, and some contain PII.
Categories:
- outputs/ (20 image files): toolhub CLI generation results
(flux/nano_banana/seedream). Caller-side scratch, created by
`toolhub.py call`. Already gitignored in an earlier commit but
the files were added by a parallel branch before the ignore rule
landed.
- agent/tools/builtin/feishu/chat_history/ (4 files: 3 with real
contact names in the filename, plus chat_summary.json):
runtime-maintained per-contact chat logs. FEISHU_TOOLS_PROMPT.md
explicitly documents this directory as "系统会自动维护的聊天记录文件"
("chat history files maintained automatically by the system").
Because they contain real coworker names, these should never have
been committed.
- frontend/htmlTemplate/api_data/ (2 files) and ws_data/ (2 files):
runtime-cached trace/goal snapshots from the backend API and
WebSocket event stream. Written by templateData.py at runtime via
save_ws_data_to_file / _append_event_jsonl. Variable name is
`mock_dir` — they were meant to be ephemeral mock caches, not seed.
.gitignore updates:
- Add frontend/htmlTemplate/api_data/ and ws_data/ (the existing
`frontend/htmlTemplate/mock_data` rule is a dead entry: that
directory no longer exists, and the real dirs now use the
api_data/ws_data names; the mock_data rule stays in place for safety)
- Add agent/tools/builtin/feishu/chat_history/ with a comment
noting the PII concern
- outputs/ was already gitignored in d269588
Local files are preserved (git rm --cached, not git rm), so running
services can keep reading/writing them as usual — they're just no
longer tracked.
d269588963 chore: gitignore /outputs dir from toolhub CLI test runs
The toolhub CLI (`toolhub.py call`) writes generated images to
outputs/<trace_id>/, which is test/session scratch. Never commit.
e940602280 refactor(knowhub): rename serialize_milvus_result to to_serializable
The function has nothing to do with Milvus — it is a generic Python
object -> JSON-safe dict serializer that handles dicts, lists, iterables,
objects with to_dict(), and fallback __dict__ walking. The name was a
historical artifact from when the knowledge store actually used Milvus
(now removed from the dependency set entirely).
Rename to to_serializable across all 12 call sites in knowhub/server.py
plus the definition. Also update the docstring to reflect the real
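purpose ("通用序列化工具:把任意 Python 对象转换为 JSON 可序列化的原生类型",
i.e. "general-purpose serialization utility: converts any Python object
into JSON-serializable native types").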
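
For illustration, a serializer of the kind described above might look like the sketch below. This is not the code in knowhub/server.py: the order of the checks, the skipping of underscore-prefixed attributes, and the final `str()` fallback are all assumptions.

```python
from collections.abc import Iterable, Mapping
from typing import Any


def to_serializable(obj: Any) -> Any:
    """Convert an arbitrary Python object into JSON-safe native types (sketch)."""
    # Primitives are already JSON-safe.
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    # Mappings: stringify keys, recurse into values.
    if isinstance(obj, Mapping):
        return {str(k): to_serializable(v) for k, v in obj.items()}
    # Objects exposing to_dict() are checked before generic iteration (assumed order).
    to_dict = getattr(obj, "to_dict", None)
    if callable(to_dict):
        return to_serializable(to_dict())
    # Lists, tuples, sets, generators; str was handled above, bytes are excluded.
    if isinstance(obj, Iterable) and not isinstance(obj, (bytes, bytearray)):
        return [to_serializable(item) for item in obj]
    # Fallback: walk public instance attributes (skipping "_" names is an assumption).
    if hasattr(obj, "__dict__"):
        return {
            k: to_serializable(v)
            for k, v in vars(obj).items()
            if not k.startswith("_")
        }
    # Last resort (assumed): string representation.
    return str(obj)
```

Nothing in it depends on a particular storage backend, which is the point of the rename.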
a914ceea15 docs(tools): cross-framework usage guide + refactor plan
tools.md additions:
- "跨框架使用(CLI/MCP)" ("Cross-framework usage (CLI/MCP)") section:
design philosophy (stateless -> CLI, stateful -> MCP), CLI entry
conventions, trace_id fallback pattern,
double-JSON encoding avoidance, MCP integration via .mcp.json (not
settings.json — Claude Code doesn't read mcpServers from there)
- New entries in the builtin tool table: read_images, toolhub_*, ask/
upload_knowledge
- read_file vs read_images usage guidance with adaptive-layout table
- Skill installation convention (~/.claude/skills/<name>/SKILL.md) and
the size distinction: SKILL.md is runtime-loaded, keep short; tools.md
is for developers, can go long
tools-refactor-plan.md (new):
- Captures the discovery-pattern philosophy and per-family migration
plans for two upcoming refactors that will happen in a later session:
1. Content tools (search_posts / youtube_search / x_search) — merge
into content_platforms() + content_search() + content_detail() in
the same spirit as toolhub_search + toolhub_call
2. Browser tools — collapse 28 @tool functions into ~11 verb-based
tools using Literal enum actions. Browser is NOT a good fit for
dynamic discovery since the differences are in parameters, not in
capabilities
- Explicitly rejects "discovery-based browser tools" and "full MCP
client wrapper" paths, with reasoning
- Lists all open design questions that must be decided before
implementation starts
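
To make the "verb-based tools using Literal enum actions" idea concrete, here is a minimal sketch. Every name in it (`ScrollAction`, `browser_scroll`, the action strings, the return value) is hypothetical and does not come from tools-refactor-plan.md; the real tools would also carry the `@tool` decorator.

```python
from typing import Literal, get_args

# Hypothetical action set: one verb-based tool replaces several single-purpose ones.
ScrollAction = Literal["up", "down", "to_top", "to_bottom"]


def browser_scroll(action: ScrollAction, amount: int = 1) -> str:
    """One tool per verb; the variant is a parameter, not a separate function."""
    # Literal is only a static hint, so validate at runtime as well.
    if action not in get_args(ScrollAction):
        raise ValueError(f"unknown scroll action: {action!r}")
    # Placeholder result; a real tool would drive the browser here.
    return f"scroll:{action}:{amount}"
```

The allowed actions are visible in the signature itself, which fits the plan's point that the browser tools differ in parameters rather than in capabilities.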
efea909f3b feat(read_images): batch image tool with adaptive grid + shared image utils
Problem: when the agent needs to analyze many local images (pick the best
photo, compare candidates, batch-judge), reading them one at a time via
read_file blows up tokens — each image carries structural overhead per
message block, and there is no way to see them side-by-side for
comparison.
read_images solves this:
- Loads 1-12 images concurrently (local paths or URLs, mixable)
- Downscales every image to max_dimension (default 1024px) to control
per-image token cost
- Two layouts:
- grid (default): stitches N images into one index-numbered (1,2,3...)
grid image so the LLM sees one picture with all candidates and can
refer to them by index. Auto-picks columns/thumb_size based on count
(2 imgs -> 2x1 @500px, 12 imgs -> 4x3 @320px), so the final canvas
stays within ~1400px on the long edge and no cell gets too small to
read after LLM-internal resize
- separate: returns N independent downscaled images for tools that
really need per-image attention
- Output text maps index -> full original path so the LLM can reference
"image 3" and resolve it to the source file for downstream edits
Grid mode caps at 12 images per call. Beyond that, each cell becomes too
small to be useful after the LLM's internal image resize (~1568px long
edge). Caller must batch in chunks.
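
Caller-side batching can be as simple as the sketch below. `chunk_paths` and `MAX_IMAGES_PER_CALL` are hypothetical names used for illustration; only the limit of 12 comes from this commit.

```python
from typing import Iterator, Sequence

MAX_IMAGES_PER_CALL = 12  # grid-mode cap stated in this commit message


def chunk_paths(
    paths: Sequence[str], size: int = MAX_IMAGES_PER_CALL
) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` paths, preserving order."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])
```

Assuming grid indices restart at 1 for each call, the caller should keep the index -> path mapping per chunk rather than globally.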
Shared utilities (agent/tools/utils/image.py):
- load_image / load_images: async local+URL loader
- downscale: aspect-preserving resize
- build_image_grid: parameterized grid builder with scaled index boxes
(index_box = thumb_size // 5, font = box * 0.65, so visual proportions
stay constant across different thumb_size)
- encode_base64: PIL -> base64 JPEG for tool result images
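
The index-box scaling rule can be checked with a few lines of arithmetic. `index_box_metrics` is a hypothetical helper, not a function in agent/tools/utils/image.py, and truncating the font size to an integer is an assumption.

```python
def index_box_metrics(thumb_size: int) -> tuple[int, int]:
    """Return (box side, font size) in pixels for a given thumbnail size."""
    box = thumb_size // 5        # index_box = thumb_size // 5
    font_size = int(box * 0.65)  # font = box * 0.65, truncated (assumption)
    return box, font_size
```

For the two layouts quoted above, a 500px thumbnail gets a 100px box with a 65px font and a 320px thumbnail gets a 64px box with a 41px font, so the label occupies about the same fraction of each cell.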
Fixes a latent font bug at the same time: PingFang.ttc on macOS Sequoia
cannot be opened by PIL/FreeType (cryptic "cannot open resource"), so
search.py and crawler.py were silently rendering collages with the tiny
default bitmap font — Chinese titles showed as near-invisible dots. The
new font candidate list prioritizes Hiragino Sans GB and STHeiti Medium,
both of which PIL can actually read.
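
A candidate-list fallback of this kind could be sketched as follows. The font paths are assumed macOS locations (the commit names the fonts, not their paths), and `load_first_font` and its `loader` parameter are hypothetical; the only real library call is Pillow's `ImageFont.truetype`, which raises `OSError` when a font cannot be opened.

```python
from typing import Any, Callable, Optional, Sequence

# Assumed macOS paths, for illustration only.
FONT_CANDIDATES = (
    "/System/Library/Fonts/Hiragino Sans GB.ttc",
    "/System/Library/Fonts/STHeiti Medium.ttc",
)


def _pil_truetype(path: str, size: int) -> Any:
    # Imported lazily so the sketch can be loaded without Pillow installed.
    from PIL import ImageFont
    return ImageFont.truetype(path, size)


def load_first_font(
    size: int,
    candidates: Sequence[str] = FONT_CANDIDATES,
    loader: Callable[[str, int], Any] = _pil_truetype,
) -> Optional[Any]:
    """Return the first candidate font that opens, or None if none does."""
    for path in candidates:
        try:
            return loader(path, size)
        except OSError:
            # Covers both missing files and "cannot open resource".
            continue
    return None
```

Returning `None` instead of quietly substituting the default bitmap font leaves the decision to the caller, so a missing CJK font shows up as an explicit condition rather than as unreadable labels.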
Refactor search.py and crawler.py to call build_image_grid instead of
maintaining their own ~120-line duplicate collage implementations. No
behavior change besides the font fix.
read_file.py: add a docstring note pointing at read_images for batch use
so the LLM can pick the right tool.
- View comparison for these 7 commits »
2 months ago