# LongArticleTaskServer

A server for running experiments and tasks for the long_articles project.

### Start the service

#### Using hypercorn
```aiignore
hypercorn task_app:app --config app_config.toml
```

#### Using Docker
```
docker compose up -d
```

### Project structure
```
.
├── Dockerfile
├── LICENSE
├── README.md
├── app_config.toml
├── applications
│   ├── __init__.py
│   ├── ab_test
│   │   ├── __init__.py
│   │   └── get_cover.py
│   ├── api
│   │   ├── __init__.py
│   │   ├── aliyun_log_api.py
│   │   ├── async_aigc_system_api.py
│   │   ├── async_apollo_api.py
│   │   ├── async_feishu_api.py
│   │   ├── async_piaoquan_api.py
│   │   ├── deep_seek_official_api.py
│   │   └── elastic_search_api.py
│   ├── config
│   │   ├── __init__.py
│   │   ├── aliyun_log_config.py
│   │   ├── deepseek_config.py
│   │   ├── elastic_search_mappings.py
│   │   ├── es_certs.crt
│   │   └── mysql_config.py
│   ├── crawler
│   │   ├── toutiao
│   │   │   ├── __init__.py
│   │   │   ├── blogger.py
│   │   │   ├── detail_recommend.py
│   │   │   ├── main_page_recomend.py
│   │   │   ├── search.py
│   │   │   ├── toutiao.js
│   │   │   └── use_js.py
│   │   └── wechat
│   │       ├── __init__.py
│   │       └── gzh_spider.py
│   ├── database
│   │   ├── __init__.py
│   │   └── mysql_pools.py
│   ├── pipeline
│   │   ├── __init__.py
│   │   ├── crawler_pipeline.py
│   │   └── data_recycle_pipeline.py
│   ├── service
│   │   ├── __init__.py
│   │   └── log_service.py
│   ├── tasks
│   │   ├── __init__.py
│   │   ├── cold_start_tasks
│   │   │   ├── __init__.py
│   │   │   └── article_pool_cold_start.py
│   │   ├── crawler_tasks
│   │   │   ├── __init__.py
│   │   │   └── crawler_toutiao.py
│   │   ├── data_recycle_tasks
│   │   │   ├── __init__.py
│   │   │   └── recycle_daily_publish_articles.py
│   │   ├── llm_tasks
│   │   │   ├── __init__.py
│   │   │   ├── candidate_account_process.py
│   │   │   └── process_title.py
│   │   ├── monitor_tasks
│   │   │   ├── __init__.py
│   │   │   ├── get_off_videos.py
│   │   │   ├── gzh_article_monitor.py
│   │   │   ├── kimi_balance.py
│   │   │   └── task_processing_monitor.py
│   │   ├── task_mapper.py
│   │   ├── task_scheduler.py
│   │   └── task_scheduler_v2.py
│   └── utils
│       ├── __init__.py
│       ├── aigc_system_database.py
│       ├── async_apollo_client.py
│       ├── async_http_client.py
│       ├── async_mysql_utils.py
│       ├── common.py
│       ├── get_cover.py
│       ├── item.py
│       └── response.py
├── dev
│   ├── code.py
│   ├── dev.py
│   ├── run_task_dev.py
│   ├── sample.txt
│   ├── title.json
│   └── totp.py
├── dev.py
├── docker-compose.yaml
├── myapp.log
├── requirements.txt
├── routes
│   ├── __init__.py
│   └── blueprint.py
└── task_app.py
```

### Generate the project structure listing
```
tree -I "__pycache__|*.pyc"
```

## 1. Data tasks

#### Recycle daily published article data
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "daily_publish_articles_recycle"}'
```

#### Update root_source_id for daily published articles
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "update_root_source_id"}'
```

#### Account quality processing
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "candidate_account_quality_analysis"}'
```

## 2. Crawler tasks

#### Crawl articles from Toutiao accounts
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_toutiao"}'
```

#### Crawl articles from Toutiao recommendations
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_toutiao", "method": "recommend"}'
```

#### Crawl accounts via Toutiao search
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_toutiao", "method": "search"}'
```

#### Crawler account management (WeChat)
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_account_manager", "platform": "weixin"}'
```

#### Crawl WeChat articles (account mode)
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_gzh_articles", "account_method": "account_association", "crawl_mode": "account"}'
```

#### Crawl WeChat articles (search mode)
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "crawler_gzh_articles", "account_method": "search", "crawl_mode": "search"}'
```

## 3. Cold-start publishing tasks

#### Publish Toutiao articles
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "article_pool_cold_start", "platform": "toutiao", "crawler_methods": ["toutiao_account_association"]}'
```

#### Publish WeChat official account articles
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "article_pool_cold_start"}'
```

## 4. Other

#### Check Kimi balance
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "check_kimi_balance"}'
```

#### Automatically take down videos
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "get_off_videos"}'
```

#### Check video visibility status
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "check_publish_video_audit_status"}'
```

#### Monitor external service accounts
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "outside_article_monitor"}'
```

#### Monitor articles published by in-house service accounts
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "inner_article_monitor"}'
```

#### Rewrite titles
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "title_rewrite"}'
```

#### Add categories to titles (article pool)
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "article_pool_category_generation", "limit": "1000"}'
```

#### Candidate account quality analysis
```aiignore
curl -X POST http://192.168.142.66:6060/api/run_task -H "Content-Type: application/json" -d '{"task_name": "candidate_account_quality_analysis"}'
```
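
#### Trigger a task from Python (sketch)

Every curl command above sends the same kind of request: a JSON body with a `task_name` plus optional task-specific fields, POSTed to `/api/run_task`. The following is a minimal sketch of the same call using only the Python standard library. The helper names `build_run_task_request` and `run_task` are hypothetical and not part of this repository, and the host and port are simply copied from the curl examples. The response format is not documented here, so the sketch returns the raw body as text.

```python
import json
import urllib.request

# Assumption: same host and port as the curl examples in this README.
BASE_URL = "http://192.168.142.66:6060"


def build_run_task_request(task_name, base_url=BASE_URL, **params):
    """Build (but do not send) a POST request for /api/run_task.

    Hypothetical helper: extra keyword arguments become extra JSON fields,
    mirroring the -d payloads of the curl commands.
    """
    payload = {"task_name": task_name, **params}
    return urllib.request.Request(
        url=f"{base_url}/api/run_task",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_task(task_name, timeout=30, **params):
    """Send the request and return the raw response body as text."""
    request = build_run_task_request(task_name, **params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `run_task("crawler_toutiao", method="recommend")` would send the same payload as the Toutiao recommendation curl command. Building the request separately from sending it makes it possible to inspect the payload without reaching the server.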