This API provides three core capabilities for external agents to interact with the fine-tuned-model evaluation platform.
Base URL: http://localhost:8100/api/agent
Endpoint: GET /api/agent/models
Description: Returns the lists of deployed and deployable models, including model metadata.
Response example:
{
  "deployed_models": [
    {
      "model_id": "全品类_选题_dense",
      "model_name": "31品类-QWEN32B-v2-全参数微调",
      "base_model": "qwen32b",
      "status": "deployed",
      "port": 8101,
      "metadata": {
        "data_source": "31个数据集",
        "training_method": "full",
        "usage": "使用QWEN32B模型在31个品类数据集上进行全参数微调"
      }
    }
  ],
  "available_models": [
    {
      "model_id": "全品类_选题_moe",
      "model_name": "31品类-QWEN30B-MoE",
      "base_model": "qwen30b",
      "status": "available",
      "metadata": {
        "data_source": "31个数据集",
        "training_method": "lora",
        "usage": "MoE架构微调模型"
      }
    }
  ]
}
Usage example:
curl http://localhost:8100/api/agent/models
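For programmatic use, the response above can be reduced to model IDs grouped by status. A minimal sketch (the helper name `summarize_models` is ours, not part of the API), keyed to the `deployed_models` / `available_models` fields shown:

```python
def summarize_models(payload: dict) -> dict:
    """Group model_ids from a /models response by deployment status."""
    return {
        "deployed": [m["model_id"] for m in payload.get("deployed_models", [])],
        "available": [m["model_id"] for m in payload.get("available_models", [])],
    }

# Example with the shape of the documented response:
example = {
    "deployed_models": [{"model_id": "全品类_选题_dense", "status": "deployed"}],
    "available_models": [{"model_id": "全品类_选题_moe", "status": "available"}],
}
print(summarize_models(example))
```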
Endpoint: POST /api/agent/deploy
Description: Requests deployment of the specified model; the system automatically allocates a GPU and port.
Request parameters:
{
  "model_id": "全品类_选题_dense",
  "epoch": 3  // optional: checkpoint epoch
}
Response example:
{
  "deployment_id": "deploy_abc123def456",
  "message": "部署已启动"
}
Notes:
If no epoch is specified, the latest checkpoint is used.
Usage example:
curl -X POST http://localhost:8100/api/agent/deploy \
  -H "Content-Type: application/json" \
  -d '{"model_id": "全品类_选题_dense", "epoch": 3}'
Endpoint: POST /api/agent/inference
Description: Sends a query to a deployed model and returns the generated response.
Request parameters:
{
  "model_id": "全品类_选题_dense",
  "query": "写一篇关于美食的小红书笔记",
  "temperature": 0.7  // optional: 0.0-1.0, default 0.7
}
Response example:
{
  "response": "今天去了一家超棒的川菜馆...",
  "tokens_generated": 156
}
Usage example:
curl -X POST http://localhost:8100/api/agent/inference \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "全品类_选题_dense",
    "query": "写一篇关于旅行的笔记",
    "temperature": 0.8
  }'
All endpoints return standard HTTP status codes:
200: success
400: bad request (invalid model_id, missing parameters, etc.)
404: model not found or not deployed
500: internal server error
Error response format:
{
  "detail": "description of the error"
}
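Since every error response carries a detail field, a client can normalize status codes into exceptions. A minimal sketch (`parse_agent_response` is a hypothetical helper, not part of the API):

```python
def parse_agent_response(status_code: int, body: dict) -> dict:
    """Return the body on 200; otherwise raise with the API's detail message."""
    if status_code == 200:
        return body
    # Non-200 responses carry a "detail" field per the documented error format.
    raise RuntimeError(f"HTTP {status_code}: {body.get('detail', 'unknown error')}")
```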
# 1. List available models
curl http://localhost:8100/api/agent/models
# 2. Deploy a model
curl -X POST http://localhost:8100/api/agent/deploy \
  -H "Content-Type: application/json" \
  -d '{"model_id": "全品类_选题_dense"}'
# 3. Wait for deployment to finish (check status)
curl http://localhost:8100/api/agent/models
# 4. Run inference
curl -X POST http://localhost:8100/api/agent/inference \
  -H "Content-Type: application/json" \
  -d '{"model_id": "全品类_选题_dense", "query": "写美食笔记"}'
To wrap these endpoints as MCP tools, the tool parameters are:
deploy: model_id (required), epoch (optional)
inference: model_id (required), query (required), temperature (optional)
Reference Python client:
import requests
BASE_URL = "http://localhost:8100/api/agent"
def list_models():
    """List deployed and deployable models."""
    response = requests.get(f"{BASE_URL}/models")
    return response.json()

def deploy_model(model_id: str, epoch: int | None = None):
    """Request deployment of a model; omit epoch to use the latest checkpoint."""
    data = {"model_id": model_id}
    # Check against None so that epoch=0 is still sent.
    if epoch is not None:
        data["epoch"] = epoch
    response = requests.post(f"{BASE_URL}/deploy", json=data)
    return response.json()

def inference(model_id: str, query: str, temperature: float = 0.7):
    """Send a query to a deployed model and return the JSON response."""
    data = {
        "model_id": model_id,
        "query": query,
        "temperature": temperature,
    }
    response = requests.post(f"{BASE_URL}/inference", json=data)
    return response.json()
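Step 3 of the workflow above ("wait for deployment") can be automated by polling /models. A sketch under the response shape documented earlier; `is_deployed` and `wait_until_deployed` are our own names, and the poller takes a fetcher callable (e.g. the list_models client above) rather than calling the network directly:

```python
import time
from typing import Callable

def is_deployed(models_payload: dict, model_id: str) -> bool:
    """True if model_id appears under deployed_models in a /models response."""
    return any(m.get("model_id") == model_id
               for m in models_payload.get("deployed_models", []))

def wait_until_deployed(fetch_models: Callable[[], dict], model_id: str,
                        timeout_s: float = 300.0, interval_s: float = 5.0) -> bool:
    """Poll the fetcher until model_id is deployed or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_deployed(fetch_models(), model_id):
            return True
        time.sleep(interval_s)
    return False
```

For example, `wait_until_deployed(list_models, "全品类_选题_dense")` after calling deploy_model blocks until the model shows up in deployed_models.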