# Browser-Use Cloud Browser Mode Guide

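## Table of Contents
- [Introduction](#introduction)
- [Cloud Browser vs Local Browser](#cloud-browser-vs-local-browser)
- [Environment Setup](#environment-setup)
- [Quick Start](#quick-start)
- [Core Concepts](#core-concepts)
- [Examples](#examples)
- [Advanced Usage](#advanced-usage)
- [FAQ](#faq)
- [Best Practices](#best-practices)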

---

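## Introduction

Browser-Use cloud browser mode runs browser automation tasks in the cloud, so you do not need Chrome/Chromium installed locally. It is particularly useful in the following scenarios:

- 🚀 **Headless server deployment** - run on servers that have no graphical interface
- 🌍 **Distributed crawling** - scale out to multiple cloud browser instances
- 💻 **Cross-platform consistency** - avoid differences between local environments
- 🔒 **Security isolation** - the browser runs in an isolated cloud environment
- 📊 **Resource savings** - the browser consumes no local compute resources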

---

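## Cloud Browser vs Local Browser

| Feature | Cloud Browser | Local Browser |
|---------|---------------|---------------|
| **Installation** | No Chrome installation needed | Chrome/Chromium must be installed |
| **Runtime environment** | Cloud | Local machine |
| **Resource usage** | No local resources used | Uses local CPU/memory |
| **Network latency** | May add slight latency | No added network latency |
| **Cost** | Consumes API quota | Free |
| **Debugging** | Live URL for real-time viewing | Browser window is directly visible |
| **Best suited for** | Server deployment, distributed tasks | Local development, debugging |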

---

## Environment Setup

### 1. Install dependencies

```bash
# Install browser-use
pip install browser-use

# Install the extra dependency needed by cloud browser mode
pip install python-socks
```

### 2. Get an API key

1. Visit the [Browser-Use website](https://browser-use.com)
2. Register an account and obtain an API key
3. Add the API key to your `.env` file

### 3. Configure environment variables

Add the following to the `.env` file in the project root:

```bash
# Browser-Use cloud browser API key
BROWSER_USE_API_KEY=your_api_key_here

# Optional: only needed if you use LLM features
GOOGLE_API_KEY=your_google_api_key
GEMINI_API_KEY=your_gemini_api_key
```
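
A missing or placeholder key is easier to diagnose if it is checked before any browser session is started. The following is a minimal sketch of such a check; `require_env` is a hypothetical helper (not part of browser-use or this project), and treating `your_api_key_here` as "unset" is an assumption based on the placeholder used above.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict in tests.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    # Treat the placeholder from the sample .env file as "not configured".
    if not value or value == "your_api_key_here":
        raise RuntimeError(f"{name} is missing or still set to the placeholder value")
    return value
```

Calling `require_env("BROWSER_USE_API_KEY")` right after `load_dotenv()` would fail fast with a readable message instead of a later connection error.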

---

## Quick Start

### A minimal cloud browser example

```python
import asyncio
import os
from dotenv import load_dotenv
from agent.tools.builtin.baseClass import (
    init_browser_session,
    cleanup_browser_session,
    navigate_to_url,
)

# Load environment variables from .env
load_dotenv()

async def main():
    # Initialize the cloud browser (the key setting is use_cloud=True)
    browser, tools = await init_browser_session(
        headless=True,
        use_cloud=True,  # enable the cloud browser
    )

    print("✅ Cloud browser started")

    # Visit a web page
    result = await navigate_to_url("https://www.baidu.com")
    print(f"Navigation result: {result.title}")

    # Clean up
    await cleanup_browser_session()
    print("🧹 Browser closed")

if __name__ == "__main__":
    asyncio.run(main())
```

### Running the examples

```bash
# Run the default example (example 1)
python examples/cloud_browser_example.py

# Run a specific example
python examples/cloud_browser_example.py --example 2

# Run all examples
python examples/cloud_browser_example.py --all
```
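
For readers who want a similar command line in their own scripts, here is one way the `--example` and `--all` flags could be parsed. This is a sketch only: it is not the actual contents of `examples/cloud_browser_example.py`, and `parse_args`, `examples_to_run`, and `EXAMPLE_COUNT` are hypothetical names.

```python
import argparse

EXAMPLE_COUNT = 5  # this guide describes five examples


def parse_args(argv=None):
    """Parse the command line; --example and --all are mutually exclusive."""
    parser = argparse.ArgumentParser(description="Run cloud browser examples")
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--example",
        type=int,
        default=1,
        choices=range(1, EXAMPLE_COUNT + 1),
        help="number of the example to run (default: 1)",
    )
    group.add_argument("--all", action="store_true", help="run every example")
    return parser.parse_args(argv)


def examples_to_run(args):
    """Translate parsed arguments into a list of example numbers."""
    if args.all:
        return list(range(1, EXAMPLE_COUNT + 1))
    return [args.example]
```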

---

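## Core Concepts

### 1. Initializing a cloud browser session

```python
from agent.tools.builtin.baseClass import init_browser_session

# Cloud browser mode
browser, tools = await init_browser_session(
    headless=True,      # cloud browsers normally run headless
    use_cloud=True,     # key parameter: enable the cloud browser
)
```

**Parameters:**
- `headless`: whether to run headless (`True` is recommended for cloud browsers)
- `use_cloud`: whether to use the cloud browser (`True` = cloud browser, `False` = local browser)
- `browser_profile`: optional; presets such as cookies and localStorage
- `**kwargs`: other `BrowserSession` parameters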

### 2. Presetting configuration with BrowserProfile

```python
from browser_use import BrowserProfile

# Create a profile
profile = BrowserProfile(
    # Preset cookies
    cookies=[
        {
            "name": "session_id",
            "value": "abc123",
            "domain": ".example.com",
            "path": "/",
        }
    ],
    # Preset localStorage
    local_storage={
        "example.com": {
            "token": "your_token",
            "user_id": "12345",
        }
    },
    # Custom User-Agent
    user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
)

# Initialize the browser with the profile
browser, tools = await init_browser_session(
    use_cloud=True,
    browser_profile=profile,
)
```

### 3. Available tool functions

The project provides a set of browser operation tools, all of which work with the cloud browser:

#### Navigation tools
```python
# Navigate to a URL
await navigate_to_url("https://example.com")

# Open in a new tab
await navigate_to_url("https://example.com", new_tab=True)

# Search
await search_web("Python async", engine="google")

# Go back to the previous page
await go_back()

# Wait
await wait(seconds=3)
```

#### Element interaction tools
```python
# Click an element (get the element index first)
await click_element(index=5)

# Type text
await input_text(index=0, text="Hello World", clear=True)

# Send key presses
await send_keys("Enter")
await send_keys("Control+A")

# Upload a file
await upload_file(index=7, path="/path/to/file.pdf")
```

#### Page operation tools
```python
# Scroll the page
await scroll_page(down=True, pages=2.0)

# Find text
await find_text("Privacy Policy")

# Take a screenshot
await screenshot()

# Get the page HTML
html_result = await get_page_html()

# Get the interactive elements
selector_result = await get_selector_map()

# Execute JavaScript
result = await evaluate("document.title")
```

#### Tab management
```python
# Switch to a tab
await switch_tab(tab_id="a3f2")

# Close a tab
await close_tab(tab_id="a3f2")
```

#### File operations
```python
# Write a file
await write_file("output.txt", "Hello World")

# Read a file
content = await read_file("input.txt")

# Replace content in a file
await replace_file("config.txt", "old_value", "new_value")
```

---

## Examples

### Example 1: Basic navigation

```python
async def example_basic_navigation():
    """Visit a web page and read page information."""
    browser, tools = await init_browser_session(use_cloud=True)

    # Navigate to Baidu
    await navigate_to_url("https://www.baidu.com")
    await wait(2)

    # Get the page title
    title_result = await evaluate("document.title")
    print(f"Page title: {title_result.output}")

    # Take a screenshot
    await screenshot()

    await cleanup_browser_session()
```

### Example 2: Search and content extraction

```python
async def example_search():
    """Use a search engine and extract content."""
    browser, tools = await init_browser_session(use_cloud=True)

    # Search
    await search_web("Python async programming", engine="google")
    await wait(3)

    # Get the page HTML
    html_result = await get_page_html()
    print(f"HTML length: {len(html_result.metadata.get('html', ''))} characters")

    # Get the interactive elements
    selector_result = await get_selector_map()
    print(selector_result.output)

    await cleanup_browser_session()
```

### Example 3: Using BrowserProfile

```python
async def example_with_profile():
    """Use a preset profile."""
    from browser_use import BrowserProfile

    # Create the profile
    profile = BrowserProfile(
        cookies=[{
            "name": "test_cookie",
            "value": "test_value",
            "domain": ".example.com",
            "path": "/",
        }],
        user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    )

    # Initialize with the profile
    browser, tools = await init_browser_session(
        use_cloud=True,
        browser_profile=profile,
    )

    # Visit a web page
    await navigate_to_url("https://httpbin.org/headers")
    await wait(2)

    # Check the User-Agent
    ua_result = await evaluate("navigator.userAgent")
    print(f"User-Agent: {ua_result.output}")

    await cleanup_browser_session()
```

### Example 4: Form interaction

```python
async def example_form_interaction():
    """Fill in a form."""
    browser, tools = await init_browser_session(use_cloud=True)

    # Visit the form page
    await navigate_to_url("https://httpbin.org/forms/post")
    await wait(2)

    # Get the page elements
    selector_result = await get_selector_map()
    print(f"Found {selector_result.long_term_memory}")

    # Fill in the form according to the actual page structure
    # await input_text(index=0, text="username")
    # await input_text(index=1, text="password")
    # await click_element(index=2)  # submit button

    await cleanup_browser_session()
```

### Example 5: Working with multiple tabs

```python
async def example_multi_tab():
    """Manage multiple tabs."""
    browser, tools = await init_browser_session(use_cloud=True)

    # First tab
    await navigate_to_url("https://www.baidu.com")
    await wait(2)

    # New tab
    await navigate_to_url("https://www.google.com", new_tab=True)
    await wait(2)

    # Read information from the current page
    title_result = await evaluate("document.title")
    print(f"Current title: {title_result.output}")

    await cleanup_browser_session()
```

---

## Advanced Usage

### 1. Watching the cloud browser live

When a cloud browser starts, it logs a Live URL:

```
INFO [cloud] 🔗 Live URL: https://live.browser-use.com?wss=https%3A%2F%2F...
```

Copy this URL into your own browser to watch the cloud browser's actions in real time.
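
If you want to surface the Live URL programmatically (for example, to print it on its own line or forward it to a teammate), one option is to pull it out of the log line. A minimal sketch, assuming the log format matches the line shown above; `extract_live_url` is a hypothetical helper, not part of browser-use.

```python
import re
from typing import Optional

# Matches the "Live URL: <url>" fragment of the startup log line.
LIVE_URL_RE = re.compile(r"Live URL:\s*(https?://\S+)")


def extract_live_url(log_line: str) -> Optional[str]:
    """Return the Live URL found in a log line, or None if there is none."""
    match = LIVE_URL_RE.search(log_line)
    return match.group(1) if match else None
```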

### 2. Error handling

```python
async def example_with_error_handling():
    browser = None
    try:
        browser, tools = await init_browser_session(use_cloud=True)

        result = await navigate_to_url("https://example.com")
        if result.error:
            print(f"Navigation failed: {result.error}")
            return

        # Other operations...

    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Clean up only if the session was actually created
        if browser:
            await cleanup_browser_session()
```
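
Transient failures (a dropped connection, a slow page) are often worth retrying before giving up. The sketch below shows a generic retry wrapper with exponential backoff; `retry_async` is a hypothetical helper and is not provided by the project or by browser-use. The `sleep` parameter exists only so that tests can avoid real delays.

```python
import asyncio


async def retry_async(operation, attempts=3, base_delay=1.0, sleep=asyncio.sleep):
    """Await `operation()` up to `attempts` times with exponential backoff.

    `operation` is a zero-argument callable returning an awaitable.
    Delays are base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception as exc:  # broad on purpose: this is a generic wrapper
            last_error = exc
            if attempt < attempts - 1:
                await sleep(base_delay * 2 ** attempt)
    raise last_error
```

With the tools from this guide, a call might look like `await retry_async(lambda: navigate_to_url("https://example.com"))`. Note that this only retries raised exceptions; a result whose `error` field is set would need an explicit check.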

### 3. Session reuse

```python
# The global session is reused automatically
# The first call creates a new session
browser1, tools1 = await init_browser_session(use_cloud=True)

# Later calls return the same session
browser2, tools2 = await init_browser_session(use_cloud=True)

# browser1 and browser2 are the same object
assert browser1 is browser2
```
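
To make the reuse behaviour concrete, here is one way a "create once, return the same object afterwards" cache can be written. This is an illustrative sketch of the general pattern, not the actual implementation in `baseClass.py`; `SessionCache` and its `factory` argument are hypothetical.

```python
import asyncio


class SessionCache:
    """Create a session on first use and hand out the same object afterwards."""

    def __init__(self, factory):
        self._factory = factory      # zero-argument callable returning an awaitable
        self._session = None
        self._lock = asyncio.Lock()  # prevents two concurrent callers from both creating

    async def get(self):
        async with self._lock:
            if self._session is None:
                self._session = await self._factory()
            return self._session

    def reset(self):
        """Forget the cached session so the next get() creates a new one."""
        self._session = None
```

The lock matters when several coroutines request the session at the same time: without it, each could see `None` and start its own browser.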

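### 4. Force-terminating the browser

```python
from agent.tools.builtin.baseClass import (
    cleanup_browser_session,
    kill_browser_session,
)

# Graceful shutdown (recommended)
await cleanup_browser_session()

# Forced termination (for abnormal situations)
await kill_browser_session()
```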

---

## FAQ

### Q1: Cloud browser fails to start

**Problem:** `python-socks is required to use a SOCKS proxy`

**Solution:**
```bash
pip install python-socks
```

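### Q2: API key is missing or invalid

**Problem:** `未找到 BROWSER_USE_API_KEY` ("BROWSER_USE_API_KEY not found")

**Solution:**
1. Make sure the `.env` file is in the project root
2. Make sure the API key is formatted correctly
3. Make sure your code calls `load_dotenv()`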

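### Q3: Cloud browser connection times out

**Problem:** The cloud browser starts but cannot be reached

**Solution:**
1. Check your network connection
2. Check your firewall settings
3. Try using a proxy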
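
While diagnosing connection problems, it can help to bound how long startup is allowed to take so that a hang turns into a clear error. A minimal sketch using `asyncio.wait_for`; `start_with_timeout` is a hypothetical helper and the 30-second default is an arbitrary assumption.

```python
import asyncio


async def start_with_timeout(start, timeout_s=30.0):
    """Await `start()` but give up after `timeout_s` seconds.

    `start` is a zero-argument callable returning an awaitable, for example
    a lambda that calls the session initializer.
    """
    try:
        return await asyncio.wait_for(start(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Re-raise with a message that names the failing step.
        raise TimeoutError(
            f"browser session did not start within {timeout_s} s"
        ) from None
```

A call might look like `await start_with_timeout(lambda: init_browser_session(use_cloud=True))`.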

### Q4: How to switch back to the local browser

**Solution:**
```python
# Use the local browser
browser, tools = await init_browser_session(
    use_cloud=False,  # or omit the parameter; the default is False
)
```
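
If you switch between modes often (local while developing, cloud on the server), reading the choice from an environment variable avoids editing code. This is a sketch under assumptions: `USE_CLOUD_BROWSER` is a hypothetical variable name that neither browser-use nor this project defines, and `use_cloud_from_env` is a hypothetical helper.

```python
import os


def use_cloud_from_env(env=None, var="USE_CLOUD_BROWSER"):
    """Return True when the (hypothetical) toggle variable is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(var, "").strip().lower() in {"1", "true", "yes", "on"}
```

The result can be passed straight through, for example `init_browser_session(use_cloud=use_cloud_from_env())`. An unset variable yields `False`, matching the local-browser default described above.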

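### Q5: Cloud browser quota limits

**Problem:** What to do when the API quota runs out

**Solution:**
1. Review the pricing plans on the Browser-Use website
2. Upgrade to a plan with a higher quota
3. Optimize your code to cut unnecessary browser operations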

---

## Best Practices

### 1. Use wait sensibly

```python
# ❌ Bad: an overly long fixed wait
await wait(10)

# ✅ Good: wait only as long as the step actually needs
await wait(2)  # page load
await wait(1)  # animation finishes
```
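
An alternative to tuning fixed delays is to poll for the condition you actually care about and stop as soon as it holds. The sketch below is a generic polling helper; `wait_until` is hypothetical and not one of the project's tools, and the default timeout and interval are arbitrary.

```python
import asyncio
import time


async def wait_until(condition, timeout_s=10.0, interval_s=0.25):
    """Poll `condition()` until it returns a truthy value or the timeout expires.

    `condition` is a zero-argument callable returning an awaitable.
    Returns True if the condition was met, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if await condition():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval_s)
```

A condition could, for instance, call `evaluate("document.readyState")` and compare the result with `"complete"`; the exact shape of that result depends on the project's tool wrapper, so treat that as an assumption to verify.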

### 2. Clean up sessions promptly

```python
# ✅ Use try-finally to guarantee cleanup
try:
    browser, tools = await init_browser_session(use_cloud=True)
    # Operations...
finally:
    await cleanup_browser_session()
```
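
The same guarantee can be packaged as an async context manager so that callers cannot forget the `finally` block. A minimal sketch: `browser_session` is a hypothetical wrapper, and the init and cleanup functions are passed in as arguments so the snippet does not depend on the project's module layout.

```python
import asyncio  # only needed by callers that use asyncio.run(...)
from contextlib import asynccontextmanager


@asynccontextmanager
async def browser_session(init, cleanup, **init_kwargs):
    """Start a session with `init(**init_kwargs)` and always run `cleanup()`."""
    session = await init(**init_kwargs)
    try:
        yield session
    finally:
        # Runs on normal exit and when the body raises.
        await cleanup()
```

Usage might look like `async with browser_session(init_browser_session, cleanup_browser_session, use_cloud=True) as (browser, tools): ...`. Cleanup runs only if initialization succeeded, which avoids closing a session that was never opened.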

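### 3. Use BrowserProfile to avoid repeated logins

```python
from browser_use import BrowserProfile

# ✅ Preset cookies so you do not have to log in every time
profile = BrowserProfile(
    cookies=[
        # Cookies saved from an earlier session
    ]
)
browser, tools = await init_browser_session(
    use_cloud=True,
    browser_profile=profile,
)
```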
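
Reusing cookies implies storing them somewhere between runs. One simple option is a JSON file, sketched below. `save_cookies` and `load_cookies` are hypothetical helpers; the sketch assumes cookies are plain dicts like the ones shown earlier in this guide and that an optional `expires` field, when present, is a Unix timestamp in seconds (non-positive values are treated as session cookies).

```python
import json
import time
from pathlib import Path


def save_cookies(cookies, path):
    """Write a list of cookie dicts to a JSON file."""
    Path(path).write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path, now=None):
    """Read cookies back, dropping any whose `expires` timestamp has passed.

    Returns an empty list when the file does not exist yet.
    """
    file = Path(path)
    if not file.exists():
        return []
    now = time.time() if now is None else now
    fresh = []
    for cookie in json.loads(file.read_text(encoding="utf-8")):
        expires = cookie.get("expires")
        # Assumption: positive numbers are absolute expiry times in seconds.
        if isinstance(expires, (int, float)) and 0 < expires < now:
            continue
        fresh.append(cookie)
    return fresh
```

Cookie files contain credentials, so keep them out of version control.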

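### 4. Reuse one session for batch work

```python
# ✅ Handle multiple tasks in one session
urls = ["https://example.com", "https://httpbin.org/headers"]  # example URLs

browser, tools = await init_browser_session(use_cloud=True)
try:
    for url in urls:
        await navigate_to_url(url)
        # Process the page...
finally:
    # Close the session even if one of the pages fails
    await cleanup_browser_session()
```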
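
In longer batches, one failing page should usually not abort the rest. The following sketch isolates errors per URL and reports them at the end; `process_urls` is a hypothetical helper, and `handle` stands for whatever per-page coroutine you write (for example, one that calls `navigate_to_url` and then extracts data).

```python
import asyncio  # only needed by callers that use asyncio.run(...)


async def process_urls(urls, handle):
    """Run `handle(url)` for each URL in order, collecting results and errors.

    Returns (results, errors): two dicts keyed by URL. A failure on one URL
    is recorded and the loop continues with the next one.
    """
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = await handle(url)
        except Exception as exc:  # record and continue with the remaining URLs
            errors[url] = str(exc)
    return results, errors
```

The URLs are processed sequentially on purpose, since the tools in this guide operate on a single shared session.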

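### 5. Debug with the Live URL

```python
# During development, watch the session through the Live URL
# The cloud browser logs the Live URL automatically at startup
# Copy it into your own browser to watch in real time
```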

---

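## Performance Tips

1. **Cut unnecessary waits**
   - Use the shortest wait that is actually needed
   - Avoid long fixed waits

2. **Batch your work**
   - Handle multiple tasks in one session
   - Avoid creating and destroying sessions frequently

3. **Use screenshots sparingly**
   - Take screenshots only when necessary
   - Screenshots add network transfer time

4. **Optimize element lookup**
   - Use `get_selector_map` to fetch all elements in one call
   - Avoid querying the same elements repeatedly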
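
Before trimming waits or screenshots, it helps to measure where the time actually goes. The sketch below accumulates wall-clock time per labelled step; `timed` is a hypothetical helper, and the step labels are whatever you choose.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, timings):
    """Add the elapsed wall-clock time of the `with` body to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Accumulate so that repeated steps with the same label are summed.
        timings[label] = timings.get(label, 0.0) + (time.perf_counter() - start)
```

Because it is a regular context manager, it also works around awaited calls inside a coroutine, for example `with timed("screenshot", timings): await screenshot()`. Printing `timings` at the end of a run shows which steps dominate.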

---

## Support

- **Browser-Use official documentation**: https://docs.browser-use.com
- **GitHub Issues**: https://github.com/browser-use/browser-use/issues
- **Internal project documentation**: see the comments in `agent/tools/builtin/baseClass.py`

---

## Changelog

### v1.0.0 (2026-01-30)
- ✅ Initial release
- ✅ Cloud browser mode support
- ✅ 5 complete examples
- ✅ Complete usage documentation

---

## License

This project is covered by the main project license.