@@ -1,875 +0,0 @@
-# Browser-Use Login Handling: A Complete Guide
-
-## Table of Contents
-- [Overview](#overview)
-- [Login Scenarios](#login-scenarios)
-- [Three Ways to Handle Login](#three-ways-to-handle-login)
-- [Method 1: Manual Login (Recommended)](#method-1-manual-login-recommended)
-- [Method 2: Cookie Reuse](#method-2-cookie-reuse)
-- [Method 3: Automated Login](#method-3-automated-login)
-- [Complete Practical Example](#complete-practical-example)
-- [Login State Detection](#login-state-detection)
-- [Cookie Management Best Practices](#cookie-management-best-practices)
-- [FAQ](#faq)
-- [Security Recommendations](#security-recommendations)
-- [Summary](#summary)
-
----
-
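-## Overview
-
-Many websites require a login before they expose their full content, which matters when you automate them with Browser-Use. This guide explains how to handle the common login scenarios in cloud browser mode.
-
-### Why is login handling needed?
-
-- 🔒 **Content protection** - Many sites show their core content only to logged-in users
-- 🚫 **Anti-scraping** - Anonymous visitors may be rate-limited or blocked
-- 📊 **Personalized data** - Some data is visible only to logged-in users
-- 🎯 **Action permissions** - Posting, commenting, and similar actions require a login
-
----
-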
-## Login Scenarios
-
-### 1. Simple username/password login
-- Enter a username or email
-- Enter a password
-- Click the login button
-- **Examples**: GitHub, Twitter
-
-### 2. QR code login
-- The site displays a QR code
-- The user scans it and confirms
-- **Examples**: WeChat, Xiaohongshu, Taobao
-
-### 3. Verification code login
-- Enter a phone number
-- Receive a verification code
-- Enter the verification code
-- **Examples**: most websites in mainland China
-
-### 4. Third-party login
-- OAuth authorization
-- Redirect to the third-party platform
-- **Examples**: Google login, Facebook login
-
-### 5. Multi-factor authentication (MFA)
-- Password + verification code
-- Password + email confirmation
-- **Examples**: banking sites, enterprise systems
-
----
-
-## Three Ways to Handle Login
-
-| Method | When to use | Pros | Cons |
-|------|---------|------|------|
-| **Manual login** | QR codes, verification codes, complex logins | Most flexible, high success rate | Requires a human in the loop |
-| **Cookie reuse** | Frequent use of the same account | Fast, no repeated logins | Cookies expire |
-| **Automated login** | Simple username/password logins | Fully automated | Easily flagged by anti-bot detection |
-
----
-
-## Method 1: Manual Login (Recommended)
-
-### Core idea
-
-1. Start the cloud browser (non-headless mode)
-2. Navigate to the login page
-3. Pause automation with `wait_for_user_action`
-4. The user completes the login manually through the Live URL
-5. The user presses Enter to resume the automation flow
-
-### Complete code example
-
-```python
-import asyncio
-from dotenv import load_dotenv
-from agent.tools.builtin.baseClass import (
-    init_browser_session,
-    cleanup_browser_session,
-    navigate_to_url,
-    wait,
-    wait_for_user_action,
-    evaluate,
-)
-
-load_dotenv()
-
-async def manual_login_example():
-    """
-    Manual login example - works for every login scenario
-    """
-    try:
-        # Step 1: Initialize the cloud browser (non-headless mode)
-        print("🌐 Starting the cloud browser...")
-        browser, tools = await init_browser_session(
-            headless=False,  # Key: must be False
-            use_cloud=True,
-        )
-
-        print("✅ Cloud browser started")
-        print("📝 Tip: find '🔗 Live URL' in the log and open that link in your browser")
-
-        # Step 2: Navigate to the target site
-        print("\n📍 Navigating to Xiaohongshu...")
-        await navigate_to_url("https://www.xiaohongshu.com")
-        await wait(3)
-
-        # Step 3: Check the login state (optional)
-        print("\n🔍 Checking login state...")
-        check_login_js = """
-        (function() {
-            // Look for a user avatar or username
-            const userAvatar = document.querySelector('[class*="avatar"]');
-            const userName = document.querySelector('[class*="username"]');
-            const loginBtn = document.querySelector('[class*="login"]');
-
-            return {
-                isLoggedIn: !!(userAvatar || userName),
-                hasLoginBtn: !!loginBtn
-            };
-        })()
-        """
-
-        status = await evaluate(check_login_js)
-        print(f"   Login state: {status.output}")
-
-        # Step 4: Wait for the user to log in manually
-        print("\n👤 Waiting for the user to log in...")
-        print("=" * 60)
-        print("Please follow these steps:")
-        print("1. Find '🔗 Live URL' in the log")
-        print("2. Copy the URL and open it in your browser")
-        print("3. Complete the login on the Live URL page (QR code or username/password)")
-        print("4. Once logged in, come back here and press Enter to continue")
-        print("=" * 60)
-
-        await wait_for_user_action(
-            message="Complete the login in the cloud browser, then press Enter to continue",
-            timeout=300  # 5-minute timeout
-        )
-
-        print("\n✅ User confirmed that login is complete")
-
-        # Step 5: Verify the login state
-        print("\n🔍 Verifying login state...")
-        status = await evaluate(check_login_js)
-        print(f"   Login state: {status.output}")
-
-        # Step 6: Continue with the rest of the workflow
-        print("\n🎯 Starting follow-up tasks...")
-        # Continue your automation here,
-        # for example: searching, scraping data, and so on
-
-        print("\n✅ Task complete")
-
-    except Exception as e:
-        print(f"❌ Error: {str(e)}")
-    finally:
-        await cleanup_browser_session()
-
-if __name__ == "__main__":
-    asyncio.run(manual_login_example())
-```
-
-### Key points
-
-1. **headless=False**: Must be `False`; otherwise the page is not visible through the Live URL
-2. **Live URL**: Printed when the cloud browser starts, in the form `https://live.browser-use.com?wss=...`
-3. **wait_for_user_action**: Pauses automation and waits for the user to act
-4. **timeout**: Choose a reasonable timeout (3-5 minutes is recommended)
-
----
-
-## Method 2: Cookie Reuse
-
-### Core idea
-
-1. Log in manually the first time and save the cookies
-2. On later runs, preload the cookies through a BrowserProfile
-3. Skip the login step and go straight to the content
-
-### Step 1: Log in once and save the cookies
-
-```python
-import asyncio
-import json
-from pathlib import Path
-from dotenv import load_dotenv
-from agent.tools.builtin.baseClass import (
-    init_browser_session,
-    cleanup_browser_session,
-    navigate_to_url,
-    wait,
-    wait_for_user_action,
-    evaluate,
-)
-
-load_dotenv()
-
-async def save_cookies_after_login():
-    """
-    Log in once and save the cookies
-    """
-    try:
-        # Initialize the browser
-        browser, tools = await init_browser_session(
-            headless=False,
-            use_cloud=True,
-        )
-
-        print("✅ Cloud browser started")
-
-        # Navigate to the site
-        await navigate_to_url("https://www.xiaohongshu.com")
-        await wait(3)
-
-        # Wait for the user to log in
-        print("\n👤 Please complete the login through the Live URL...")
-        await wait_for_user_action(
-            message="Press Enter to continue once you are logged in",
-            timeout=300
-        )
-
-        # Read the cookies
-        # Note: document.cookie cannot see cookies flagged HttpOnly,
-        # so HttpOnly session cookies will be missing from the saved file
-        print("\n💾 Saving cookies...")
-        get_cookies_js = """
-        (function() {
-            return document.cookie;
-        })()
-        """
-
-        cookies_result = await evaluate(get_cookies_js)
-        cookies_str = cookies_result.output
-
-        # Parse the cookie string into a list
-        cookies = []
-        for cookie_str in cookies_str.split('; '):
-            if '=' in cookie_str:
-                name, value = cookie_str.split('=', 1)
-                cookies.append({
-                    "name": name,
-                    "value": value,
-                    "domain": ".xiaohongshu.com",
-                    "path": "/",
-                })
-
-        # Save to a file
-        cookie_file = Path("cookies_xhs.json")
-        with open(cookie_file, "w", encoding="utf-8") as f:
-            json.dump(cookies, f, ensure_ascii=False, indent=2)
-
-        print(f"✅ Cookies saved to: {cookie_file}")
-        print(f"   {len(cookies)} cookies in total")
-
-    except Exception as e:
-        print(f"❌ Error: {str(e)}")
-    finally:
-        await cleanup_browser_session()
-
-if __name__ == "__main__":
-    asyncio.run(save_cookies_after_login())
-```
-
-### Step 2: Use the saved cookies
-
-```python
-import asyncio
-import json
-from pathlib import Path
-from dotenv import load_dotenv
-from browser_use import BrowserProfile
-from agent.tools.builtin.baseClass import (
-    init_browser_session,
-    cleanup_browser_session,
-    navigate_to_url,
-    wait,
-    evaluate,
-)
-
-load_dotenv()
-
-async def use_saved_cookies():
-    """
-    Use the saved cookies to skip the login
-    """
-    try:
-        # Read the saved cookies
-        cookie_file = Path("cookies_xhs.json")
-        if not cookie_file.exists():
-            print("❌ Cookie file not found; run save_cookies_after_login() first")
-            return
-
-        with open(cookie_file, "r", encoding="utf-8") as f:
-            cookies = json.load(f)
-
-        print(f"✅ Loaded {len(cookies)} cookies")
-
-        # Create a BrowserProfile with the cookies preloaded
-        # (assumes the installed browser_use version accepts `cookies`; verify before use)
-        profile = BrowserProfile(
-            cookies=cookies,
-            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36"
-        )
-
-        # Initialize the browser (with the profile)
-        browser, tools = await init_browser_session(
-            headless=True,  # Headless mode is fine here
-            use_cloud=True,
-            browser_profile=profile,  # Pass the profile in
-        )
-
-        print("✅ Cloud browser started (with cookies)")
-
-        # Go straight to a page that requires login
-        print("\n📍 Visiting Xiaohongshu (should already be logged in)...")
-        await navigate_to_url("https://www.xiaohongshu.com")
-        await wait(3)
-
-        # Verify the login state
-        print("\n🔍 Verifying login state...")
-        check_login_js = """
-        (function() {
-            const userAvatar = document.querySelector('[class*="avatar"]');
-            const userName = document.querySelector('[class*="username"]');
-            return {
-                isLoggedIn: !!(userAvatar || userName)
-            };
-        })()
-        """
-
-        status = await evaluate(check_login_js)
-        print(f"   Login state: {status.output}")
-
-        # Continue with the rest of the workflow
-        print("\n🎯 Starting tasks...")
-        # Your automation tasks...
-
-        print("\n✅ Task complete")
-
-    except Exception as e:
-        print(f"❌ Error: {str(e)}")
-    finally:
-        await cleanup_browser_session()
-
-if __name__ == "__main__":
-    asyncio.run(use_saved_cookies())
-```
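-
-### Caveats for cookie reuse
-
-1. **Expiry**: Cookies have a limited lifetime; once they expire you must log in again
-2. **Security**: Cookie files contain sensitive data; never commit them to Git
-3. **Domain matching**: Each cookie's `domain` must be set correctly
-4. **Regular refresh**: Re-capture the cookies periodically
-5. **HttpOnly cookies**: `document.cookie` cannot read cookies flagged `HttpOnly`, so the file saved in Step 1 may lack the session cookie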
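
To make the domain-matching caveat concrete, here is a small sketch that checks whether a saved cookie's `domain` covers the host you plan to visit. It is a simplified form of cookie domain matching (it ignores IP addresses and public-suffix rules), and both function names are hypothetical helpers, not Browser-Use APIs.

```python
def cookie_domain_matches(cookie_domain: str, host: str) -> bool:
    """Return True if a cookie with `cookie_domain` would apply to `host`.

    A leading dot is ignored, and the host must either equal the cookie
    domain or be a subdomain of it.
    """
    domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == domain or host.endswith("." + domain)


def mismatched_cookies(cookies: list, host: str) -> list:
    """List the names of cookies whose domain does not cover `host`."""
    return [
        c["name"]
        for c in cookies
        if not cookie_domain_matches(c.get("domain", ""), host)
    ]
```

Running `mismatched_cookies(cookies, "www.xiaohongshu.com")` on a loaded cookie file before starting the browser gives an early warning when a file was saved for a different site.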
-
----
-
-## Method 3: Automated Login
-
-### When to use
-
-- Simple username/password logins
-- No CAPTCHA or anti-bot detection
-- Test environments or internal systems
-
-### Example code
-
-```python
-import asyncio
-from dotenv import load_dotenv
-from agent.tools.builtin.baseClass import (
-    init_browser_session,
-    cleanup_browser_session,
-    navigate_to_url,
-    wait,
-    get_selector_map,
-    input_text,
-    click_element,
-    evaluate,
-)
-
-load_dotenv()
-
-async def automated_login_example():
-    """
-    Automated login example (simple scenarios only)
-    """
-    try:
-        # Initialize the browser
-        browser, tools = await init_browser_session(
-            headless=True,
-            use_cloud=True,
-        )
-
-        print("✅ Cloud browser started")
-
-        # Navigate to the login page
-        print("\n📍 Navigating to the login page...")
-        await navigate_to_url("https://example.com/login")
-        await wait(2)
-
-        # Get the page elements
-        print("\n🎯 Getting page elements...")
-        selector_result = await get_selector_map()
-        print(f"   Found {selector_result.long_term_memory}")
-
-        # Note: look up the correct element indexes on the actual page.
-        # This example assumes:
-        # - index 0 is the username field
-        # - index 1 is the password field
-        # - index 2 is the login button
-
-        # Enter the username
-        print("\n📝 Entering the username...")
-        await input_text(index=0, text="your_username", clear=True)
-        await wait(1)
-
-        # Enter the password
-        print("\n🔑 Entering the password...")
-        await input_text(index=1, text="your_password", clear=True)
-        await wait(1)
-
-        # Click the login button
-        print("\n🖱️ Clicking the login button...")
-        await click_element(index=2)
-        await wait(3)
-
-        # Verify the login state
-        print("\n🔍 Verifying login state...")
-        check_login_js = """
-        (function() {
-            // Check whether we were redirected home or user info is present
-            const currentUrl = window.location.href;
-            const userInfo = document.querySelector('[class*="user"]');
-            return {
-                currentUrl: currentUrl,
-                isLoggedIn: !!userInfo
-            };
-        })()
-        """
-
-        status = await evaluate(check_login_js)
-        print(f"   Login state: {status.output}")
-
-        print("\n✅ Login complete")
-
-    except Exception as e:
-        print(f"❌ Error: {str(e)}")
-    finally:
-        await cleanup_browser_session()
-
-if __name__ == "__main__":
-    asyncio.run(automated_login_example())
-```
-
-### Risks of automated login
-
-⚠️ **Warning**: Automated logins are easy to detect and may lead to:
-- Account suspension
-- IP blacklisting
-- CAPTCHA challenges
-- Failed logins
-
-**Recommendation**: Prefer manual login or cookie reuse.
-
----
-
-## Complete Practical Example
-
-### Xiaohongshu search (with login handling)
-
-This end-to-end example shows how to handle the Xiaohongshu login and then run a search.
-
-```python
-# Full code: see example_6_xhs_search_save() in examples/cloud_browser_example.py
-```
-
-**Execution flow**:
-
-1. ✅ Start the cloud browser (non-headless mode)
-2. 📍 Open the Xiaohongshu home page
-3. 🔍 Check the login state
-4. 👤 Wait for the user to log in manually (through the Live URL)
-5. ✅ The user confirms the login is complete
-6. 🔍 Run the search
-7. 📜 Scroll to load more content
-8. 📊 Extract the search results
-9. 💾 Save them to a JSON file
-
-**How to run**:
-
-```bash
-python examples/cloud_browser_example.py --example 6
-```
-
----
-
-## Login State Detection
-
-### Generic detection methods
-
-```javascript
-// Method 1: Check for user-related elements
-(function() {
-    const userAvatar = document.querySelector('[class*="avatar"]');
-    const userName = document.querySelector('[class*="username"]');
-    const userMenu = document.querySelector('[class*="user-menu"]');
-    const loginBtn = document.querySelector('[class*="login"]');
-
-    return {
-        isLoggedIn: !!(userAvatar || userName || userMenu),
-        hasLoginBtn: !!loginBtn
-    };
-})()
-
-// Method 2: Check cookies
-(function() {
-    const cookies = document.cookie;
-    const hasSessionCookie = cookies.includes('session') ||
-                             cookies.includes('token') ||
-                             cookies.includes('auth');
-    return {
-        hasCookie: hasSessionCookie,
-        // An empty cookie string still splits into one element, so guard it
-        cookieCount: cookies ? cookies.split(';').length : 0
-    };
-})()
-
-// Method 3: Check localStorage
-(function() {
-    const hasToken = !!localStorage.getItem('token') ||
-                     !!localStorage.getItem('user') ||
-                     !!localStorage.getItem('auth');
-    return {
-        hasLocalStorageAuth: hasToken
-    };
-})()
-```
-
-### Site-specific detection
-
-Each site signals its login state differently, so adjust the check to the site you are targeting:
-
-```python
-# Xiaohongshu
-check_login_js = """
-(function() {
-    const userAvatar = document.querySelector('.user-avatar');
-    return { isLoggedIn: !!userAvatar };
-})()
-"""
-
-# Zhihu
-check_login_js = """
-(function() {
-    const userLink = document.querySelector('.AppHeader-userInfo');
-    return { isLoggedIn: !!userLink };
-})()
-"""
-
-# Weibo
-check_login_js = """
-(function() {
-    const userName = document.querySelector('.gn_name');
-    return { isLoggedIn: !!userName };
-})()
-"""
-```
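
The three site-specific snippets differ only in their selector, so they can be generated from a lookup table. The sketch below assumes the selectors shown above are still current (they may break whenever a site changes its markup); `SITE_SELECTORS` and `build_login_check_js` are hypothetical names introduced for illustration.

```python
import json

# Selectors copied from the site-specific examples above.
SITE_SELECTORS = {
    "xiaohongshu": ".user-avatar",
    "zhihu": ".AppHeader-userInfo",
    "weibo": ".gn_name",
}


def build_login_check_js(site: str) -> str:
    """Build the login-check snippet for `site` from SITE_SELECTORS.

    Raises KeyError for a site that has no registered selector.
    """
    selector = SITE_SELECTORS[site]
    # json.dumps yields a correctly quoted and escaped string literal.
    return (
        "(function() {\n"
        f"    const el = document.querySelector({json.dumps(selector)});\n"
        "    return { isLoggedIn: !!el };\n"
        "})()"
    )
```

The returned string can be passed to `evaluate` in the same way as the hand-written `check_login_js` values.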
-
----
-
-## Cookie Management Best Practices
-
-### 1. Cookie storage structure
-
-```json
-{
-  "website": "xiaohongshu.com",
-  "saved_at": "2026-01-30T10:00:00",
-  "expires_at": "2026-03-01T10:00:00",
-  "cookies": [
-    {
-      "name": "session_id",
-      "value": "abc123...",
-      "domain": ".xiaohongshu.com",
-      "path": "/",
-      "secure": true,
-      "httpOnly": true
-    }
-  ]
-}
-```
-
-### 2. Cookie manager class
-
-```python
-import json
-from pathlib import Path
-from datetime import datetime, timedelta
-from typing import List, Dict, Optional
-
-from browser_use import BrowserProfile
-
-class CookieManager:
-    """Cookie manager"""
-
-    def __init__(self, storage_dir: str = "cookies"):
-        self.storage_dir = Path(storage_dir)
-        self.storage_dir.mkdir(parents=True, exist_ok=True)
-
-    def save_cookies(
-        self,
-        website: str,
-        cookies: List[Dict],
-        expires_days: int = 30
-    ):
-        """Save cookies"""
-        cookie_file = self.storage_dir / f"{website}.json"
-
-        data = {
-            "website": website,
-            "saved_at": datetime.now().isoformat(),
-            "expires_at": (datetime.now() + timedelta(days=expires_days)).isoformat(),
-            "cookies": cookies
-        }
-
-        with open(cookie_file, "w", encoding="utf-8") as f:
-            json.dump(data, f, ensure_ascii=False, indent=2)
-
-        print(f"✅ Cookies saved: {cookie_file}")
-
-    def load_cookies(self, website: str) -> Optional[List[Dict]]:
-        """Load cookies; returns None if the file is missing or expired"""
-        cookie_file = self.storage_dir / f"{website}.json"
-
-        if not cookie_file.exists():
-            print(f"❌ Cookie file not found: {cookie_file}")
-            return None
-
-        with open(cookie_file, "r", encoding="utf-8") as f:
-            data = json.load(f)
-
-        # Check whether the saved cookies have expired
-        expires_at = datetime.fromisoformat(data["expires_at"])
-        if datetime.now() > expires_at:
-            print(f"⚠️ Cookies expired: {cookie_file}")
-            return None
-
-        print(f"✅ Cookies loaded: {len(data['cookies'])}")
-        return data["cookies"]
-
-    def is_valid(self, website: str) -> bool:
-        """Check whether the saved cookies are still valid"""
-        cookies = self.load_cookies(website)
-        return cookies is not None
-
-# Usage example
-cookie_manager = CookieManager()
-
-# Save cookies
-cookie_manager.save_cookies(
-    website="xiaohongshu",
-    cookies=[...],  # Replace with the real list of cookie dicts
-    expires_days=30
-)
-
-# Load cookies
-cookies = cookie_manager.load_cookies("xiaohongshu")
-if cookies:
-    # Use the cookies
-    profile = BrowserProfile(cookies=cookies)
-```
-
-### 3. Refreshing cookies automatically
-
-```python
-# Assumes the imports and the CookieManager class from the previous examples
-
-def parse_cookies(cookie_str: str, website: str) -> list:
-    """Parse a document.cookie string into a list of cookie dicts"""
-    cookies = []
-    for pair in cookie_str.split("; "):
-        if "=" in pair:
-            name, value = pair.split("=", 1)
-            cookies.append({
-                "name": name,
-                "value": value,
-                "domain": f".{website}.com",
-                "path": "/",
-            })
-    return cookies
-
-async def auto_refresh_cookies(website: str, cookie_manager: CookieManager):
-    """Refresh cookies automatically"""
-
-    # Load once; load_cookies returns None when the file is missing or expired
-    cookies = cookie_manager.load_cookies(website)
-    if cookies is not None:
-        print("✅ Cookies are valid, using them directly")
-        return cookies
-
-    # Cookies are invalid, so log in again
-    print("⚠️ Cookies are invalid, a new login is required")
-
-    # Start the browser and wait for the user to log in
-    browser, tools = await init_browser_session(
-        headless=False,
-        use_cloud=True,
-    )
-
-    try:
-        await navigate_to_url(f"https://www.{website}.com")
-        await wait(3)
-
-        await wait_for_user_action(
-            message="Complete the login, then press Enter",
-            timeout=300
-        )
-
-        # Read the new cookies
-        get_cookies_js = "(function() { return document.cookie; })()"
-        cookies_result = await evaluate(get_cookies_js)
-
-        # Parse and save
-        cookies = parse_cookies(cookies_result.output, website)
-        cookie_manager.save_cookies(website, cookies)
-    finally:
-        # Always release the browser session, even if the login step fails
-        await cleanup_browser_session()
-
-    return cookies
-```
-
----
-
-## FAQ
-
-### Q1: Where do I find the Live URL?
-
-**A**: The cloud browser prints it to the log on startup, in this form:
-
-```
-INFO [cloud] 🔗 Live URL: https://live.browser-use.com?wss=https%3A%2F%2F...
-```
-
-Copy this URL and open it in your browser.
-
-### Q2: Why won't the Live URL open?
-
-**A**: Possible causes:
-1. Network problems - check your connection
-2. The browser has been closed - make sure the cloud browser is still running
-3. The URL was copied incompletely - make sure you copied the whole URL
-
-### Q3: How long do cookies last?
-
-**A**: Expiry differs from site to site:
-- Session cookies: invalid once the browser closes
-- Persistent cookies: typically 7-30 days
-- Re-capturing cookies regularly (weekly) is recommended
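
When cookies are exported in a format that records expiry, stale entries can be filtered out before reuse. This sketch assumes each cookie dict may carry an `expires` field holding a Unix timestamp in seconds, which is an assumption about the export format; cookies captured from `document.cookie` carry no expiry information at all. The function name `split_by_expiry` is hypothetical.

```python
import time
from typing import Optional


def split_by_expiry(cookies: list, now: Optional[float] = None):
    """Split cookies into (still_valid, expired) lists.

    Cookies without an `expires` field, or with a non-positive value,
    are treated as session cookies and kept.
    """
    now = time.time() if now is None else now
    valid, expired = [], []
    for cookie in cookies:
        expires = cookie.get("expires")
        if expires is not None and 0 < expires <= now:
            expired.append(cookie)
        else:
            valid.append(cookie)
    return valid, expired
```

If the expired list contains the site's session cookie, that is a signal to fall back to a manual login rather than start a run that will be redirected to the login page.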
-
-### Q4: How do I handle verification codes?
-
-**A**: Verification codes must be handled manually:
-1. Pause with `wait_for_user_action`
-2. The user completes the verification code through the Live URL
-3. Press Enter to continue
-
-### Q5: Can I log in to several accounts at once?
-
-**A**: Yes, but you need to:
-1. Save a separate cookie file for each account
-2. Use a different BrowserProfile for each account
-3. Or use multiple browser sessions
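
One way to keep per-account cookie files apart is to derive the file name from both the site and the account. This is only a sketch of a naming scheme; `cookie_file_for` and the `website__account.json` pattern are hypothetical choices, not something Browser-Use prescribes.

```python
import re
from pathlib import Path


def cookie_file_for(storage_dir: str, website: str, account: str) -> Path:
    """Return a separate cookie file path for each (website, account) pair."""
    # Keep only characters that are safe in file names on common platforms.
    safe_account = re.sub(r"[^A-Za-z0-9_.-]", "_", account)
    return Path(storage_dir) / f"{website}__{safe_account}.json"
```

Each account then gets its own file, and the cookies loaded from it can be handed to a separate BrowserProfile.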
-
-### Q6: How do I keep the session alive after logging in?
-
-**A**:
-1. Do not call `cleanup_browser_session()` until all tasks are finished
-2. The browser session stays alive in the meantime
-3. You can keep running further tasks in it
-
----
-
-## Security Recommendations
-
-### 1. Protect sensitive information
-
-```python
-# ❌ Do not hard-code passwords
-password = "my_password"
-
-# ✅ Use environment variables
-import os
-password = os.getenv("MY_PASSWORD")
-
-# ✅ Use a config file (not committed to Git)
-import json
-with open("config.json") as f:
-    config = json.load(f)
-    password = config["password"]
-```
-
-### 2. Cookie file safety
-
-```bash
-# Add to .gitignore
-cookies/
-*.cookies.json
-cookies_*.json
-config.json
-.env
-```
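
Beyond keeping cookie files out of Git, it can help to restrict who can read them on disk. The sketch below writes JSON with owner-only permissions using the standard library; `write_private_json` is a hypothetical helper, and the permission bits are only meaningful on POSIX systems (on Windows `os.chmod` has limited effect).

```python
import json
import os
from pathlib import Path


def write_private_json(path: Path, data) -> None:
    """Write `data` as JSON to `path`, readable and writable by the owner only."""
    # Create the file with mode 0o600; the mode argument only applies
    # when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    # Tighten permissions on a file that already existed with a looser mode.
    os.chmod(path, 0o600)
```

A cookie-saving routine could call this in place of a plain `open(..., "w")` so that other local users cannot read the session data.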
-
-### 3. Use a proxy
-
-```python
-from browser_use import BrowserProfile
-from browser_use.browser.profile import ProxySettings
-
-proxy = ProxySettings(
-    server="http://proxy.example.com:8080",
-    username="user",
-    password="pass"
-)
-
-profile = BrowserProfile(proxy=proxy)
-```
-
-### 4. Limit login frequency
-
-```python
-import time
-
-# Avoid logging in too often
-last_login_time = 0
-MIN_LOGIN_INTERVAL = 3600  # 1 hour
-
-def should_login():
-    """Return True at most once per MIN_LOGIN_INTERVAL seconds"""
-    global last_login_time
-    current_time = time.time()
-    if current_time - last_login_time < MIN_LOGIN_INTERVAL:
-        return False
-    last_login_time = current_time
-    return True
-```
-
----
-
-## Summary
-
-### Login handling decision tree
-
-```
-Login required?
-├─ Yes
-│  ├─ Saved cookies available?
-│  │  ├─ Yes → Use cookie reuse (Method 2)
-│  │  └─ No → continue
-│  ├─ QR code / verification code login?
-│  │  ├─ Yes → Use manual login (Method 1)
-│  │  └─ No → continue
-│  └─ Simple username/password?
-│     ├─ Yes → Try automated login (Method 3)
-│     └─ No → Use manual login (Method 1)
-└─ No → Access directly
-```
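
The decision tree translates directly into a small function. This is an illustrative sketch: `choose_login_method` and the `login_type` labels (`"password"`, `"qr"`, `"sms"`, and so on) are hypothetical names chosen for the example.

```python
def choose_login_method(
    needs_login: bool,
    has_saved_cookies: bool,
    login_type: str,
) -> str:
    """Walk the decision tree and return the suggested approach.

    `login_type` is a label such as "password", "qr", "sms", "oauth",
    or "mfa"; anything other than "password" falls back to manual login.
    """
    if not needs_login:
        return "direct access"
    if has_saved_cookies:
        return "cookie reuse (Method 2)"
    if login_type in ("qr", "sms"):
        return "manual login (Method 1)"
    if login_type == "password":
        return "automated login (Method 3)"
    return "manual login (Method 1)"
```

Note that saved cookies take priority over the login type, matching the order of the branches in the tree.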
-
-### Recommended workflow
-
-1. **First use**: manual login + save the cookies
-2. **Day-to-day use**: cookie reuse
-3. **Cookies expired**: manual login + update the cookies
-
-### Best practices
-
-- ✅ Prefer manual login (the most reliable option)
-- ✅ Save cookies so they can be reused
-- ✅ Refresh cookies regularly
-- ✅ Use non-headless mode to make debugging easier
-- ✅ Set a reasonable timeout
-- ❌ Avoid hard-coding passwords
-- ❌ Do not run automated logins frequently
-- ❌ Do not commit cookie files to Git
-
----
-
-**Last updated**: 2026-01-30
-**Related documentation**: [CLOUD_BROWSER_GUIDE.md](./CLOUD_BROWSER_GUIDE.md)