187 lines
6.3 KiB
Text
187 lines
6.3 KiB
Text
# Browser Use
|
|
|
|
Roo Code provides sophisticated browser automation capabilities that let you interact with websites directly from VS Code. This feature enables testing web applications, automating browser tasks, and capturing screenshots without leaving your development environment.
|
|
|
|
|
|
|
|
<div style={{ position: 'relative', paddingBottom: '56.25%', height: 0, overflow: 'hidden' }}>
|
|
<iframe
|
|
src="https://www.youtube.com/embed/SJae206swxA"
|
|
style={{
|
|
position: 'absolute',
|
|
top: 0,
|
|
left: 0,
|
|
width: '100%',
|
|
height: '100%',
|
|
}}
|
|
frameBorder="0"
|
|
allow="autoplay; encrypted-media"
|
|
allowFullScreen
|
|
></iframe>
|
|
</div>
|
|
|
|
<div style={{ marginTop: '20px' }}></div>
|
|
|
|
:::caution Model Support Required
|
|
Browser Use within Roo Code requires the use of Claude Sonnet 3.5 or 3.7
|
|
:::
|
|
|
|
## How Browser Use Works
|
|
|
|
By default, Roo Code uses a built-in browser that:
|
|
- Launches automatically when you ask Roo to visit a website
|
|
- Captures screenshots of web pages
|
|
- Allows Roo to interact with web elements
|
|
- Runs invisibly in the background
|
|
|
|
All of this happens directly within VS Code, with no setup required.
|
|
|
|
## Using Browser Use
|
|
|
|
A typical browser interaction follows this pattern:
|
|
|
|
**Important:** Browser Use requires Claude Sonnet 3.5 or 3.7 model.
|
|
|
|
1. Ask Roo to visit a website
|
|
2. Roo launches the browser and shows you a screenshot
|
|
3. Request additional actions (clicking, typing, scrolling)
|
|
4. Roo closes the browser when finished
|
|
|
|
For example:
|
|
|
|
```
|
|
Open the browser and view our site.
|
|
```
|
|
|
|
```
|
|
Can you check if my website at https://roocode.com is displaying correctly?
|
|
```
|
|
|
|
```
|
|
Browse http://localhost:3000, scroll down to the bottom of the page and check if the footer information is displaying correctly.
|
|
```
|
|
|
|
<img src="/img/browser-use/browser-use-1.png" alt="Browser use example" width="300" />
|
|
|
|
## How Browser Actions Work
|
|
|
|
The browser_action tool controls a browser instance that returns screenshots and console logs after each action, allowing you to see the results of interactions.
|
|
|
|
Key characteristics:
|
|
- Each browser session must start with `launch` and end with `close`
|
|
- Only one browser action can be used per message
|
|
- While the browser is active, no other tools can be used
|
|
- You must wait for the response (screenshot and logs) before performing the next action
|
|
|
|
### Available Browser Actions
|
|
|
|
| Action | Description | When to Use |
|
|
|--------|-------------|------------|
|
|
| `launch` | Opens a browser at a URL | Starting a new browser session |
|
|
| `click` | Clicks at specific coordinates | Interacting with buttons, links, etc. |
|
|
| `type` | Types text into active element | Filling forms, search boxes |
|
|
| `scroll_down` | Scrolls down by one page | Viewing content below the fold |
|
|
| `scroll_up` | Scrolls up by one page | Returning to previous content |
|
|
| `close` | Closes the browser | Ending a browser session |
|
|
|
|
## Browser Use Configuration/Settings
|
|
|
|
:::info Default Browser Settings
|
|
- **Enable browser tool**: Enabled
|
|
- **Viewport size**: Small Desktop (900x600)
|
|
- **Screenshot quality**: 75%
|
|
- **Use remote browser connection**: Disabled
|
|
:::
|
|
|
|
### Accessing Settings
|
|
|
|
To change Browser / Computer Use settings in Roo:
|
|
|
|
1. Open Settings by clicking the gear icon <Codicon name="gear" /> → Browser / Computer Use
|
|
|
|
<img src="/img/browser-use/browser-use.png" alt="Browser settings menu" width="600" />
|
|
|
|
### Enable/Disable Browser Use
|
|
|
|
**Purpose**: Master toggle that enables Roo to interact with websites using a Puppeteer-controlled browser.
|
|
|
|
To change this setting:
|
|
1. Check or uncheck the "Enable browser tool" checkbox within your Browser / Computer Use settings
|
|
|
|
<img src="/img/browser-use/browser-use-2.png" alt="Enable browser tool setting" width="300" />
|
|
|
|
### Viewport Size
|
|
|
|
**Purpose**: Determines the resolution of the browser session Roo Code uses.
|
|
|
|
**Tradeoff**: Higher values provide a larger viewport but increase token usage.
|
|
|
|
To change this setting:
|
|
1. Click the dropdown menu under "Viewport size" within your Browser / Computer Use settings
|
|
2. Select one of the available options:
|
|
- Large Desktop (1280x800)
|
|
- Small Desktop (900x600) - Default
|
|
- Tablet (768x1024)
|
|
- Mobile (360x640)
|
|
2. Select your desired resolution.
|
|
|
|
<img src="/img/browser-use/browser-use-3.png" alt="Viewport size setting" width="600" />
|
|
|
|
### Screenshot Quality
|
|
|
|
**Purpose**: Controls the WebP compression quality of browser screenshots.
|
|
|
|
**Tradeoff**: Higher values provide clearer screenshots but increase token usage.
|
|
|
|
To change this setting:
|
|
1. Adjust the slider under "Screenshot quality" within your Browser / Computer Use settings
|
|
2. Set a value between 1-100% (default is 75%)
|
|
3. Higher values provide clearer screenshots but increase token usage:
|
|
- 40-50%: Good for basic text-based websites
|
|
- 60-70%: Balanced for most general browsing
|
|
- 80%+: Use when fine visual details are critical
|
|
|
|
<img src="/img/browser-use/browser-use-4.png" alt="Screenshot quality setting" width="600" />
|
|
|
|
### Remote Browser Connection
|
|
|
|
**Purpose**: Connect Roo to an existing Chrome browser instead of using the built-in browser.
|
|
|
|
**Benefits**:
|
|
- Works in containerized environments and remote development workflows
|
|
- Maintains authenticated sessions between browser uses
|
|
- Eliminates repetitive login steps
|
|
- Allows use of custom browser profiles with specific extensions
|
|
|
|
**Requirements**: Chrome must be running with remote debugging enabled.
|
|
|
|
To enable this feature:
|
|
1. Check the "Use remote browser connection" box in Browser / Computer Use settings
|
|
2. Click "Test Connection" to verify
|
|
|
|
<img src="/img/browser-use/browser-use-5.png" alt="Remote browser connection setting" width="600" />
|
|
|
|
#### Common Use Cases
|
|
|
|
- **DevContainers**: Connect from containerized VS Code to host Chrome browser
|
|
- **Remote Development**: Use local Chrome with remote VS Code server
|
|
- **Custom Chrome Profiles**: Use profiles with specific extensions and settings
|
|
|
|
#### Connecting to a Visible Chrome Window
|
|
|
|
Connect to a visible Chrome window to observe Roo's interactions in real-time:
|
|
|
|
**macOS**
|
|
```bash
|
|
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug --no-first-run
|
|
```
|
|
|
|
**Windows**
|
|
```bash
|
|
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --user-data-dir=C:\chrome-debug --no-first-run
|
|
```
|
|
|
|
**Linux**
|
|
```bash
|
|
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug --no-first-run
|
|
```
|