Browser-use Agent

Browser-use is an autonomous agent that can handle complex, multi-step browser tasks with minimal guidance.

Access

import { playwright } from '@bctrl/sdk';

const session = await playwright.connect({ apiKey: '...' });

// Create agent
const agent = session.browserUse.agent();

// Execute task
await agent.execute('Research and summarize the latest AI news');

agent()

Create a standard Browser-use agent.

const agent = session.browserUse.agent({
  llm: 'gpt-4o',
  useVision: true,
  maxSteps: 20
});

const result = await agent.execute('Find flights from NYC to LA for next week');

Configuration

const agent = session.browserUse.agent({
  llm: 'gpt-4o',            // LLM that drives the agent
  useVision: true,           // Enable vision capabilities
  maxSteps: 20               // Maximum number of steps the agent may take
});

Execute Options

const result = await agent.execute('Your task', {
  maxSteps: 30               // Override max steps
});
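
One way to use the per-call override is to start with a small step budget and raise it only when a run fails. The sketch below assumes (this is not confirmed by the SDK) that a run which exhausts its budget resolves with `success: false` rather than throwing. `StepLimitedAgent` is a hypothetical structural type mirroring the `execute(task, { maxSteps })` call shown above, and `executeWithBudgets` is an illustrative helper, not part of the SDK.

```typescript
// Minimal shapes mirroring this page; the real SDK types may differ.
interface BrowserUseResult {
  success: boolean;
  message: string;
}

interface StepLimitedAgent {
  execute(task: string, options?: { maxSteps?: number }): Promise<BrowserUseResult>;
}

// Hypothetical helper: try each step budget in order and stop at the first success.
async function executeWithBudgets(
  agent: StepLimitedAgent,
  task: string,
  budgets: number[]
): Promise<BrowserUseResult> {
  let last: BrowserUseResult = { success: false, message: 'No step budgets provided' };
  for (const maxSteps of budgets) {
    last = await agent.execute(task, { maxSteps });
    if (last.success) break;
  }
  return last;
}
```

A call such as `await executeWithBudgets(agent, 'Your task', [10, 30])` would then run the task with at most 10 steps first and 30 steps second. Each attempt is a fresh `execute` call, so this pattern suits tasks that are safe to repeat.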

Return Value

interface BrowserUseResult {
  success: boolean;
  message: string;
  // Additional fields based on task
}
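
Because the additional fields vary by task, it can help to check the two documented fields at runtime before relying on a result. A minimal sketch of such a check follows. `isBrowserUseResult` is a hypothetical helper name, and the index signature used for the extra fields is an assumption rather than the SDK's declared type.

```typescript
// Shape from this page, with an assumed index signature for task-specific fields.
interface BrowserUseResult {
  success: boolean;
  message: string;
  [extra: string]: unknown;
}

// Hypothetical type guard: verifies only the two documented fields.
function isBrowserUseResult(value: unknown): value is BrowserUseResult {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate['success'] === 'boolean' &&
    typeof candidate['message'] === 'string'
  );
}
```

Any task-specific fields remain typed as `unknown` here, so they still need their own checks before use.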

codeAgent()

Create an agent that generates and executes code.

const codeAgent = session.browserUse.codeAgent({
  llm: 'gpt-4o'
});

const result = await codeAgent.execute(
  'Fill out the registration form with realistic test data'
);

The code agent is useful when you need:

- More precise control over actions
- Complex data manipulation
- Custom logic during automation

Use Cases

Research Tasks

const agent = session.browserUse.agent({
  useVision: true,
  maxSteps: 30
});

await session.page.goto('https://google.com');

const result = await agent.execute(`
  Research the top 5 AI companies in 2024.
  For each company, find:
  - Company name
  - Main product/service
  - Recent funding or valuation
  - Key differentiator

  Compile the information into a structured summary.
`);

console.log(result.message);

E-commerce Automation

const agent = session.browserUse.agent({
  maxSteps: 25
});

await session.page.goto('https://amazon.com');

await agent.execute(`
  1. Search for "wireless noise cancelling headphones"
  2. Filter by:
     - Prime eligible
     - 4+ star rating
     - Price under $200
  3. Compare the top 3 options
  4. Add the best value option to cart
  5. Proceed to checkout (stop before payment)
`);

Form Filling

const codeAgent = session.browserUse.codeAgent();

await session.page.goto('https://example.com/signup');

await codeAgent.execute(`
  Fill out the registration form with the following:
  - Name: John Smith
  - Email: [email protected]
  - Phone: (555) 123-4567
  - Address: 123 Main St, New York, NY 10001
  - Select "Marketing" for how you heard about us
  - Check the terms and conditions
  - Submit the form
`);

Data Collection

const agent = session.browserUse.agent({
  useVision: true,
  maxSteps: 50
});

const result = await agent.execute(`
  Go to LinkedIn and search for "Software Engineer" jobs in San Francisco.
  Collect the first 10 job postings with:
  - Job title
  - Company name
  - Salary range (if shown)
  - Key requirements

  Navigate through multiple pages if needed.
`);

// Inspect the agent's report of what it collected
console.log(result.message);
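
If the task prompt also asks the agent to reply with a JSON array, the postings could be parsed into typed objects. This sketch assumes, hypothetically, that the reply text arrives in `result.message` and may be surrounded by prose. `JobPosting` and `parseJobPostings` are illustrative names, not SDK features.

```typescript
// Hypothetical record type matching the fields requested in the prompt above.
interface JobPosting {
  title: string;
  company: string;
  salaryRange?: string;
  requirements: string[];
}

// Parse the text between the first '[' and the last ']' as a JSON array
// and keep only the entries that have a string title and company.
function parseJobPostings(message: string): JobPosting[] {
  const start = message.indexOf('[');
  const end = message.lastIndexOf(']');
  if (start === -1 || end <= start) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(message.slice(start, end + 1));
  } catch {
    return []; // Not valid JSON; the caller can fall back to the raw message.
  }
  if (!Array.isArray(parsed)) return [];

  const postings: JobPosting[] = [];
  for (const item of parsed) {
    if (typeof item !== 'object' || item === null) continue;
    const record = item as Record<string, unknown>;
    const title = record['title'];
    const company = record['company'];
    const salaryRange = record['salaryRange'];
    const requirements = record['requirements'];
    if (typeof title !== 'string' || typeof company !== 'string') continue;
    postings.push({
      title,
      company,
      ...(typeof salaryRange === 'string' ? { salaryRange } : {}),
      requirements: Array.isArray(requirements)
        ? requirements.filter((r): r is string => typeof r === 'string')
        : [],
    });
  }
  return postings;
}
```

An empty array signals that nothing could be parsed, in which case the raw message is still available for manual review.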

Vision Capabilities

When useVision is set to true, the agent can:

- Understand page layout visually
- Identify elements by their appearance
- Handle dynamic content better
- Work with canvas/image-based interfaces

const agent = session.browserUse.agent({
  useVision: true  // Enable visual understanding
});

await agent.execute('Click on the blue "Get Started" button in the hero section');

Best Practices

Be specific about the task

// Bad - too vague
await agent.execute('Buy something');

// Good - specific and structured
await agent.execute(`
  Search for "running shoes size 10"
  Filter by:
  - Brand: Nike
  - Price: under $150
  Add the highest-rated option to cart
`);
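
When task prompts are assembled from variables, a small builder can keep them in the structured form shown above. This is a sketch only: `TaskSpec` and `buildTask` are hypothetical names, and the output is plain text that you would pass to `agent.execute`.

```typescript
// Hypothetical description of a shopping-style task.
interface TaskSpec {
  search: string;
  filters: Record<string, string>;
  finalAction: string;
}

// Render the spec as a search line, a bulleted filter list, and a closing action.
function buildTask(spec: TaskSpec): string {
  const lines = [`Search for "${spec.search}"`];
  const filters = Object.entries(spec.filters);
  if (filters.length > 0) {
    lines.push('Filter by:');
    for (const [label, value] of filters) {
      lines.push(`- ${label}: ${value}`);
    }
  }
  lines.push(spec.finalAction);
  return lines.join('\n');
}

const task = buildTask({
  search: 'running shoes size 10',
  filters: { Brand: 'Nike', Price: 'under $150' },
  finalAction: 'Add the highest-rated option to cart',
});
console.log(task);
```

Keeping the filters as data makes it easy to reuse one prompt layout across many searches.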

Set appropriate maxSteps

// Simple task
const quickAgent = session.browserUse.agent({ maxSteps: 5 });

// Complex multi-page task
const thoroughAgent = session.browserUse.agent({ maxSteps: 30 });

Handle failures gracefully

const result = await agent.execute('...');

if (!result.success) {
  console.log('Task failed:', result.message);
  // Retry or fallback
}
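
The "retry or fallback" step could be wrapped in a small helper. The sketch below assumes a failure may surface either as `success: false` or as a thrown error, which the SDK does not confirm. `TaskAgent` is a hypothetical structural type covering only the `execute(task)` call used on this page, and `executeWithRetry` is an illustrative name.

```typescript
// Minimal shapes mirroring this page; the real SDK types may differ.
interface BrowserUseResult {
  success: boolean;
  message: string;
}

interface TaskAgent {
  execute(task: string): Promise<BrowserUseResult>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retry a task up to `attempts` times with a linear backoff.
// Thrown errors are converted into failed results so both failure modes are retried.
async function executeWithRetry(
  agent: TaskAgent,
  task: string,
  attempts = 3,
  delayMs = 1000
): Promise<BrowserUseResult> {
  let last: BrowserUseResult = { success: false, message: 'No attempts made' };
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      last = await agent.execute(task);
    } catch (error) {
      last = {
        success: false,
        message: error instanceof Error ? error.message : String(error),
      };
    }
    if (last.success) return last;
    // Wait longer after each failed attempt, but not after the final one.
    if (attempt < attempts) await sleep(delayMs * attempt);
  }
  return last;
}
```

The helper returns the last failed result instead of throwing, so the caller can still log `result.message` or switch to a fallback.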

Stagehand vs Browser-use

| Feature | Stagehand | Browser-use |
| --- | --- | --- |
| Single actions | `act()` | - |
| Data extraction | `extract()` with schema | - |
| Multi-step tasks | `agent()` | `agent()` |
| Vision support | Limited | Full |
| Code generation | - | `codeAgent()` |
| Best for | Quick actions, extraction | Complex workflows |

Use Stagehand when:

- You need fast single actions
- You want structured data extraction
- Tasks are well-defined

Use Browser-use when:

- Tasks require many steps
- You need visual understanding
- Workflows span multiple sites
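
The guidance above can be written down as a rule of thumb. This is only a sketch: `TaskTraits` and `chooseTool` are hypothetical names, and the cutoff of three steps is an arbitrary illustrative value, not a documented limit.

```typescript
type Tool = 'stagehand' | 'browser-use';

// Hypothetical task traits, taken from the comparison above.
interface TaskTraits {
  expectedSteps: number;
  needsVision: boolean;
  spansMultipleSites: boolean;
}

// Prefer Browser-use for long, visual, or multi-site work; otherwise Stagehand.
function chooseTool(traits: TaskTraits): Tool {
  const manySteps = traits.expectedSteps > 3; // illustrative cutoff
  if (manySteps || traits.needsVision || traits.spansMultipleSites) {
    return 'browser-use';
  }
  return 'stagehand';
}
```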

Full Example

import { playwright } from '@bctrl/sdk';

async function competitorResearch() {
  const session = await playwright.connect({
    apiKey: process.env.BCTRL_API_KEY
  });

  try {
    const agent = session.browserUse.agent({
      llm: 'gpt-4o',
      useVision: true,
      maxSteps: 40
    });

    const result = await agent.execute(`
      Research our top 3 competitors in the browser automation space:
      1. Browserbase
      2. Browserless
      3. Playwright cloud services

      For each competitor:
      - Visit their website
      - Find their pricing page
      - Note their pricing tiers and features
      - Look for any recent announcements or blog posts

      Compile a comparison summary at the end.
    `);

    console.log('Research complete!');
    console.log(result.message);
  } finally {
    // Always close the session, even if the task throws
    await session.close();
  }
}

// Report errors instead of leaving an unhandled promise rejection
competitorResearch().catch((error) => {
  console.error('Research failed:', error);
});
