AI Agents

AI agents are session-native: they are available on any connected session without importing a separate driver. Start a session, connect any driver, and use session.stagehand.* or session.browserUse.* for AI-powered browsing. Stagehand provides structured actions, extraction, and observation. Browser-use provides autonomous task execution. The same capabilities are also exposed over HTTP via POST /v1/sessions/{id}/automation, using calls such as stagehand.act, stagehand.extract, stagehand.observe, stagehand.agent.execute, and browserUse.agent.execute.
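
To make the HTTP route concrete, here is a minimal sketch of a helper that builds (but does not send) a request for the automation endpoint. The path and the call names come from the paragraph above. The body shape (`call`, `args`) and the helper name `buildAutomationRequest` are assumptions made for illustration, since this section does not document the request schema.

```typescript
// Call names listed in the introduction above.
type AutomationCall =
  | 'stagehand.act'
  | 'stagehand.extract'
  | 'stagehand.observe'
  | 'stagehand.agent.execute'
  | 'browserUse.agent.execute';

interface AutomationRequest {
  method: 'POST';
  path: string;
  // ASSUMPTION: hypothetical body shape; the real schema is not documented here.
  body: { call: AutomationCall; args: unknown[] };
}

// Build a request description for POST /v1/sessions/{id}/automation.
function buildAutomationRequest(
  sessionId: string,
  call: AutomationCall,
  args: unknown[] = [],
): AutomationRequest {
  if (sessionId.length === 0) {
    throw new Error('sessionId must not be empty');
  }
  return {
    method: 'POST',
    path: `/v1/sessions/${encodeURIComponent(sessionId)}/automation`,
    body: { call, args },
  };
}

const automationRequest = buildAutomationRequest('sess_123', 'stagehand.act', [
  'Click the login button',
]);
console.log(automationRequest.path); // /v1/sessions/sess_123/automation
```

Keeping request construction separate from sending lets the same description be passed to whichever HTTP client and authentication scheme the deployment requires.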

Access

import { Bctrl } from '@bctrl/sdk/all';
const bctrl = new Bctrl({ apiKey: process.env.BCTRL_API_KEY! });
const connected = await bctrl.session.playwright();

// AI agents are available on the connected session
await connected.stagehand.act('Click the login button');
await connected.browserUse.agent().execute('Fill in the form');

Quick Example

import { Bctrl } from '@bctrl/sdk/all';
import { z } from 'zod';

const bctrl = new Bctrl({ apiKey: process.env.BCTRL_API_KEY! });

const connected = await bctrl.session.playwright();

// Navigate to a page
await connected.page.goto('https://example.com');

// Use Stagehand for structured AI actions
await connected.stagehand.act('Click the Sign In button');
const data = await connected.stagehand.extract('email address', z.object({
  email: z.string(),
}));

// Use Browser-use for autonomous tasks
const agent = connected.browserUse.agent();
const result = await agent.execute('Find the pricing page and extract plan names');

Methods

Stagehand

Stagehand provides structured AI interactions: perform actions, extract data, observe page elements, and run autonomous agents.

stagehand.act()

Perform an action on the page. The instruction is either a natural language string or a StagehandAction object returned by observe().

instruction

string | StagehandAction

required

Natural language instruction (e.g., “Click the login button”) or a structured action from observe().

options

ActOptions

Action options.

options.page

Page

Target a specific Playwright page instance.

result

ActResult

Action result with success status.

// Natural language
await connected.stagehand.act('Click the "Sign In" button');

// Using a structured action from observe()
const actions = await connected.stagehand.observe('login buttons');
await connected.stagehand.act(actions[0]);

stagehand.extract()

Extract structured data from the page using a natural language instruction and a Zod schema. Called with no arguments, it returns the raw page text instead.

instruction

string

What to extract (e.g., “product prices and names”).

schema

ZodType

Zod schema defining the expected output shape.

options

ExtractOptions

Extraction options.

options.page

Page

Target a specific page.

result

T | { pageText: string }

Extracted data matching the schema, or raw page text if no schema provided.

import { z } from 'zod';

const products = await connected.stagehand.extract(
  'product names and prices',
  z.object({
    products: z.array(z.object({
      name: z.string(),
      price: z.number(),
    })),
  }),
);
console.log(products.products);
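
Because the documented result type is `T | { pageText: string }`, code that handles both call styles needs a way to tell the two branches apart. The sketch below shows one way to do that with a type guard. The names `PageTextResult`, `isPageTextResult`, and `describeResult` are hypothetical, and the sample values stand in for real extract() results.

```typescript
// Shape of the no-argument result, per the result type above.
interface PageTextResult {
  pageText: string;
}

// Narrow an extract() result to the raw-text branch.
// Caveat: a schema that itself defines a string `pageText` field would also match.
function isPageTextResult(value: unknown): value is PageTextResult {
  if (typeof value !== 'object' || value === null) return false;
  return typeof (value as { pageText?: unknown }).pageText === 'string';
}

// Hypothetical stand-ins for values returned by extract().
type ProductList = { products: { name: string; price: number }[] };
const structuredSample: ProductList | PageTextResult = {
  products: [{ name: 'Widget', price: 9.99 }],
};
const rawSample: ProductList | PageTextResult = { pageText: 'Widget $9.99' };

function describeResult(value: ProductList | PageTextResult): string {
  return isPageTextResult(value)
    ? `raw text (${value.pageText.length} chars)`
    : `${value.products.length} product(s)`;
}

console.log(describeResult(structuredSample)); // 1 product(s)
console.log(describeResult(rawSample)); // raw text (12 chars)
```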

stagehand.observe()

Observe the page and return a list of possible actions. Useful for discovering interactive elements before acting on them.

instruction

string

Optional instruction to focus observation (e.g., “navigation links”).

options

ObserveOptions

Observation options.

options.page

Page

Target a specific page.

result

StagehandAction[]

Array of possible actions that can be passed to act().

const actions = await connected.stagehand.observe('navigation links');
for (const action of actions) {
  console.log(action.description, action.selector);
}
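
Since observe() returns a list of candidates, a common next step is choosing one before calling act(). The following is a minimal sketch of that selection step, assuming only the `description` and `selector` fields used in the example above. `ObservedAction` and `pickAction` are hypothetical names, and the sample array stands in for real observe() output.

```typescript
// ASSUMPTION: only the two fields shown in the example above are modeled.
interface ObservedAction {
  description: string;
  selector: string;
}

// Return the first observed action whose description contains the keyword
// (case-insensitive), or undefined when nothing matches.
function pickAction(
  candidates: ObservedAction[],
  keyword: string,
): ObservedAction | undefined {
  const needle = keyword.toLowerCase();
  return candidates.find((candidate) =>
    candidate.description.toLowerCase().includes(needle),
  );
}

// Hypothetical observe() output, for illustration only.
const observedSample: ObservedAction[] = [
  { description: 'Home link', selector: 'nav a[href="/"]' },
  { description: 'Pricing link', selector: 'nav a[href="/pricing"]' },
];

const chosenAction = pickAction(observedSample, 'pricing');
console.log(chosenAction?.selector); // nav a[href="/pricing"]
```

With a live session, the chosen action could then be passed to connected.stagehand.act(), which accepts actions returned by observe().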

stagehand.agent()

Create a reusable Stagehand agent for multi-step autonomous task execution.

config

StagehandAgentConfig

Agent configuration.

config.model

string

LLM to use.

config.maxSteps

number

Maximum number of steps before stopping.

result

StagehandAgent

Agent instance with an execute() method.

const agent = connected.stagehand.agent({
  maxSteps: 20,
});

const result = await agent.execute('Log into the dashboard and download the report');
console.log(result.success);

agent.execute()

Execute an autonomous task with the Stagehand agent. The agent will observe, plan, and act across multiple steps.

instruction

string

required

Task to accomplish.

options

AgentExecuteOptions

Execution options.

options.page

Page

Target a specific page.

result

StagehandAgentResult

Execution result with success status and action history.

const agent = connected.stagehand.agent();
const result = await agent.execute('Find the pricing page and list all plan names');
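
The result carries a success status, so callers that prefer exceptions over status checks can wrap execute() in a small helper. This is a sketch under stated assumptions: `AgentLike`, `AgentResultLike`, and `executeOrThrow` are hypothetical names, the types model only the `success` field, and the stub agent stands in for connected.stagehand.agent().

```typescript
// ASSUMPTION: minimal structural types; the real StagehandAgent and
// StagehandAgentResult types carry more than is modeled here.
interface AgentResultLike {
  success: boolean;
}
interface AgentLike<R extends AgentResultLike> {
  execute(instruction: string): Promise<R>;
}

// Run a task and turn an unsuccessful result into a thrown error,
// so callers can handle failure with ordinary try/catch.
async function executeOrThrow<R extends AgentResultLike>(
  agent: AgentLike<R>,
  instruction: string,
): Promise<R> {
  const outcome = await agent.execute(instruction);
  if (!outcome.success) {
    throw new Error(`Agent task did not succeed: ${instruction}`);
  }
  return outcome;
}

// Hypothetical stub standing in for a real agent.
const stubAgent: AgentLike<AgentResultLike> = {
  execute: async () => ({ success: true }),
};

executeOrThrow(stubAgent, 'Find the pricing page').then((outcome) => {
  console.log(outcome.success); // true
});
```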

stagehand.getMetrics()

Get performance metrics for Stagehand operations in the current session.

result

StagehandMetrics

Metrics including token usage, latency, and step counts.

const metrics = await connected.stagehand.getMetrics();
console.log(metrics);

stagehand.getHistory()

Get the action history for Stagehand operations in the current session.

result

StagehandHistoryEntry[]

Array of past actions with timestamps and results.

const history = await connected.stagehand.getHistory();
for (const entry of history) {
  console.log(entry.action, entry.result);
}

Browser-use

Browser-use agents browse fully autonomously from a natural language task description.

browserUse.agent()

Create a reusable Browser-use agent for natural language task execution.

config

BrowserUseAgentConfig

Agent configuration.

config.llm

string

LLM provider/model to use.

config.useVision

boolean

Enable vision-based navigation.

config.maxSteps

number

Maximum number of steps.

result

BrowserUseAgent

Agent with an execute() method.

const agent = connected.browserUse.agent({
  useVision: true,
  maxSteps: 30,
});

browserUse.codeAgent()

Create a code-based Browser-use agent that generates and executes automation code.

config

BrowserUseCodeAgentConfig

Code agent configuration.

result

BrowserUseCodeAgent

Code agent with an execute() method.

const codeAgent = connected.browserUse.codeAgent();

agent.execute()

Execute a natural language task autonomously. The agent navigates, clicks, fills forms, and extracts data to accomplish the task.

task

string

required

Natural language task description.

options

BrowserUseExecuteOptions

Execution options.

options.maxSteps

number

Override max steps for this execution.

options.inputFiles

Array

Input files to provide to the agent.

options.workspacePath

string

Workspace path for file operations.

result

BrowserUseResult

Execution result.

const agent = connected.browserUse.agent({ maxSteps: 20 });
const result = await agent.execute('Go to Hacker News and find the top 3 stories');
console.log(result);
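
options.maxSteps is described as overriding the agent's max steps for a single execution. The sketch below models that precedence with local types. How the SDK resolves the value internally, and which default applies when neither value is set, are not stated in this section, so `StepLimitConfig` and `resolveMaxSteps` are illustrative assumptions rather than SDK behavior.

```typescript
// ASSUMPTION: simplified local type covering only maxSteps.
interface StepLimitConfig {
  maxSteps?: number;
}

// Resolve the step limit for one run: a per-execution value wins over the
// agent-level value; undefined means neither was set.
function resolveMaxSteps(
  agentConfig: StepLimitConfig,
  executeOptions: StepLimitConfig = {},
): number | undefined {
  return executeOptions.maxSteps ?? agentConfig.maxSteps;
}

console.log(resolveMaxSteps({ maxSteps: 20 }, { maxSteps: 5 })); // 5
console.log(resolveMaxSteps({ maxSteps: 20 })); // 20
```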
