---
title: Automation Capabilities
description: Complete method reference for browser runtime automation across all drivers.
---

When you launch a browser runtime, you get access to two layers of functionality:

1. **Runtime base.** Methods available on every runtime regardless of driver: execute steps, stop, live view, recording, events, captcha, and AI agent namespaces.
2. **Driver surface.** The full method set specific to your chosen driver (Playwright, Puppeteer, Stagehand, or Selenium).

## One endpoint, every method

All driver automation goes through a single HTTP endpoint:

```
POST /v1/workspaces/{workspaceId}/execute
```

```json
{
  "runtime": "my-browser",
  "steps": [
    { "call": "page.goto", "args": ["https://example.com"] },
    { "call": "page.screenshot" }
  ]
}
```

The `call` field maps to the method name. `args` is a JSON array of the method's arguments. You can batch multiple steps in one request for efficiency.

**The SDK handles this automatically.** When you write `await runtime.page.goto('https://example.com')`, the SDK translates it into a structured step and sends it to the execute endpoint. You never construct the HTTP payload yourself unless you want to.

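
If you do want to construct the payload by hand, the following is a minimal sketch of what that could look like. It only builds the request path and JSON body described above. The helper name `buildExecuteRequest`, the `Step` and `ExecuteRequest` types, and the workspace ID `ws_123` are illustrative assumptions, not part of the SDK.

```typescript
// A step as described above: `call` is the method name, `args` its arguments.
interface Step {
  call: string;
  args?: unknown[];
}

interface ExecuteRequest {
  path: string;
  body: { runtime: string; steps: Step[] };
}

// Hypothetical helper (not an SDK function): builds the path and JSON body
// for the execute endpoint from a workspace ID, a runtime name, and steps.
function buildExecuteRequest(
  workspaceId: string,
  runtime: string,
  steps: Step[],
): ExecuteRequest {
  return {
    path: `/v1/workspaces/${encodeURIComponent(workspaceId)}/execute`,
    body: { runtime, steps },
  };
}

// "ws_123" is a placeholder workspace ID used only for illustration.
const request = buildExecuteRequest("ws_123", "my-browser", [
  { call: "page.goto", args: ["https://example.com"] },
  { call: "page.screenshot" },
]);

console.log(request.path);
console.log(JSON.stringify(request.body, null, 2));
```

To send it, you would `POST` the serialized body to your API host plus `request.path` with whatever credentials your workspace requires. The host and authentication scheme are not covered in this section, so they are left out of the sketch.
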
## Runtime base

Every browser runtime, regardless of driver, exposes these capabilities:

| Method                     | Description                                          |
| -------------------------- | ---------------------------------------------------- |
| `runtime.run(steps)`       | Execute structured automation steps directly         |
| `runtime.stop()`           | Stop the runtime and release infrastructure          |
| `runtime.live(options?)`   | Get a live interactive view URL                      |
| `runtime.recording()`      | Get a recording replay URL                           |
| `runtime.state()`          | Query runtime state                                  |
| `runtime.events.list()`    | List runtime events                                  |
| `runtime.events.wait()`    | Wait for a specific event                            |
| `runtime.captcha.detect()` | Detect captchas on the page                          |
| `runtime.captcha.solve()`  | Solve a detected captcha                             |
| `runtime.stagehand`        | Stagehand AI agent namespace (act, extract, observe) |
| `runtime.browserUse`       | Browser-use AI agent namespace                       |

See [Runtime Reference](/sdk/browser-capabilities/runtime) for full documentation.

## Driver references

Each driver exposes the native API surface you'd expect, running remotely:

| Driver     | API surface                                                                                 |
| ---------- | ------------------------------------------------------------------------------------------- |
| Playwright | Page, Locator, BrowserContext, Browser, Frame, ElementHandle, Mouse, Keyboard, Touchscreen  |
| Puppeteer  | Page, Locator, Frame, ElementHandle, Mouse, Keyboard, Touchscreen                           |
| Stagehand  | Page, Context, Locator, plus AI agent methods (act, extract, observe)                       |
| Selenium   | WebDriver, WebElement (standard Selenium WebDriver protocol)                                |
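
To make "running remotely" more concrete, here is a rough sketch of how native-looking driver calls could be turned into `call`/`args` steps using a JavaScript `Proxy`. This is an illustration under stated assumptions, not the SDK's actual implementation: `createRecorder` and `RecordedStep` are hypothetical names, and the recorder only collects steps locally rather than sending them anywhere.

```typescript
interface RecordedStep {
  call: string;
  args?: unknown[];
}

// Hypothetical recorder (not the SDK): any property chain that ends in a
// function call appends a { call, args } step to `steps` instead of
// driving a browser directly.
function createRecorder(steps: RecordedStep[], path: string[] = []): any {
  // The proxy target is a function so that the `apply` trap is available.
  return new Proxy(function () {}, {
    get(_target, prop) {
      // Ignore symbol lookups; only string property names form a call path.
      if (typeof prop === "symbol") return undefined;
      return createRecorder(steps, [...path, prop]);
    },
    apply(_target, _thisArg, args: unknown[]) {
      const step: RecordedStep = { call: path.join(".") };
      // Omit `args` entirely when the method was called with no arguments.
      if (args.length > 0) step.args = args;
      steps.push(step);
      return undefined;
    },
  });
}

const steps: RecordedStep[] = [];
const recorder = createRecorder(steps);

// These look like native driver calls but only record structured steps.
recorder.page.goto("https://example.com");
recorder.page.screenshot();

console.log(JSON.stringify(steps));
```

The recorded array has the same shape as the `steps` field of the execute request, so it could be passed to `runtime.run(steps)` or posted to the execute endpoint. A real client would also need to return results and handle errors, which this sketch does not attempt.
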