Extract Structured Data


Describe what you want and the shape you want it in. The extract invocation runs inside the runtime, reads the page, and returns output validated against your schema - zod in TypeScript, Pydantic in Python.

import { Bctrl } from "@bctrl/sdk";
import { z } from "zod";

const bctrl = new Bctrl({ apiKey: process.env.BCTRL_API_KEY! });

const runtime = await bctrl.runtimes.create({ type: "browser", name: "extract-recipe" });
await bctrl.runtimes.start(runtime.id);

// Point the active tab at the page you want to read.
await bctrl.runtimes.targets.create(runtime.id, {
  uri: "https://news.ycombinator.com",
  activate: true,
});

const invocation = await bctrl.runtimes.invocations.createAndWait(
  runtime.id,
  {
    action: "extract",
    instruction: "Extract the top 5 stories.",
    schema: z.object({
      stories: z.array(
        z.object({
          title: z.string(),
          points: z.number(),
          commentCount: z.number(),
        })
      ),
    }),
  },
  { timeoutMs: 120_000 }
);

console.log(invocation.output); // already matches the schema

await bctrl.runtimes.stop(runtime.id);

The schema is enforced server-side: if the model produces output that doesn’t validate, the invocation fails with invocation.output_validation_failed instead of handing you malformed JSON. In Python, parsed_output is the instantiated Pydantic model, not a dict.
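Because a validation failure is a distinct, named outcome, you may want to retry it while letting every other error propagate. The sketch below is one way to do that, not the SDK's documented behavior: it assumes a failed invocation surfaces as a thrown object carrying a string `code` property equal to `invocation.output_validation_failed`. The helper names `isValidationFailure` and `retryOnValidationFailure` are hypothetical and not part of `@bctrl/sdk`.

```typescript
// Error code quoted in the text above.
const VALIDATION_FAILED = "invocation.output_validation_failed";

// Assumption: a failed invocation is thrown as an object with a string `code`.
// Check the SDK's actual error type before relying on this shape.
function isValidationFailure(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  return (err as { code?: unknown }).code === VALIDATION_FAILED;
}

// Retry `run` only when it fails schema validation; any other error is
// rethrown immediately. After `maxAttempts` validation failures, the last
// error is rethrown.
async function retryOnValidationFailure<T>(
  run: () => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await run();
    } catch (err) {
      if (!isValidationFailure(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

To use it, pass a function that performs the `createAndWait` call shown earlier, for example `retryOnValidationFailure(() => bctrl.runtimes.invocations.createAndWait(...))`. Retrying makes sense here because the model's output can differ between attempts. Timeouts and network errors are deliberately not retried by this helper.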

You can also navigate with your own CDP code first and call extract on whatever page the browser is on - the invocation always acts on the active target.
