1822 connect to remote browser services by l2ysho · Pull Request #3545 · apify/crawlee
and others added 14 commits
March 18, 2026 14:46… support
# Task 1: Type Definitions & LaunchContext `isRemote` Flag
## Goal
Add the foundational types and the `isRemote` flag that all other remote browser tasks depend on.
## Dependencies
None — this is the foundation task.
## Scope
### 1. Add `isRemote` to `LaunchContext`
**File:** `packages/browser-pool/src/launch-context.ts`
- Add `isRemote?: boolean` to the `LaunchContextOptions` interface (alongside `id`, `browserPlugin`, etc.)
- Add a public readonly `isRemote: boolean` property to the `LaunchContext` class
- Set it from constructor options, defaulting to `false`
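The three bullets above can be sketched as follows. This is a minimal illustration, not the actual crawlee class: `LaunchContextSketch` and `LaunchContextOptionsSketch` are hypothetical, trimmed-down stand-ins that keep only `id` and the new flag.

```typescript
// Hypothetical stand-ins for LaunchContextOptions / LaunchContext.
interface LaunchContextOptionsSketch {
    id?: string;
    isRemote?: boolean;
}

class LaunchContextSketch {
    readonly id?: string;
    readonly isRemote: boolean;

    constructor(options: LaunchContextOptionsSketch = {}) {
        this.id = options.id;
        // Treat the context as a local launch unless the caller opts in.
        this.isRemote = options.isRemote ?? false;
    }
}
```

Using `??` rather than `||` keeps the intent explicit: only a missing value falls back to `false`.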
### 2. Define connect option types on PlaywrightPlugin
**File:** `packages/browser-pool/src/playwright/playwright-plugin.ts`
Add the following type to the plugin file (or a co-located types file):
```typescript
// Mirrors browserType.connectOverCDP(endpointURL, options)
interface PlaywrightConnectOverCDPOptions {
    endpointURL: string;
    options?: Parameters<BrowserType['connectOverCDP']>[1];
}
// Mirrors browserType.connect(wsEndpoint, options)
interface PlaywrightConnectOptions {
    wsEndpoint: string;
    options?: Parameters<BrowserType['connect']>[1];
}
```
Use the existing `Parameters` utility type pattern (see how `SafeParameters` is used elsewhere in the codebase) — do NOT redefine Playwright's types manually.
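As a rough illustration of the extraction pattern, the sketch below applies `Parameters` to a hypothetical `FakeBrowserType` interface instead of Playwright's real `BrowserType`, so it compiles without the library installed. The option fields shown (`timeout`, `headers`) are assumptions made for the example only.

```typescript
// Hypothetical stand-in for Playwright's BrowserType; only the shape matters here.
interface FakeBrowserType {
    connect(
        wsEndpoint: string,
        options?: { timeout?: number; headers?: Record<string, string> },
    ): Promise<unknown>;
}

// Second parameter of connect(), extracted rather than redeclared by hand.
type FakeConnectSecondArg = Parameters<FakeBrowserType['connect']>[1];

interface FakeConnectOptions {
    wsEndpoint: string;
    options?: FakeConnectSecondArg;
}

const exampleConnect: FakeConnectOptions = {
    wsEndpoint: 'ws://example.invalid/playwright',
    options: { timeout: 30000 },
};
```

One caveat worth keeping in mind: when a method has several overloads, `Parameters` resolves against the last declared overload, so the extracted type depends on the order in which the library declares them.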
### 3. Define connect option types on PuppeteerPlugin
**File:** `packages/browser-pool/src/puppeteer/puppeteer-plugin.ts`
```typescript
// Mirrors puppeteer.connect({ browserWSEndpoint, ...rest })
// Flat object matching Puppeteer's ConnectOptions
type PuppeteerConnectOverCDPOptions = Parameters<typeof puppeteer.connect>[0];
```
Use the `Parameters` pattern to extract the type from Puppeteer's `connect` method.
### 4. Add connect option fields to `BrowserPluginOptions`
**File:** `packages/browser-pool/src/abstract-classes/browser-plugin.ts`
This is a design choice — the PRD says connect options live on the plugin subclass, not on `LaunchContext`. Add the fields to the plugin options type so they flow through the constructor:
- `PlaywrightPlugin` options should accept `connectOptions?` and `connectOverCDPOptions?`
- `PuppeteerPlugin` options should accept `connectOverCDPOptions?`
These can be added to subclass-specific option types rather than the base `BrowserPluginOptions`.
### 5. Add connect option fields to launcher-level interfaces
**File:** `packages/playwright-crawler/src/internals/playwright-launcher.ts`
Add to `PlaywrightLaunchContext`:
```typescript
connectOptions?: PlaywrightConnectOptions;
connectOverCDPOptions?: PlaywrightConnectOverCDPOptions;
```
**File:** `packages/puppeteer-crawler/src/internals/puppeteer-launcher.ts`
Add to `PuppeteerLaunchContext`:
```typescript
connectOverCDPOptions?: PuppeteerConnectOverCDPOptions;
```
This enables IDE autocomplete when users configure `launchContext` on the crawler.
### 6. Export new types
**File:** `packages/browser-pool/src/index.ts`
Export the new connect option types so they're available to consumers.
## Key Files
| File | Change |
|------|--------|
| `packages/browser-pool/src/launch-context.ts` | Add `isRemote` option + property |
| `packages/browser-pool/src/playwright/playwright-plugin.ts` | Add connect option types |
| `packages/browser-pool/src/puppeteer/puppeteer-plugin.ts` | Add connect option type |
| `packages/playwright-crawler/src/internals/playwright-launcher.ts` | Add connect options to `PlaywrightLaunchContext` |
| `packages/puppeteer-crawler/src/internals/puppeteer-launcher.ts` | Add connect options to `PuppeteerLaunchContext` |
| `packages/browser-pool/src/index.ts` | Export new types |
| `packages/browser-crawler/src/internals/browser-launcher.ts` | May need connect options on `BrowserLaunchContext` base |
## Acceptance Criteria
- [x] `LaunchContext` has `isRemote` boolean property, defaults to `false`
- [x] Connect option types are defined using library `Parameters` extraction (not manual redefinition)
- [x] `PlaywrightLaunchContext` shows `connectOptions` and `connectOverCDPOptions` in IDE autocomplete
- [x] `PuppeteerLaunchContext` shows `connectOverCDPOptions` in IDE autocomplete
- [x] New types are exported from `@crawlee/browser-pool`
- [x] TypeScript compiles with no errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…and `connectOverCDP()`
# Task 2: PlaywrightPlugin Remote Connection Routing
## Goal
Make `PlaywrightPlugin._launch()` branch to `connect()` or `connectOverCDP()` when remote connection options are present, instead of calling `launch()`.
## Dependencies
- Task 1 (types and `isRemote` flag)
## Scope
### 1. Store connect options on the plugin instance
**File:** `packages/browser-pool/src/playwright/playwright-plugin.ts`
- Accept `connectOptions` and `connectOverCDPOptions` in the constructor options
- Store them as instance properties
- **Validation:** If both `connectOptions` AND `connectOverCDPOptions` are provided, throw an error immediately in the constructor:
```
Cannot set both 'connectOptions' and 'connectOverCDPOptions' — pick one protocol.
```
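A minimal sketch of the fail-fast check, assuming a simplified plugin class. `RemotePluginSketch` and its option types are hypothetical, and the error text is abbreviated relative to the message specified above.

```typescript
// Hypothetical, simplified option shapes.
interface WsConnectSketch { wsEndpoint: string }
interface CdpConnectSketch { endpointURL: string }

interface RemotePluginOptionsSketch {
    connectOptions?: WsConnectSketch;
    connectOverCDPOptions?: CdpConnectSketch;
}

class RemotePluginSketch {
    readonly connectOptions?: WsConnectSketch;
    readonly connectOverCDPOptions?: CdpConnectSketch;

    constructor(options: RemotePluginOptionsSketch = {}) {
        // Reject ambiguous configuration before any browser work starts.
        if (options.connectOptions && options.connectOverCDPOptions) {
            throw new Error("Cannot set both 'connectOptions' and 'connectOverCDPOptions'");
        }
        this.connectOptions = options.connectOptions;
        this.connectOverCDPOptions = options.connectOverCDPOptions;
    }
}
```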
### 2. Branch in `_launch()` for remote connections
**File:** `packages/browser-pool/src/playwright/playwright-plugin.ts`
In the existing `_launch()` method (currently lines 22-102), add branching logic **before** the existing local launch code:
```typescript
protected async _launch(launchContext: LaunchContext<...>): Promise<Browser> {
    // Remote CDP connection
    if (this.connectOverCDPOptions) {
        const { endpointURL, options } = this.connectOverCDPOptions;
        const browser = await browserType.connectOverCDP(endpointURL, options);
        return browser;
    }
    // Remote Playwright WebSocket connection
    if (this.connectOptions) {
        const { wsEndpoint, options } = this.connectOptions;
        const browser = await browserType.connect(wsEndpoint, options);
        return browser;
    }
    // Existing local launch logic...
}
```
**Reference:** See `StagehandPlugin._launch()` at `packages/stagehand-crawler/src/internals/stagehand-plugin.ts:102-107` for the CDP connection pattern:
```typescript
const cdpUrl = await stagehand.connectURL();
const browser = await chromium.connectOverCDP(cdpUrl);
```
### 3. Set `isRemote` on LaunchContext
**File:** `packages/browser-pool/src/playwright/playwright-plugin.ts`
In `createLaunchContext()` (or wherever the plugin creates the LaunchContext), pass `isRemote: true` when connect options are present. This can be done by overriding `createLaunchContext()` in the subclass, or by passing it through the options.
Check how the base `BrowserPlugin.createLaunchContext()` works (at `packages/browser-pool/src/abstract-classes/browser-plugin.ts:149-174`) and determine the best insertion point.
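One possible shape for the subclass override is sketched below. It assumes the base method accepts an `isRemote` option and forwards it; `BasePluginSketch`, `RemoteAwarePluginSketch`, and the option types are hypothetical and far smaller than the real classes.

```typescript
interface LaunchContextLike { isRemote: boolean }
interface CreateLaunchContextOptionsSketch { isRemote?: boolean }

class BasePluginSketch {
    createLaunchContext(options: CreateLaunchContextOptionsSketch = {}): LaunchContextLike {
        // The base class must forward isRemote, otherwise the flag is silently lost.
        return { isRemote: options.isRemote ?? false };
    }
}

class RemoteAwarePluginSketch extends BasePluginSketch {
    private readonly hasConnectOptions: boolean;

    constructor(connectOverCDPOptions?: { endpointURL: string }) {
        super();
        this.hasConnectOptions = connectOverCDPOptions !== undefined;
    }

    override createLaunchContext(options: CreateLaunchContextOptionsSketch = {}): LaunchContextLike {
        // Derive the flag from the stored connect options, not from caller input.
        return super.createLaunchContext({ ...options, isRemote: this.hasConnectOptions });
    }
}
```

Deriving the flag inside the plugin means callers cannot accidentally mark a local launch as remote or the reverse.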
## Key Design Decisions
- **No new abstract method:** The routing happens inside `_launch()` via internal branching, not a new `_connect()` method. This keeps the abstract interface unchanged and doesn't affect custom plugins like StagehandPlugin.
- **`browser.close()` for cleanup:** Remote browsers are closed the same way as local browsers — via `browser.close()`. No special disconnect handling.
- **No proxy server setup for remote:** The remote branch skips the local proxy server setup that exists in the current `_launch()` code.
## Key Files
| File | Change |
|------|--------|
| `packages/browser-pool/src/playwright/playwright-plugin.ts` | Constructor stores options, `_launch()` branches for remote |
## Acceptance Criteria
- [x] `PlaywrightPlugin` accepts `connectOptions` in constructor and calls `browserType.connect()` with `wsEndpoint` and `options`
- [x] `PlaywrightPlugin` accepts `connectOverCDPOptions` in constructor and calls `browserType.connectOverCDP()` with `endpointURL` and `options`
- [x] Setting both `connectOptions` and `connectOverCDPOptions` throws an error
- [x] `launchContext.isRemote` is `true` when connect options are present
- [x] Remote branch skips local proxy server setup and persistent context logic
- [x] TypeScript compiles with no errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nnect()`
# Task 3: PuppeteerPlugin Remote Connection Routing
## Goal
Make `PuppeteerPlugin._launch()` branch to `puppeteer.connect()` when remote connection options (CDP) are present, instead of calling `puppeteer.launch()`.
## Dependencies
- Task 1 (types and `isRemote` flag)
## Scope
### 1. Store connect options on the plugin instance
**File:** `packages/browser-pool/src/puppeteer/puppeteer-plugin.ts`
- Accept `connectOverCDPOptions` in the constructor options
- Store as an instance property
- Puppeteer only supports CDP — there is no `connectOptions` field (Playwright-only)
### 2. Branch in `_launch()` for remote connections
**File:** `packages/browser-pool/src/puppeteer/puppeteer-plugin.ts`
In the existing `_launch()` method (currently lines 22-203), add branching logic **before** the existing local launch code:
```typescript
protected async _launch(launchContext: LaunchContext<...>): Promise<Browser> {
    // Remote CDP connection
    if (this.connectOverCDPOptions) {
        const browser = await puppeteer.connect(this.connectOverCDPOptions);
        // Wrap with the same Proxy handler for newPage() interception
        // (see existing code at lines 138-200)
        return wrappedBrowser;
    }
    // Existing local launch logic...
}
```
**Important:** Puppeteer's `connect()` takes a flat options object: `puppeteer.connect({ browserWSEndpoint, ...rest })`. This is different from Playwright's two-argument pattern. The type should match Puppeteer's `ConnectOptions`.
### 3. Handle the `newPage()` Proxy wrapper for remote
The existing `_launch()` wraps the browser in a `Proxy` that intercepts `newPage()` calls to support `useIncognitoPages` (lines 138-200). This proxy wrapper should also be applied to remote browsers so that incognito context creation works correctly.
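The interception idea can be sketched with a small synchronous model. The real Puppeteer methods are asynchronous and the actual handler does more work; `FakeBrowser`, `FakeContext`, `FakePage`, and `wrapForIncognito` are hypothetical names used only to show how a `Proxy` can redirect `newPage()` into a fresh context.

```typescript
// Hypothetical, synchronous stand-ins for Puppeteer's Browser / BrowserContext / Page.
interface FakePage { contextId: string }
interface FakeContext { newPage(): FakePage }
interface FakeBrowser {
    newPage(): FakePage;
    createBrowserContext(): FakeContext;
}

function wrapForIncognito(browser: FakeBrowser, useIncognitoPages: boolean): FakeBrowser {
    return new Proxy(browser, {
        get(target, property, receiver) {
            if (property === 'newPage' && useIncognitoPages) {
                // Each page gets its own context instead of sharing the default one.
                return () => target.createBrowserContext().newPage();
            }
            // Everything else passes through untouched.
            return Reflect.get(target, property, receiver);
        },
    });
}
```

Because the wrapper depends only on the browser object, the same function can be applied whether the browser came from a local launch or a remote connection.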
### 4. Set `isRemote` on LaunchContext
Same pattern as Task 2 — pass `isRemote: true` when `connectOverCDPOptions` is present.
## Key Design Decisions
- **Flat options object:** Puppeteer's `connect()` API takes a single options object (not `endpointURL, options` like Playwright). The `connectOverCDPOptions` type matches this flat shape directly.
- **`browser.close()` for cleanup:** Same as Playwright — remote browsers closed via `browser.close()`, not `browser.disconnect()`.
- **`newPage()` proxy still needed:** The Proxy wrapper that intercepts `newPage()` to create incognito contexts must still wrap remote browsers.
## Key Files
| File | Change |
|------|--------|
| `packages/browser-pool/src/puppeteer/puppeteer-plugin.ts` | Constructor stores options, `_launch()` branches for remote |
## Acceptance Criteria
- [x] `PuppeteerPlugin` accepts `connectOverCDPOptions` in constructor and calls `puppeteer.connect()` with the options object
- [x] The `newPage()` Proxy wrapper is applied to remote browsers (for incognito support)
- [x] `launchContext.isRemote` is `true` when connect options are present
- [x] Remote branch skips user data directory setup, headless handling, and other local-only logic
- [x] TypeScript compiles with no errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nection logging
## Goal
Make `BrowserPlugin.launch()` skip proxy injection and webdriver hiding when `launchContext.isRemote` is `true`, since these operations modify `launchOptions` which are not used for remote connections.
## Dependencies
- Task 1 (`isRemote` flag on LaunchContext)
## Scope
### 1. Skip `_addProxyToLaunchOptions()` for remote
**File:** `packages/browser-pool/src/abstract-classes/browser-plugin.ts`
In the `launch()` method, the call to `_addProxyToLaunchOptions()` is now gated on `!isRemote`:
```typescript
if (launchContext.proxyUrl && !launchContext.isRemote) {
    await this._addProxyToLaunchOptions(launchContext);
}
```
### 2. Skip `_mergeArgsToHideWebdriver()` for remote
```typescript
if (!launchContext.isRemote && this._isChromiumBasedBrowser(launchContext)) {
    this._mergeArgsToHideWebdriver(launchContext);
}
```
### 3. No changes to `_addProxyToLaunchOptions()` or `_mergeArgsToHideWebdriver()` themselves
The methods remain unchanged — the skip logic lives in the calling `launch()` method.
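To make the combined effect of both gates easier to reason about, here is a small sketch that reports which steps would run for a given context. `GateContextSketch` and `plannedLaunchSteps` are hypothetical helpers for illustration and do not exist in the codebase.

```typescript
interface GateContextSketch {
    isRemote: boolean;
    proxyUrl?: string;
    isChromium: boolean;
}

// Returns the names of the launch-option mutations that the call site would perform.
function plannedLaunchSteps(ctx: GateContextSketch): string[] {
    const steps: string[] = [];
    if (ctx.proxyUrl && !ctx.isRemote) {
        steps.push('_addProxyToLaunchOptions');
    }
    if (!ctx.isRemote && ctx.isChromium) {
        steps.push('_mergeArgsToHideWebdriver');
    }
    return steps;
}
```

For a remote context the list is always empty, which matches the goal: nothing mutates `launchOptions` that will never be used.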
## Key Design Decisions
- **Skip at call site, not in the methods**
- **`proxyUrl` + remote triggers a warning:** Handled in Task 6 (Warnings)
- **Fingerprinting hooks are unchanged**
## Additional
- Fixed `isRemote` not being passed through base class `createLaunchContext()`
- Added info-level logs for remote connections in base class and both plugins
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ht overloads
Playwright: change `PlaywrightConnectOverCDPOptions` and `PlaywrightConnectOptions` from type aliases (all-optional fields) to interfaces with required `wsEndpoint`. Use the non-deprecated two-argument overloads in `_launch()`.
Puppeteer: add a runtime guard that throws if neither `browserWSEndpoint` nor `browserURL` is provided in `connectOverCDPOptions`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
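A sketch of what such a Puppeteer guard might look like. `PuppeteerConnectLike` is a hypothetical reduction of Puppeteer's connect options to the two endpoint fields, and the error wording is an assumption rather than the message used in the PR.

```typescript
// Hypothetical subset of Puppeteer's connect options: only the endpoint fields.
interface PuppeteerConnectLike {
    browserWSEndpoint?: string;
    browserURL?: string;
}

function assertHasEndpoint(options: PuppeteerConnectLike): void {
    // Both fields are optional in the type, so the check has to happen at runtime.
    if (!options.browserWSEndpoint && !options.browserURL) {
        throw new Error("connectOverCDPOptions requires 'browserWSEndpoint' or 'browserURL'");
    }
}
```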
…tions
# Task 5: `useIncognitoPages` Defaults to `true` for Remote
## Goal
When remote connection options are present and `useIncognitoPages` was not explicitly set by the user, default it to `true` and log an info message. If the user explicitly sets `false`, log a warning.
## Dependencies
- Task 2 (PlaywrightPlugin stores connect options)
- Task 3 (PuppeteerPlugin stores connect options)
## Scope
### 1. Preserve `undefined` vs `false` in base constructor
The base `BrowserPlugin` constructor currently collapses `useIncognitoPages` to `false`. The subclass checks `options.useIncognitoPages` directly (preserves `undefined`) and overrides after `super()`.
### 2. Override default in PlaywrightPlugin constructor
After the `super()` call, if connect options are present:
- `undefined` → set to `true`, info log
- `false` → warning log
- `true` → no extra log
### 3. Override default in PuppeteerPlugin constructor
Same logic, checking `connectOverCDPOptions`.
## Key Design Decisions
- **Info vs warning:** Defaulting to `true` is an info message (expected behavior). Explicit `false` is a warning (user should understand implications).
- **`useIncognitoPages: false` + `connect()` is not special-cased:** The warning covers this case: no additional error or fallback.
- **Uses existing `this.log`:** All logging uses the inherited `BrowserPlugin.log` logger.
## Acceptance Criteria
- [x] When `connectOptions` or `connectOverCDPOptions` is set and `useIncognitoPages` is not provided → defaults to `true`, info message logged
- [x] When `connectOptions` or `connectOverCDPOptions` is set and `useIncognitoPages: false` → stays `false`, warning logged
- [x] When `connectOptions` or `connectOverCDPOptions` is set and `useIncognitoPages: true` → stays `true`, no extra log
- [x] When no connect options are set → existing behavior unchanged
- [x] Base constructor preserves `undefined` vs `false` distinction
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
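The decision table in the acceptance criteria can be expressed as a small pure function. This is a sketch only: `resolveUseIncognitoPages` and `IncognitoResolution` are hypothetical names, and the real constructors log through `this.log` instead of returning a log level.

```typescript
type IncognitoLogLevel = 'info' | 'warning';

interface IncognitoResolution {
    useIncognitoPages: boolean;
    log?: IncognitoLogLevel;
}

function resolveUseIncognitoPages(
    explicit: boolean | undefined, // the raw option, before any default is applied
    hasConnectOptions: boolean,
): IncognitoResolution {
    if (!hasConnectOptions) {
        // Local launch: existing behavior, undefined collapses to false.
        return { useIncognitoPages: explicit ?? false };
    }
    if (explicit === undefined) {
        return { useIncognitoPages: true, log: 'info' };
    }
    if (explicit === false) {
        return { useIncognitoPages: false, log: 'warning' };
    }
    return { useIncognitoPages: true };
}
```

The function has to receive the raw option value; once `undefined` has been collapsed to `false`, the first and second remote cases can no longer be told apart.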
- Rename `PlaywrightConnectOverCDPOptions.wsEndpoint` → `endpointURL` to match Playwright's own terminology and avoid a field conflict with the inherited `ConnectOverCDPOptions.endpointURL`
- Wrap `connectOverCDP()` and `connect()` failures with `BrowserLaunchError`, including the sanitized endpoint URL (credentials stripped) and actionable guidance
- Move endpoint validation to constructors (fail fast): Playwright validates that `endpointURL` and `wsEndpoint` are non-empty, Puppeteer validates `browserWSEndpoint || browserURL`
- Add `_sanitizeEndpointForLog()` to both plugins to strip credentials from URLs before including them in error messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
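One way credential stripping could work is sketched below using the standard `URL` class. The function name mirrors the one mentioned above, but the body is an assumption: it removes only the user-info part of the URL and leaves query-string tokens alone, and the placeholder returned for unparseable input is invented for the example.

```typescript
function sanitizeEndpointForLog(endpoint: string): string {
    try {
        const url = new URL(endpoint);
        // Drop "user:password@" so secrets never reach logs or error messages.
        url.username = '';
        url.password = '';
        return url.toString();
    } catch {
        // Never echo an unparseable string back; it may still contain secrets.
        return '<invalid endpoint URL>';
    }
}
```

A provider that passes its API key as a query parameter would need extra handling beyond this sketch.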
…ions
- Close BrowserContext on page close when useIncognitoPages is true.
Previously contexts were only cleaned up when an anonymized proxy was
active, causing context accumulation on remote browsers without proxy.
- Clean up targetcreated listener on remote browser disconnect via
browser.once('disconnected') handler to prevent listener leaks.
- Guard anonymizeProxySugar call with proxyUrl check — skip the async
call entirely when no proxy is configured (common for remote browsers).
- Conditionally omit proxyServer from context options when no proxy is
set, instead of passing { proxyServer: undefined }.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
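The listener cleanup described in this commit can be illustrated with a tiny event emitter. `TinyEmitter` is a hypothetical stand-in for the browser object and `trackTargets` is an invented helper; only the `on` / `once` / `off` pattern is the point.

```typescript
type Listener = () => void;

// Minimal stand-in for an event-emitting browser object.
class TinyEmitter {
    private readonly listeners = new Map<string, Set<Listener>>();

    on(event: string, listener: Listener): void {
        const set = this.listeners.get(event) ?? new Set<Listener>();
        set.add(listener);
        this.listeners.set(event, set);
    }

    off(event: string, listener: Listener): void {
        this.listeners.get(event)?.delete(listener);
    }

    once(event: string, listener: Listener): void {
        const wrapper: Listener = () => {
            this.off(event, wrapper);
            listener();
        };
        this.on(event, wrapper);
    }

    emit(event: string): void {
        // Copy first so listeners may unsubscribe while being called.
        for (const listener of Array.from(this.listeners.get(event) ?? [])) listener();
    }

    listenerCount(event: string): number {
        return this.listeners.get(event)?.size ?? 0;
    }
}

function trackTargets(browser: TinyEmitter, onTarget: Listener): void {
    browser.on('targetcreated', onTarget);
    // Remove the handler when the remote browser goes away, so it cannot leak.
    browser.once('disconnected', () => browser.off('targetcreated', onTarget));
}
```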
…ket connections
- Add comments in both plugin constructors explaining why `options.useIncognitoPages` is checked instead of `this.useIncognitoPages` (`super()` collapses `undefined` to `false`, losing the "not set" signal).
- Strengthen the warning for Playwright `connectOptions` (WebSocket) + `useIncognitoPages: false`, because `connect()` returns a browser with no default context, which is more severe than just sharing cookies.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove spurious launchOptions warning that always fired due to framework-injected defaults, and share log instances in launchers.
PRD Task 6: Warnings for Ignored & Conflicting Options
- `proxyUrl` + remote → warning in base `BrowserPlugin.launch()`
- `useChrome` + remote → warning in launcher constructors
- `executablePath` + remote → warning in launcher constructors
- `useIncognitoPages: false` + remote → handled by Task 5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
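The warning rules listed for Task 6 could be collected in one place along these lines. The sketch is hypothetical: in the PR the checks are spread across `BrowserPlugin.launch()` and the launcher constructors, and the message wording here is invented.

```typescript
interface RemoteLaunchConfigSketch {
    isRemote: boolean;
    proxyUrl?: string;
    useChrome?: boolean;
    executablePath?: string;
}

// Returns one message per option that has no effect on a remote connection.
function collectIgnoredOptionWarnings(config: RemoteLaunchConfigSketch): string[] {
    if (!config.isRemote) return [];
    const warnings: string[] = [];
    if (config.proxyUrl) warnings.push("'proxyUrl' is not applied to remote connections.");
    if (config.useChrome) warnings.push("'useChrome' is ignored for remote connections.");
    if (config.executablePath) warnings.push("'executablePath' is ignored for remote connections.");
    return warnings;
}
```

Checking only options the user actually set avoids the problem noted in this commit, where a warning fired on every run because of framework-injected defaults.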
PRD Task 7: Unit Tests
- Connection routing (Playwright CDP/WS/local, Puppeteer CDP/local)
- Validation (mutual exclusion, missing endpoints)
- `isRemote` correctness for all plugin variants
- Proxy/webdriver skipping for remote, applied for local
- `useIncognitoPages` defaults (`true` for remote, `false` for local)
- Warnings (`proxyUrl`, `useIncognitoPages: false`, CDP vs WS variants)
- 40 tests, all mocked (no real browser instances)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…texts
When `useIncognitoPages` is `true` (default for remote) and `proxyUrl` is set, the `newPage` handler was passing `proxyServer` to `createBrowserContext` even for remote connections. For credentialed proxies this also spun up a localhost tunnel unreachable by the remote browser.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
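A sketch of the corrected decision, under the assumption that the context options carry a single optional `proxyServer` field. `buildIncognitoContextOptions` and `IncognitoContextOptionsSketch` are hypothetical names.

```typescript
interface IncognitoContextOptionsSketch {
    proxyServer?: string;
}

function buildIncognitoContextOptions(input: {
    isRemote: boolean;
    proxyUrl?: string;
}): IncognitoContextOptionsSketch {
    // A remote browser cannot reach a proxy tunnel on this machine's localhost,
    // and without a proxy the key is omitted rather than set to undefined.
    if (input.isRemote || !input.proxyUrl) return {};
    return { proxyServer: input.proxyUrl };
}
```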