BrowserPool | API | Crawlee for Python · Fast, reliable Python web crawlers.

Index

Methods

  • async __aexit__(exc_type, exc_value, exc_traceback): None

  • Parameters

    • exc_type: type[BaseException] | None
    • exc_value: BaseException | None
    • exc_traceback: TracebackType | None

    Returns None
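The three `__aexit__` parameters are the standard Python asynchronous context manager arguments, so the pool is presumably meant to be used with `async with`. The sketch below uses a hypothetical `PoolStandIn` class, not the real `BrowserPool`, only to show which values arrive in `exc_type`, `exc_value`, and `exc_traceback` on a clean exit and on an exception.

```python
from __future__ import annotations

import asyncio
from types import TracebackType


class PoolStandIn:
    """Hypothetical stand-in for illustration; it is not the real BrowserPool."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def __aenter__(self) -> PoolStandIn:
        self.events.append("enter")
        return self

    async def __aexit__(
        self,
        exc_type: type[BaseException] | None,
        exc_value: BaseException | None,
        exc_traceback: TracebackType | None,
    ) -> None:
        # On a clean exit all three arguments are None.
        if exc_type is None:
            self.events.append("exit-clean")
        else:
            self.events.append(f"exit-{exc_type.__name__}")
        # Returning None (falsy) lets any exception propagate to the caller.


async def demo() -> list[str]:
    pool = PoolStandIn()
    async with pool:
        pool.events.append("work")
    try:
        async with pool:
            raise ValueError("boom")
    except ValueError:
        pass  # the exception still reached us because __aexit__ returned None
    return pool.events


events = asyncio.run(demo())
print(events)
```

Because the documented return type is `None`, an exception raised inside the `async with` body would not be suppressed by this method.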

  • __init__(plugins, *, operation_timeout, browser_inactive_threshold, identify_inactive_browsers_interval, close_inactive_browsers_interval, retire_browser_after_page_count): None

  • Parameters

    • plugins (optional): Sequence[BrowserPlugin] | None = None
    • operation_timeout (optional, keyword-only): timedelta = timedelta(seconds=15)
    • browser_inactive_threshold (optional, keyword-only): timedelta = timedelta(seconds=10)
    • identify_inactive_browsers_interval (optional, keyword-only): timedelta = timedelta(seconds=20)
    • close_inactive_browsers_interval (optional, keyword-only): timedelta = timedelta(seconds=30)
    • retire_browser_after_page_count (optional, keyword-only): int = 100

    Returns None
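As a rough illustration of the constructor signature above, the keyword-only options can be collected in a dictionary of `timedelta` and `int` values and unpacked into the call. The import path `crawlee.browsers` and the specific values are assumptions made for this sketch; the import is deferred into the helper so the options themselves can be inspected without Crawlee installed.

```python
from __future__ import annotations

from datetime import timedelta
from typing import Any

# Keyword-only options overriding the documented defaults (15 s, 10 s, 20 s, 30 s, 100).
# The chosen values are illustrative only.
pool_options: dict[str, Any] = {
    "operation_timeout": timedelta(seconds=30),
    "browser_inactive_threshold": timedelta(seconds=20),
    "identify_inactive_browsers_interval": timedelta(seconds=40),
    "close_inactive_browsers_interval": timedelta(minutes=1),
    "retire_browser_after_page_count": 50,
}


def build_pool(options: dict[str, Any]) -> Any:
    """Create a pool from the options above.

    Assumption: BrowserPool is importable from crawlee.browsers.
    """
    from crawlee.browsers import BrowserPool

    # plugins is left at its default of None.
    return BrowserPool(**options)
```

Passing any of these positionally would fail, since everything after `plugins` is keyword-only.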

  • async new_page(*, page_id, browser_plugin, proxy_info): CrawleePage

  • Parameters

    • page_id (optional, keyword-only): str | None = None
    • browser_plugin (optional, keyword-only): BrowserPlugin | None = None
    • proxy_info (optional, keyword-only): ProxyInfo | None = None

    Returns CrawleePage
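A minimal sketch of how `new_page` might be called, under several assumptions not stated in this reference: that `BrowserPool` is importable from `crawlee.browsers`, that it supports `async with` (suggested by `__aexit__`), and that the returned `CrawleePage` exposes the underlying Playwright page as a `page` attribute. The `page_id` value is a made-up example.

```python
from __future__ import annotations


async def open_one_page(url: str) -> str:
    """Open a single page through the pool and return its title."""
    # Assumption: import path; deferred so this module loads without Crawlee.
    from crawlee.browsers import BrowserPool

    async with BrowserPool() as pool:
        # All arguments are keyword-only; browser_plugin and proxy_info keep their None defaults.
        crawlee_page = await pool.new_page(page_id="example-page")
        # Assumption: CrawleePage wraps a Playwright Page in its `page` attribute.
        await crawlee_page.page.goto(url)
        title = await crawlee_page.page.title()
        await crawlee_page.page.close()
        return title


# Usage (needs Crawlee with Playwright and an installed browser):
# import asyncio
# print(asyncio.run(open_one_page("https://crawlee.dev")))
```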

  • async new_page_with_each_plugin(): Sequence[CrawleePage]

  • post_page_close_hook(hook): Callable[[str, BrowserController], Awaitable[None]]

  • Parameters

    • hook: Callable[[str, BrowserController], Awaitable[None]]

    Returns Callable[[str, BrowserController], Awaitable[None]]

  • post_page_create_hook(hook): Callable[[CrawleePage, BrowserController], Awaitable[None]]

  • Parameters

    • hook: Callable[[CrawleePage, BrowserController], Awaitable[None]]

    Returns Callable[[CrawleePage, BrowserController], Awaitable[None]]

  • pre_page_close_hook(hook): Callable[[CrawleePage, BrowserController], Awaitable[None]]

  • Parameters

    • hook: Callable[[CrawleePage, BrowserController], Awaitable[None]]

    Returns Callable[[CrawleePage, BrowserController], Awaitable[None]]

  • pre_page_create_hook(hook): Callable[[str, BrowserController, dict[str, Any], ProxyInfo | None], Awaitable[None]]

  • Parameters

    • hook: Callable[[str, BrowserController, dict[str, Any], ProxyInfo | None], Awaitable[None]]

    Returns Callable[[str, BrowserController, dict[str, Any], ProxyInfo | None], Awaitable[None]]
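Each of the four hook methods takes a callable and returns a callable of the same type, which is the shape of a registration decorator. The sketch below reproduces that pattern with a hypothetical `HookRegistryStandIn` class, not the real `BrowserPool`, for the `pre_page_create_hook` signature. The `run_pre_page_create` helper and the `locale` option are invented for the demonstration, and whether the real pool applies changes a hook makes to the options dictionary is not stated in this reference.

```python
from __future__ import annotations

import asyncio
from typing import Any, Awaitable, Callable

# Mirrors the documented hook type, with Any standing in for
# BrowserController and ProxyInfo | None.
PreCreateHook = Callable[[str, Any, dict, Any], Awaitable[None]]


class HookRegistryStandIn:
    """Hypothetical stand-in showing the register-and-return pattern."""

    def __init__(self) -> None:
        self._pre_page_create_hooks: list[PreCreateHook] = []

    def pre_page_create_hook(self, hook: PreCreateHook) -> PreCreateHook:
        self._pre_page_create_hooks.append(hook)
        return hook  # returning the hook makes the method usable as a decorator

    async def run_pre_page_create(
        self, page_id: str, controller: Any, context_options: dict, proxy_info: Any
    ) -> None:
        # Invented helper: calls every registered hook in registration order.
        for hook in self._pre_page_create_hooks:
            await hook(page_id, controller, context_options, proxy_info)


registry = HookRegistryStandIn()


@registry.pre_page_create_hook
async def set_locale(
    page_id: str, controller: Any, context_options: dict, proxy_info: Any
) -> None:
    context_options.setdefault("locale", "en-US")


options: dict[str, Any] = {}
asyncio.run(registry.run_pre_page_create("page-1", None, options, None))
print(options)
```

Because the method hands back the same callable, `set_locale` remains an ordinary coroutine function after decoration and can still be called or tested directly.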

  • with_default_plugin(*, browser_type, user_data_dir, browser_launch_options, browser_new_context_options, headless, fingerprint_generator, use_incognito_pages, kwargs): BrowserPool

  • Parameters

    • browser_type (optional, keyword-only): BrowserType | None = None
    • user_data_dir (optional, keyword-only): (str | Path) | None = None
    • browser_launch_options (optional, keyword-only): Mapping[str, Any] | None = None
    • browser_new_context_options (optional, keyword-only): Mapping[str, Any] | None = None
    • headless (optional, keyword-only): bool | None = None
    • fingerprint_generator (optional, keyword-only): FingerprintGenerator | None = None
    • use_incognito_pages (optional, keyword-only): bool | None = False
    • kwargs: Any

    Returns BrowserPool
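Since `with_default_plugin` returns a `BrowserPool`, it reads as a factory that configures one default plugin from keyword arguments. The sketch below assumes it can be called on the class itself, that `BrowserPool` lives in `crawlee.browsers`, that `"firefox"` is an accepted `BrowserType` value, and that the two option mappings are forwarded to Playwright. None of this is confirmed by the listing above.

```python
from __future__ import annotations

from typing import Any

# Illustrative values only; every key is a documented keyword-only parameter.
default_plugin_options: dict[str, Any] = {
    "browser_type": "firefox",  # assumption: a Playwright engine name is accepted
    "headless": True,
    "browser_launch_options": {"slow_mo": 50},  # assumption: passed on to the browser launch
    "browser_new_context_options": {"locale": "en-US"},  # assumption: passed on to new contexts
    "use_incognito_pages": True,
}


def build_default_pool(options: dict[str, Any]) -> Any:
    """Create a pool backed by a single default plugin.

    Assumption: with_default_plugin is callable on the class.
    """
    from crawlee.browsers import BrowserPool

    return BrowserPool.with_default_plugin(**options)
```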

Properties