Add post-navigation hooks and browser lifecycle hooks
Context
Crawlee JS provides navigation hooks and browser lifecycle hooks that Python is missing. See parity report for broader context.
Gaps
1. Post-navigation hooks (main gap)
JS BrowserCrawler and HttpCrawler both support postNavigationHooks — they run after page.goto() / HTTP request completes but before the request handler. Useful for CAPTCHA detection, response validation, etc.
Python has only pre_navigation_hook. No post-navigation equivalent exists.
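To make the requested ordering concrete, here is a minimal standalone sketch of where a post-navigation hook would run relative to the existing pre-navigation hook and the request handler. It does not use Crawlee itself: `SketchCrawler`, `SketchContext`, `run_one`, and the `post_navigation_hook` decorator name are all hypothetical, chosen only to mirror the existing `pre_navigation_hook` naming, and navigation is simulated rather than performed.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Optional


# Hypothetical stand-ins for illustration only; not Crawlee's actual classes.
@dataclass
class SketchContext:
    url: str
    status_code: Optional[int] = None
    log: List[str] = field(default_factory=list)


Hook = Callable[[SketchContext], Awaitable[None]]


class SketchCrawler:
    def __init__(self) -> None:
        self._pre_navigation_hooks: List[Hook] = []
        self._post_navigation_hooks: List[Hook] = []

    def pre_navigation_hook(self, hook: Hook) -> Hook:
        """Register a hook that runs before navigation."""
        self._pre_navigation_hooks.append(hook)
        return hook

    def post_navigation_hook(self, hook: Hook) -> Hook:
        """Register a hook that runs after navigation (assumed name)."""
        self._post_navigation_hooks.append(hook)
        return hook

    async def _navigate(self, context: SketchContext) -> None:
        # Simulated stand-in for page.goto() or the HTTP request.
        context.status_code = 200
        context.log.append("navigate")

    async def run_one(self, url: str, handler: Hook) -> SketchContext:
        context = SketchContext(url=url)
        for hook in self._pre_navigation_hooks:
            await hook(context)
        await self._navigate(context)
        # Post-navigation hooks see the response but run before the handler.
        for hook in self._post_navigation_hooks:
            await hook(context)
        await handler(context)
        return context


crawler = SketchCrawler()


@crawler.pre_navigation_hook
async def before_navigation(context: SketchContext) -> None:
    context.log.append("pre")


@crawler.post_navigation_hook
async def validate_response(context: SketchContext) -> None:
    # Example use case from this issue: response validation.
    if context.status_code != 200:
        raise RuntimeError(f"Unexpected status for {context.url}")
    context.log.append("post")


async def request_handler(context: SketchContext) -> None:
    context.log.append("handler")


result = asyncio.run(crawler.run_one("https://example.com", request_handler))
print(result.log)
```

A hook that raises, as `validate_response` would on a non-200 status, stops the request before the handler runs in this sketch. How such a failure should interact with retries is left open here.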
2. Browser lifecycle hooks (BrowserPool)
JS BrowserPool exposes 6 hook types, listed here for consideration:
- preLaunchHooks / postLaunchHooks: before/after browser launch
- prePageCreateHooks / postPageCreateHooks: before/after new page creation
- prePageCloseHooks / postPageCloseHooks: before/after page close
Python's BrowserPool has no lifecycle hooks.
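As a rough illustration of what the six lifecycle hook points could look like in Python, the sketch below fires them around simulated launch, page-create, and page-close steps. Everything in it is an assumption: `SketchBrowserPool`, `add_hook`, the snake_case hook names, and the `info` dict passed to hooks are hypothetical and are not taken from the Python or JS BrowserPool. No real browser is launched.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

LifecycleHook = Callable[[Dict[str, Any]], Awaitable[None]]


class SketchBrowserPool:
    """Hypothetical pool that only demonstrates hook ordering."""

    HOOK_NAMES = (
        "pre_launch",
        "post_launch",
        "pre_page_create",
        "post_page_create",
        "pre_page_close",
        "post_page_close",
    )

    def __init__(self) -> None:
        self._hooks: Dict[str, List[LifecycleHook]] = {
            name: [] for name in self.HOOK_NAMES
        }
        self._browser_launched = False

    def add_hook(self, name: str, hook: LifecycleHook) -> None:
        if name not in self._hooks:
            raise ValueError(f"Unknown hook type: {name}")
        self._hooks[name].append(hook)

    async def _run_hooks(self, name: str, info: Dict[str, Any]) -> None:
        for hook in self._hooks[name]:
            await hook(info)

    async def new_page(self, page_id: str) -> Dict[str, Any]:
        info = {"page_id": page_id}
        if not self._browser_launched:
            # Launch hooks fire only when a browser has to be started.
            await self._run_hooks("pre_launch", info)
            self._browser_launched = True  # simulated browser launch
            await self._run_hooks("post_launch", info)
        await self._run_hooks("pre_page_create", info)
        page = {"id": page_id}  # simulated page object
        await self._run_hooks("post_page_create", info)
        return page

    async def close_page(self, page_id: str) -> None:
        info = {"page_id": page_id}
        await self._run_hooks("pre_page_close", info)
        # The simulated page would be closed here.
        await self._run_hooks("post_page_close", info)


events: List[str] = []


def make_recorder(name: str) -> LifecycleHook:
    async def hook(info: Dict[str, Any]) -> None:
        events.append(f"{name}:{info['page_id']}")

    return hook


pool = SketchBrowserPool()
for hook_name in SketchBrowserPool.HOOK_NAMES:
    pool.add_hook(hook_name, make_recorder(hook_name))


async def demo() -> None:
    await pool.new_page("p1")
    await pool.new_page("p2")  # browser already running, no launch hooks
    await pool.close_page("p1")


asyncio.run(demo())
print(events)
```

In this sketch the launch hooks fire once, for the first page only, while the page hooks fire for every page. Whether the real hooks should receive a launch context, a page object, or something else is a design question this issue leaves open.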
Reference
- JS BrowserCrawler hooks: packages/browser-crawler/src/internals/browser-crawler.ts
- JS BrowserPool hooks: packages/browser-pool/src/browser-pool.ts
- Python pre-nav hooks: src/crawlee/crawlers/_playwright/_playwright_crawler.py
- Python BrowserPool: src/crawlee/browsers/_browser_pool.py