Enhance `PlaywrightCrawler` testing with mocked Playwright API
Description
- Enhance the testing of PlaywrightCrawler by adding a mocked Playwright API.
- It will provide a more isolated and stable testing environment, similar to how we use HTTPX and RESPX for `BeautifulSoupCrawler` in `test_beautifulsoup_crawler.py`.
- File: `test_playwright_crawler.py`
- Relevant documentation: Mock APIs.
Possible solution
Create fixtures for setting up Playwright and a mocked server that intercepts and provides predefined responses for network requests. The BrowserContext can then be used in the PlaywrightCrawler.
```python
from __future__ import annotations

from typing import AsyncGenerator

import pytest
from playwright.async_api import BrowserContext, Playwright, Request, Route, async_playwright

# Note: async fixtures assume an asyncio pytest plugin (e.g. pytest-asyncio in auto mode) is configured.


@pytest.fixture()
async def playwright() -> AsyncGenerator[Playwright, None]:
    async with async_playwright() as playwright:
        yield playwright


@pytest.fixture()
async def mock_server(playwright: Playwright) -> AsyncGenerator[BrowserContext, None]:
    browser = await playwright.chromium.launch()
    context = await browser.new_context()

    # Intercept requests and provide mock responses
    async def handle_route(route: Route, request: Request) -> None:
        if request.url.endswith('/'):
            body = """<html>
                <head>
                    <title>Hello</title>
                </head>
                <body>
                    <a href="/asdf">Link 1</a>
                    <a href="/hjkl">Link 2</a>
                </body>
            </html>"""
        elif request.url.endswith('/asdf'):
            body = """<html>
                <head>
                    <title>Hello</title>
                </head>
                <body>
                    <a href="/uiop">Link 3</a>
                    <a href="/qwer">Link 4</a>
                </body>
            </html>"""
        else:
            body = """<html>
                <head>
                    <title>Hello</title>
                </head>
                <body>
                    Insightful content
                </body>
            </html>"""

        # `Route.fulfill` takes the response parts as keyword arguments;
        # Playwright has no public `Response` constructor to pass in.
        await route.fulfill(status=200, content_type='text/html', body=body)

    await context.route('**/*', handle_route)

    try:
        yield context
    finally:
        # Close the browser even if the test using the fixture fails.
        await browser.close()
```
The BrowserContext provided by the mock_server fixture should be used in PlaywrightCrawler, possibly via BrowserPool or BrowserPlugin.
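A possible refinement, sketched here under assumptions: the URL-to-body mapping from `handle_route` could live in a small pure function, so the mocked pages can be unit tested without launching a browser. The names `MOCK_PAGES`, `DEFAULT_PAGE` and `mock_body_for` are hypothetical and not part of Crawlee or Playwright. Note that this sketch matches on the URL path rather than using `endswith`, which is a small behavior change from the fixture above (query strings are ignored, for example).

```python
from __future__ import annotations

from typing import TYPE_CHECKING
from urllib.parse import urlparse

if TYPE_CHECKING:
    # Only needed for type hints; the mapping logic itself has no Playwright dependency.
    from playwright.async_api import Route

# Hypothetical table of mocked pages, keyed by URL path.
MOCK_PAGES: dict[str, str] = {
    '/': (
        '<html><head><title>Hello</title></head>'
        '<body><a href="/asdf">Link 1</a><a href="/hjkl">Link 2</a></body></html>'
    ),
    '/asdf': (
        '<html><head><title>Hello</title></head>'
        '<body><a href="/uiop">Link 3</a><a href="/qwer">Link 4</a></body></html>'
    ),
}

# Fallback page served for every path that is not listed in MOCK_PAGES.
DEFAULT_PAGE = '<html><head><title>Hello</title></head><body>Insightful content</body></html>'


def mock_body_for(url: str) -> str:
    """Return the mocked HTML body for a URL, matching on the path only."""
    path = urlparse(url).path or '/'
    return MOCK_PAGES.get(path, DEFAULT_PAGE)


async def handle_route(route: Route) -> None:
    """Fulfill every intercepted request with a mocked HTML page."""
    await route.fulfill(
        status=200,
        content_type='text/html',
        body=mock_body_for(route.request.url),
    )
```

The handler would be registered the same way as in the fixture, with `await context.route('**/*', handle_route)`. Keeping the mapping in plain data also makes it easy to add pages for new test cases.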