fix: fix match check for specified enqueue strategy for requests with redirect by Mantisus · Pull Request #1199 · apify/crawlee-python
Pull Request Overview
This PR fixes the match check for the enqueue strategy by ensuring that the original URL is used for comparison instead of the final URL after a redirect. Key changes include adding a new field (loaded_url) to test inputs, updating tests to assign the loaded URL, and modifying the crawler’s commit handler to use context.request.url for the URL check.
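The behavior described above can be illustrated with a minimal sketch. This is not the code from the PR: the `Request` dataclass, `same_hostname`, and `filter_links` are hypothetical stand-ins, and a same-hostname strategy is assumed purely for illustration. The point is that the match check compares candidate links against the original request URL rather than the URL reached after a redirect.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class Request:
    """Hypothetical stand-in for a crawler request (not the crawlee class)."""

    url: str  # URL the request was originally enqueued with
    loaded_url: Optional[str] = None  # final URL after any redirects


def same_hostname(target_url: str, origin_url: str) -> bool:
    """Assumed strategy for this sketch: hostnames must match exactly."""
    return urlparse(target_url).hostname == urlparse(origin_url).hostname


def filter_links(request: Request, links: List[str]) -> List[str]:
    """Keep only links that match the strategy relative to the original URL."""
    origin = request.url  # deliberately not request.loaded_url
    return [link for link in links if same_hostname(link, origin)]


request = Request(
    url="https://example.com/start",
    loaded_url="https://redirected.example.org/landing",
)
links = ["https://example.com/a", "https://redirected.example.org/b"]
kept = filter_links(request, links)
```

In this sketch, only the link on the original host is kept, even though the page was ultimately served from a different host after the redirect.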
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit/crawlers/_basic/test_basic_crawler.py | Added a new "loaded_url" field in the test input dataclass and updated test cases and request context assignment to simulate original URLs. |
| src/crawlee/crawlers/_basic/_basic_crawler.py | Changed the URL used for the match check by replacing the earlier "origin" variable with context.request.url. |