refactor!: update status code handling by Mantisus · Pull Request #1028 · apify/crawlee-python

Conversation

@Mantisus

Collaborator

Description

  • additional_http_error_status_codes and ignore_http_error_status_codes have been removed from the constructor parameters of the HTTP clients.
  • The _raise_for_error_status_code method has been removed from HttpClient; its logic now lives in BasicCrawler.
  • Status codes that indicate Session blocking are now checked first. Blocking codes (401, 403, 429) retire the Session and trigger a retry, while other 4XX codes are handled as client errors (errors without retries).
  • additional_http_error_status_codes is no longer consulted when checking for Session-blocking status codes. According to the current documentation, codes listed there should trigger a plain retry, not a Session retirement followed by a retry. The set of blocking status codes can be changed in SessionPool through create_session_settings.
  • Error handling for status codes is now standardized across PlaywrightCrawler and HttpCrawler.
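The priority order described in the list above can be illustrated with a small standalone sketch. It is not the code from this PR: the names Outcome and classify_status_code are hypothetical, and two behaviours are assumptions that the description does not state, namely that 5XX codes lead to a plain retry and that an ignored code is treated as a success.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for what a crawler does with a response."""

    OK = "ok"
    RETIRE_SESSION_AND_RETRY = "retire_session_and_retry"
    CLIENT_ERROR_NO_RETRY = "client_error_no_retry"
    RETRY = "retry"


# Blocking codes as listed in the PR description.
DEFAULT_BLOCKED_CODES = frozenset({401, 403, 429})


def classify_status_code(
    status_code,
    blocked_codes=DEFAULT_BLOCKED_CODES,
    additional_error_codes=frozenset(),
    ignored_codes=frozenset(),
):
    """Classify a status code, checking Session-blocking codes first."""
    # 1. Session-blocking codes take priority over everything else.
    #    (Assumption: an explicitly ignored code is not treated as blocking.)
    if status_code in blocked_codes and status_code not in ignored_codes:
        return Outcome.RETIRE_SESSION_AND_RETRY
    # 2. User-configured error codes trigger a plain retry, without
    #    retiring the Session.
    if status_code in additional_error_codes:
        return Outcome.RETRY
    # 3. Assumption: ignored codes are treated as a normal response.
    if status_code in ignored_codes:
        return Outcome.OK
    # 4. Remaining 4XX codes are client errors and are not retried.
    if 400 <= status_code < 500:
        return Outcome.CLIENT_ERROR_NO_RETRY
    # 5. Assumption: 5XX codes are retried.
    if status_code >= 500:
        return Outcome.RETRY
    return Outcome.OK
```

In this sketch a 403 that is also listed in additional_error_codes still retires the Session, because the blocking check runs first, while a 402 listed there changes from a non-retried client error to a plain retry.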

Issues

Testing

  • Tests for client error status codes now use code 402.
  • A separate test has been added for 403, since in that case the number of retries is governed by the max_session_rotations parameter.
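Why 402 and 403 need separate tests can be seen in a toy retry loop. This is a simplified model, not crawlee's implementation: count_attempts is a hypothetical helper, the default limits are arbitrary, and the exact way crawlee counts rotations and retries may differ from the accounting used here.

```python
# Blocking codes as listed in the PR description.
BLOCKED_CODES = frozenset({401, 403, 429})


def count_attempts(status_code, max_request_retries=3, max_session_rotations=10):
    """Count the requests a toy crawler sends to a URL that always
    answers with `status_code` (hypothetical model, not crawlee code)."""
    attempts = 0
    retries = 0
    rotations = 0
    while True:
        attempts += 1
        if status_code in BLOCKED_CODES:
            # Blocked: retire the session and try again with a new one,
            # until the rotation budget is used up.
            rotations += 1
            if rotations > max_session_rotations:
                return attempts
            continue
        if 400 <= status_code < 500:
            # Plain client error: fail immediately, no retry.
            return attempts
        if status_code >= 500:
            # Assumption: server errors use the ordinary retry budget.
            retries += 1
            if retries > max_request_retries:
                return attempts
            continue
        # Success: a single request is enough.
        return attempts
```

In this model a 402 always results in exactly one request regardless of either limit, whereas the number of requests for a 403 depends only on max_session_rotations, which is why a test asserting a fixed call count cannot cover both cases.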

@Mantisus self-assigned this

Feb 26, 2025
Collaborator

@Pijukatel left a comment

Nice. One tiny optional comment. Approved anyway.

@Mantisus requested review from janbuchar and removed request for vdusek

February 27, 2025 12:08
Collaborator

@janbuchar left a comment

Super nice! Thank you

@Pijukatel merged commit 6b59471 into apify:master

Feb 28, 2025

23 checks passed

Reviewers

@janbuchar approved these changes

@Pijukatel approved these changes

Assignees

@Mantisus

3 participants

@Mantisus @janbuchar @Pijukatel