feat: add `respect_robots_txt_file` option by Mantisus · Pull Request #1162 · apify/crawlee-python

added 9 commits

April 17, 2025 15:43

@Mantisus

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

@Mantisus Mantisus marked this pull request as ready for review

April 21, 2025 22:24

janbuchar

Co-authored-by: Jan Buchar <Teyras@gmail.com>

vdusek

Co-authored-by: Vlada Dusek <v.dusek96@gmail.com>

Pijukatel pushed a commit that referenced this pull request

May 2, 2025
…cording to `robots.txt` rules (#1166)

### Description

- This PR supplements #1162 by adding an `on_skipped_request` decorator
for handling requests that are skipped because of `robots.txt` rules
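
To illustrate the idea behind the description above, here is a minimal standard-library sketch of checking URLs against `robots.txt` rules and reporting the skipped ones to a callback. It does not use Crawlee's API and is not the implementation from this PR; `filter_by_robots`, the callback signature, and the `"robots_txt"` reason string are hypothetical names chosen for the example.

```python
from typing import Callable, Iterable, List, Optional
from urllib.robotparser import RobotFileParser

# Hypothetical callback type: receives the skipped URL and a reason string.
SkippedCallback = Callable[[str, str], None]


def filter_by_robots(
    robots_txt: str,
    urls: Iterable[str],
    user_agent: str = "*",
    on_skipped: Optional[SkippedCallback] = None,
) -> List[str]:
    """Return the URLs that the given robots.txt content allows.

    Disallowed URLs are passed to `on_skipped` (if provided) instead of
    being returned.
    """
    parser = RobotFileParser()
    # Parse the rules from an in-memory string; no network access is needed.
    parser.parse(robots_txt.splitlines())

    allowed: List[str] = []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        elif on_skipped is not None:
            # Assumed reason label, used only in this sketch.
            on_skipped(url, "robots_txt")
    return allowed


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    candidates = [
        "https://example.com/",
        "https://example.com/private/report",
    ]
    kept = filter_by_robots(
        rules,
        candidates,
        on_skipped=lambda url, reason: print(f"skipped {url} ({reason})"),
    )
    print("allowed:", kept)
```

Passing the handler in as a plain callable keeps the sketch small; a decorator-based registration, as the description mentions, would store the same kind of callable on the crawler instance.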

### Issues

- Closes: #1160
- Related: #1162

---------

Co-authored-by: Vlada Dusek <v.dusek96@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Apify Release Bot <noreply@apify.com>