refactor!: Refactor storage creation and caching, configuration and services by Pijukatel · Pull Request #1386 · apify/crawlee-python

added 11 commits

August 28, 2025 17:04
Rework of service_locator implicit setting of services, storages and storage creation.

@github-actions bot added the t-tooling and tested labels

Sep 2, 2025

@Pijukatel Pijukatel changed the title feat: Rework storage creation and caching, configuration and services feat!: Rework storage creation and caching, configuration and services

Sep 10, 2025

@Pijukatel Pijukatel changed the title feat!: Rework storage creation and caching, configuration and services refactor!: Refactor storage creation and caching, configuration and services

Sep 10, 2025


Co-authored-by: Vlada Dusek <v.dusek96@gmail.com>


@Pijukatel Pijukatel deleted the storage-clients-and-configurations-2 branch

September 16, 2025 15:13

Pijukatel added a commit to apify/apify-sdk-python that referenced this pull request

Sep 18, 2025
…576)

### Description

- All relevant parts of `Actor` are initialized in `async init`, not in
`__init__`.
- `Actor` is considered finalized after `Actor.init` has run. This also
means that the same configuration used by the `Actor` is set in the
global `service_locator`.
- There are three valid scenarios for setting up the configuration:
  - Setting the global configuration in `service_locator` before
    `Actor.init`.
  - Having no configuration set in `service_locator`, passing one through
    `Actor(configuration=...)`, and running `Actor.init()`.
  - Having no configuration set in `service_locator` and passing no
    configuration to `Actor`, which creates and sets an implicit default
    configuration.
- Properly set `ApifyFileSystemStorageClient` as local client to support
pre-existing input file.
- Depends on apify/crawlee-python/pull/1386
- Enable caching of `ApifyStorageClient` based on `token` and
`api_public_url` and update NDU storage handling.
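
The three configuration scenarios listed above can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: `Configuration`, `ServiceLocator`, and `Actor` below are hypothetical stand-ins written with the standard library, not the real `apify` or `crawlee` classes, and raising on a conflicting configuration is an assumed behavior for a prohibited case.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Configuration:
    """Hypothetical stand-in for the real configuration class."""

    token: Optional[str] = None


class ServiceLocator:
    """Toy registry: once a configuration is set, it cannot be swapped."""

    def __init__(self) -> None:
        self._configuration: Optional[Configuration] = None

    def set_configuration(self, configuration: Configuration) -> None:
        # Assumption: a conflicting second configuration is a prohibited case.
        if self._configuration is not None and self._configuration != configuration:
            raise RuntimeError("A different configuration is already set.")
        self._configuration = configuration

    def get_configuration(self) -> Optional[Configuration]:
        return self._configuration


class Actor:
    """Toy actor: __init__ only stores arguments, init() does the real setup."""

    def __init__(
        self, locator: ServiceLocator, configuration: Optional[Configuration] = None
    ) -> None:
        self._locator = locator
        self._requested = configuration
        self.configuration: Optional[Configuration] = None

    async def init(self) -> None:
        if self._requested is not None:
            # Scenario 2: configuration passed to the actor becomes global.
            self._locator.set_configuration(self._requested)
        elif self._locator.get_configuration() is None:
            # Scenario 3: nothing set anywhere, create an implicit default.
            self._locator.set_configuration(Configuration())
        # Scenario 1: a global configuration was set beforehand and is reused.
        self.configuration = self._locator.get_configuration()


async def demo() -> List[Optional[str]]:
    tokens: List[Optional[str]] = []

    locator = ServiceLocator()  # Scenario 1
    locator.set_configuration(Configuration(token="global"))
    actor = Actor(locator)
    await actor.init()
    tokens.append(actor.configuration.token)

    locator = ServiceLocator()  # Scenario 2
    actor = Actor(locator, Configuration(token="explicit"))
    await actor.init()
    tokens.append(locator.get_configuration().token)

    locator = ServiceLocator()  # Scenario 3
    actor = Actor(locator)
    await actor.init()
    tokens.append(locator.get_configuration().token)

    return tokens


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In every scenario the actor and the locator end up sharing one configuration object after `init()`, which is the property the description relies on.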

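Caching a storage client by `token` and `api_public_url`, as mentioned in the last item above, could look roughly like the sketch below. The `StorageClient` class, the `get_storage_client` helper, and the example URL are hypothetical and only illustrate keying a cache on those two values; they are not the SDK's implementation.

```python
from typing import Dict, Optional, Tuple


class StorageClient:
    """Hypothetical stand-in for a storage client bound to API credentials."""

    def __init__(self, token: Optional[str], api_public_url: str) -> None:
        self.token = token
        self.api_public_url = api_public_url


# Cache keyed by the two values named in the description.
_client_cache: Dict[Tuple[Optional[str], str], StorageClient] = {}


def get_storage_client(token: Optional[str], api_public_url: str) -> StorageClient:
    """Return the cached client for this (token, api_public_url) pair."""
    key = (token, api_public_url)
    if key not in _client_cache:
        _client_cache[key] = StorageClient(token, api_public_url)
    return _client_cache[key]


if __name__ == "__main__":
    first = get_storage_client("token-a", "https://api.example.com")
    second = get_storage_client("token-a", "https://api.example.com")
    other = get_storage_client("token-b", "https://api.example.com")
    print(first is second, first is other)
```

Requests with the same pair share one client, while a different token or URL gets its own instance.
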
### Issues

Related to: #513, #590

### Testing

- Added many new initialization tests that show possible and prohibited
use cases
https://github.com/apify/apify-sdk-python/pull/576/files#diff-d64e1d346cc84a225ace3eb1d1ca826ff1e25c77064c9b1e0145552845fa7b41
- Running benchmark actor based on this and the related Crawlee branch

---------

Co-authored-by: Vlada Dusek <v.dusek96@gmail.com>