feat: Dynamic memory snapshots by Pijukatel · Pull Request #1715 · apify/crawlee-python
Pull request overview
This PR introduces dynamic memory snapshot support to address autoscaling limitations in environments with variable memory allocations (e.g., Kubernetes burstable QoS). It adds a `Ratio` type that allows the autoscaler to dynamically query available system memory rather than being locked to an initial baseline.
Changes:
- Introduced `Ratio` type for representing dynamic memory as a proportion of total system memory
- Modified `Snapshotter` and `MemorySnapshot` to accept either `ByteSize` (fixed) or `Ratio` (dynamic) for memory limits
- Added logic to dynamically evaluate memory overload based on current available memory when using `Ratio`
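
To illustrate the fixed-versus-dynamic distinction described in the list above, here is a minimal sketch. It is not the code from this PR: the PR describes `Ratio` as a Pydantic model, whereas this sketch uses plain dataclasses to stay self-contained. The `ByteSize` and `Ratio` classes here are simplified stand-ins, and `resolve_limit` and its `get_total_memory` parameter are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class ByteSize:
    """Simplified stand-in for a fixed memory limit, in bytes."""

    bytes: int


@dataclass(frozen=True)
class Ratio:
    """Simplified stand-in for a limit given as a share of total memory."""

    value: float

    def __post_init__(self) -> None:
        # Mirrors the validation range stated in the review: 0.0 < value <= 1.0.
        if not 0.0 < self.value <= 1.0:
            raise ValueError(f"Ratio must be in (0.0, 1.0], got {self.value}")


def resolve_limit(
    limit: Union[ByteSize, Ratio],
    get_total_memory: Callable[[], int],
) -> int:
    """Return the effective limit in bytes (hypothetical helper).

    A ByteSize is returned unchanged. A Ratio is evaluated against the total
    memory reported at call time, so the result follows the environment.
    """
    if isinstance(limit, Ratio):
        return int(get_total_memory() * limit.value)
    return limit.bytes


GIB = 1024**3
fixed_limit = resolve_limit(ByteSize(2 * GIB), lambda: 8 * GIB)
dynamic_limit = resolve_limit(Ratio(0.5), lambda: 8 * GIB)
```

With a fixed `ByteSize`, the limit stays at 2 GiB regardless of what the system reports. With `Ratio(0.5)`, the limit is recomputed on every call from whatever total the callback returns.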
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/crawlee/_utils/byte_size.py` | Adds `Ratio` Pydantic model with validation for memory ratios (0.0 < value ≤ 1.0) |
| `src/crawlee/_autoscaling/snapshotter.py` | Updates `max_memory_size` parameter to accept `ByteSize \| Ratio` and dynamically calculates memory limits when using `Ratio` |
| `src/crawlee/_autoscaling/_types.py` | Modifies `MemorySnapshot.is_overloaded` to dynamically query system memory when `max_memory_size` is a `Ratio` |
| `tests/unit/_autoscaling/test_snapshotter.py` | Adds comprehensive test simulating memory scale-up/scale-down scenarios with mocked memory info |
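
The overload check and the scale-up/scale-down test scenario summarized in the table can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: the `MemorySnapshot` below is a simplified stand-in, the `total_memory_provider` field is a hypothetical hook used here in place of whatever system query the real code performs, and the plain `int` limit stands in for `ByteSize`.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Ratio:
    value: float  # assumed already validated: 0.0 < value <= 1.0


@dataclass
class MemorySnapshot:
    """Simplified stand-in; field names are assumptions for this sketch."""

    current_size: int  # bytes in use when the snapshot was taken
    max_memory_size: Union[int, Ratio]  # fixed bytes, or share of total memory
    max_used_memory_ratio: float  # fraction of the limit treated as overloaded
    total_memory_provider: Callable[[], int]  # hypothetical hook, mocked below

    @property
    def is_overloaded(self) -> bool:
        if isinstance(self.max_memory_size, Ratio):
            # Dynamic case: re-read total memory on every evaluation.
            limit = self.total_memory_provider() * self.max_memory_size.value
        else:
            # Fixed case: the limit never changes.
            limit = self.max_memory_size
        return self.current_size / limit > self.max_used_memory_ratio


GIB = 1024**3
total = {"bytes": 4 * GIB}  # mocked memory info, mutated to simulate scaling

snapshot = MemorySnapshot(
    current_size=3 * GIB,
    max_memory_size=Ratio(0.5),
    max_used_memory_ratio=0.9,
    total_memory_provider=lambda: total["bytes"],
)

overloaded_before = snapshot.is_overloaded  # limit is 2 GiB, usage is 3 GiB
total["bytes"] = 8 * GIB  # simulated scale-up
overloaded_after_scale_up = snapshot.is_overloaded  # limit is now 4 GiB
total["bytes"] = 4 * GIB  # simulated scale-down
overloaded_after_scale_down = snapshot.is_overloaded  # limit is 2 GiB again
```

The same snapshot flips between overloaded and not overloaded as the mocked total changes, which is the behavior a fixed baseline cannot express.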