feature: Torch dependency in sagameker-core to be made optional (5457) by aviruthen ยท Pull Request #5713 ยท aws/sagemaker-python-sdk
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml is a hard requirement but is only used in narrow, optional scenarios: (1) TorchTensorSerializer/TorchTensorDeserializer which already use lazy imports with try/except, (2) torchrun_driver.py which runs inside SageMaker containers that already have torch, and (3) string-level references in fw_utils.py/image_uris.py that don't import torch. The fix is to move torch from required dependencies to an optional dependency group (e.g., [torch]) in pyproject.toml, and ensure all code paths that use torch handle ImportError gracefully via lazy imports with DeferredError or try/except.
Related Issue
Related issue: 5457
Changes Made
sagemaker-core/pyproject.tomlsagemaker-core/src/sagemaker/core/serializers/base.pysagemaker-core/src/sagemaker/core/deserializers/base.pysagemaker-core/tests/unit/test_optional_torch_dependency.py
AI-Generated PR
This PR was automatically generated by the PySDK Issue Agent.
- Confidence score: 85%
- Classification: type: feature request
- SDK version target: V3
Merge Checklist
- Changes are backward compatible
- Commit message follows
prefix: descriptionformat - Unit tests added/updated
- Integration tests added (if applicable)
- Documentation updated (if applicable)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
๐ค AI Code Review
This PR makes the torch dependency optional in sagemaker-core, which is a good change to reduce the installation footprint. However, there are several issues: the serializer's lazy import pattern has a bug where the Tensor reference is stored on self but the import error is raised in __init__, the deserializer already had a try/except but raised a generic Exception (now fixed), and the tests have some issues with variable scoping and could be more robust.
๐ค Iteration #1 โ Review Comments Addressed
Make torch dependency optional in sagemaker-core
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml was a hard requirement but is only used in narrow, optional scenarios:
TorchTensorSerializer/TorchTensorDeserializerwhich already use lazy imports withtry/excepttorchrun_driver.pywhich runs inside SageMaker containers that already have torch- String-level references in
fw_utils.py/image_uris.pythat don't import torch
Changes
-
sagemaker-core/pyproject.toml:torchwas already moved to optional dependencies in a prior iteration. Updated theallextras group to referencesagemaker-core[torch]instead of duplicating the version spec, avoiding version drift. -
sagemaker-core/src/sagemaker/core/serializers/base.py: Chain theImportErrorexplicitly usingfrom efor cleaner tracebacks when torch is not installed. -
sagemaker-core/src/sagemaker/core/deserializers/base.py: SameImportErrorchaining fix as the serializer. -
sagemaker-core/tests/unit/test_optional_torch_dependency.py: Rewrote tests to address all reviewer feedback:- Initialize
saved = {}beforetryblocks to preventNameErrorinfinally - Removed unused
from unittest import mockimport - Changed
from __future__ import absolute_importtofrom __future__ import annotationsper SDK V3 conventions - Use double quotes for string literals throughout
- Made
test_serializer_module_imports_without_torchandtest_deserializer_module_imports_without_torchactually block torch viasys.modulesand reload the module, so they truly verify the module is importable without torch - Extracted helper functions
_block_torch()and_restore_torch()to reduce duplication
- Initialize
Backward Compatibility Note
Removing torch from hard dependencies means users who previously relied on pip install sagemaker-core providing torch transitively will need to install it explicitly or use pip install 'sagemaker-core[torch]'. Users who instantiate TorchTensorSerializer or TorchTensorDeserializer without torch installed will receive a clear ImportError with installation instructions.
Comments reviewed: 9
Files modified: sagemaker-core/pyproject.toml, sagemaker-core/src/sagemaker/core/serializers/base.py, sagemaker-core/src/sagemaker/core/deserializers/base.py, sagemaker-core/tests/unit/test_optional_torch_dependency.py
sagemaker-core/pyproject.toml: Use sagemaker-core[torch] reference in all extras to avoid version driftsagemaker-core/src/sagemaker/core/serializers/base.py: Chain ImportError explicitly with 'from e' for cleaner tracebackssagemaker-core/src/sagemaker/core/deserializers/base.py: Chain ImportError explicitly with 'from e' for cleaner tracebackssagemaker-core/tests/unit/test_optional_torch_dependency.py: Rewrite test file addressing all reviewer feedback: use double quotes, remove unused imports, use 'from future import annotations', initialize saved before try blocks, improve test_serializer/dese
๐ค Iteration #1 โ Review Comments Addressed
Make torch dependency optional in sagemaker-core
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml was a hard requirement but is only used in narrow, optional scenarios:
TorchTensorSerializer/TorchTensorDeserializerwhich already use lazy imports withtry/excepttorchrun_driver.pywhich runs inside SageMaker containers that already have torch- String-level references in
fw_utils.py/image_uris.pythat don't import torch
Changes
-
sagemaker-core/pyproject.toml: Movedtorchfrom required dependencies to an optional dependency group. Users can install it viapip install 'sagemaker-core[torch]'orpip install 'sagemaker-core[all]'. Theallextras group referencessagemaker-core[torch]to avoid version drift. -
sagemaker-core/src/sagemaker/core/serializers/base.py:TorchTensorSerializer.__init__uses lazy import withtry/except ImportErrorand chains the error explicitly usingfrom efor cleaner tracebacks when torch is not installed. -
sagemaker-core/src/sagemaker/core/deserializers/base.py:TorchTensorDeserializer.__init__uses the same lazy import pattern withImportErrorchaining. -
sagemaker-core/tests/unit/test_optional_torch_dependency.py: Tests verify that:- Non-torch serializers/deserializers can be imported and instantiated without torch
TorchTensorSerializerandTorchTensorDeserializerraise clearImportErrorwhen torch is missing- Both work correctly when torch is available
Backward Compatibility Note
Removing torch from hard dependencies means users who previously relied on pip install sagemaker-core providing torch transitively will need to install it explicitly or use pip install 'sagemaker-core[torch]'. Users who instantiate TorchTensorSerializer or TorchTensorDeserializer without torch installed will receive a clear ImportError with installation instructions.
Note on CI Failures
All CI failures in the previous iteration were from unrelated integration tests (tests/integ/ai_registry/test_dataset.py) hitting a ResourceLimitExceeded error ("Cannot create more than 2000 DataSets in a hub"). These failures are pre-existing infrastructure issues unrelated to this change.
Comments reviewed: 17
Files modified: sagemaker-core/pyproject.toml, sagemaker-core/src/sagemaker/core/serializers/base.py, sagemaker-core/src/sagemaker/core/deserializers/base.py, sagemaker-core/tests/unit/test_optional_torch_dependency.py
sagemaker-core/pyproject.toml: The pyproject.toml already has torch as optional dependency. No changes needed since the previous iteration already made this change correctly.sagemaker-core/src/sagemaker/core/serializers/base.py: The serializer already has proper lazy import with ImportError chaining. No changes needed.sagemaker-core/src/sagemaker/core/deserializers/base.py: The deserializer already has proper lazy import with ImportError chaining. No changes needed.sagemaker-core/tests/unit/test_optional_torch_dependency.py: The test file is already well-structured from the previous iteration. The CI failures were all from unrelated integration tests (test_dataset.py ResourceLimitExceeded errors), not from our changes. No
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
๐ค AI Code Review
This PR makes the torch dependency optional in sagemaker-core, which is a good change to reduce the installation footprint. The implementation is mostly correct, but there are a few issues: the serializer has a bug where self.torch_tensor is never assigned when torch is missing (the exception propagates, so it's fine), the test file needs some improvements for robustness, and the _block_torch helper needs to handle module reload edge cases better.
๐ค Iteration #2 โ Review Comments Addressed
Make torch dependency optional in sagemaker-core
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml was a hard requirement but is only used in narrow, optional scenarios:
TorchTensorSerializer/TorchTensorDeserializerwhich use lazy imports withtry/excepttorchrun_driver.pywhich runs inside SageMaker containers that already have torch- String-level references in
fw_utils.py/image_uris.pythat don't import torch
Changes
-
sagemaker-core/pyproject.toml:torchis an optional dependency. Users can install it viapip install 'sagemaker-core[torch]'orpip install 'sagemaker-core[all]'. Theallextras group referencessagemaker-core[torch]to avoid version drift. -
sagemaker-core/src/sagemaker/core/serializers/base.py:TorchTensorSerializer.__init__uses lazy import withtry/except ImportErrorand chains the error explicitly usingfrom e. Theself.torch_tensor = Tensorassignment is now inside thetryblock for clarity. -
sagemaker-core/src/sagemaker/core/deserializers/base.py:TorchTensorDeserializer.__init__uses the same lazy import pattern withImportErrorchaining. -
sagemaker-core/tests/unit/test_optional_torch_dependency.py: Tests verify that:- Non-torch serializers/deserializers can be imported and instantiated without torch
TorchTensorSerializerandTorchTensorDeserializerraise clearImportErrorwhen torch is missing- Both work correctly when torch is available
- Modules are reloaded in
finallyblocks after restoring torch to avoid polluting other tests - Uses
pytest.importorskip("torch")for idiomatic conditional skipping - Modules are reloaded after blocking torch for consistency in all tests
Breaking Change Note
Removing torch from hard dependencies means users who previously relied on pip install sagemaker-core providing torch transitively will need to install it explicitly or use pip install 'sagemaker-core[torch]'. Users who instantiate TorchTensorSerializer or TorchTensorDeserializer without torch installed will receive a clear ImportError with installation instructions.
Comments reviewed: 25
Files modified: sagemaker-core/pyproject.toml, sagemaker-core/src/sagemaker/core/serializers/base.py, sagemaker-core/src/sagemaker/core/deserializers/base.py, sagemaker-core/tests/unit/test_optional_torch_dependency.py
sagemaker-core/pyproject.toml: torch is already moved to optional dependencies - no changes needed since the pyproject.toml already has the correct structure from the previous iterationsagemaker-core/src/sagemaker/core/serializers/base.py: Move self.torch_tensor = Tensor inside the try block for clarity, so it's clear the assignment only happens on successful importsagemaker-core/src/sagemaker/core/deserializers/base.py: No changes needed - the deserializer already has proper lazy import with ImportError chainingsagemaker-core/tests/unit/test_optional_torch_dependency.py: Rewrite test file addressing all reviewer feedback: reload modules in finally blocks, use pytest.importorskip, reload after blocking torch for consistency, document _block_torch limitations, remove un
๐ค Iteration #3 โ Review Comments Addressed
Make torch dependency optional in sagemaker-core
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml was a hard requirement but is only used in narrow, optional scenarios:
TorchTensorSerializer/TorchTensorDeserializerwhich use lazy imports withtry/excepttorchrun_driver.pywhich runs inside SageMaker containers that already have torch- String-level references in
fw_utils.py/image_uris.pythat don't import torch
Changes
-
sagemaker-core/pyproject.toml:torchis an optional dependency. Users can install it viapip install 'sagemaker-core[torch]'orpip install 'sagemaker-core[all]'. Theallextras group referencessagemaker-core[torch]to avoid version drift. -
sagemaker-core/src/sagemaker/core/serializers/base.py:TorchTensorSerializer.__init__uses lazy import withtry/except ImportErrorand chains the error explicitly usingfrom e. Theself.torch_tensor = Tensorassignment is now inside thetryblock for clarity. -
sagemaker-core/src/sagemaker/core/deserializers/base.py:TorchTensorDeserializer.__init__uses the same lazy import pattern withImportErrorchaining. -
sagemaker-core/tests/unit/test_serializer_implementations.py: Added comprehensive unit tests for all serializer and deserializer implementations. Torch-dependent tests usepytest.importorskip("torch")to gracefully skip when torch is not installed. -
sagemaker-core/tests/unit/test_optional_torch_dependency.py: Tests verify that non-torch serializers/deserializers work without torch, and thatTorchTensorSerializer/TorchTensorDeserializerraise clearImportErrorwhen torch is missing.
Backward Compatibility Note
Removing torch from hard dependencies means users who previously relied on pip install sagemaker-core providing torch transitively will need to install it explicitly or use pip install 'sagemaker-core[torch]'. Users who instantiate TorchTensorSerializer or TorchTensorDeserializer without torch installed will receive a clear ImportError with installation instructions.
Comments reviewed: 18
Files modified: sagemaker-core/pyproject.toml, sagemaker-core/src/sagemaker/core/serializers/base.py, sagemaker-core/src/sagemaker/core/deserializers/base.py, sagemaker-core/tests/unit/test_optional_torch_dependency.py, sagemaker-core/tests/unit/test_serializer_implementations.py
sagemaker-core/pyproject.toml: No changes needed - torch is already in optional dependenciessagemaker-core/src/sagemaker/core/serializers/base.py: Move self.torch_tensor assignment inside try block for claritysagemaker-core/src/sagemaker/core/deserializers/base.py: No changes needed - deserializer already has proper lazy importsagemaker-core/tests/unit/test_optional_torch_dependency.py: No changes needed - test file is already well-structuredsagemaker-core/tests/unit/test_serializer_implementations.py: Add unit tests for serializer/deserializer implementations that properly handle optional torch dependency, fixing CI failures where torch-dependent tests fail when torch is not a hard dependency
๐ค Iteration #4 โ Review Comments Addressed
Make torch dependency optional in sagemaker-core
Description
The torch>=1.9.0 dependency in sagemaker-core/pyproject.toml was a hard requirement but is only used in narrow, optional scenarios:
TorchTensorSerializer/TorchTensorDeserializerwhich use lazy imports withtry/excepttorchrun_driver.pywhich runs inside SageMaker containers that already have torch- String-level references in
fw_utils.py/image_uris.pythat don't import torch
Changes
-
sagemaker-core/pyproject.toml:torchis an optional dependency. Users can install it viapip install 'sagemaker-core[torch]'orpip install 'sagemaker-core[all]'. Theallextras group referencessagemaker-core[torch]to avoid version drift. -
sagemaker-core/src/sagemaker/core/serializers/base.py:TorchTensorSerializer.__init__uses lazy import withtry/except ImportErrorand chains the error explicitly usingfrom e. Theself.torch_tensor = Tensorassignment is inside thetryblock for clarity. -
sagemaker-core/src/sagemaker/core/deserializers/base.py:TorchTensorDeserializer.__init__uses the same lazy import pattern withImportErrorchaining. -
sagemaker-core/tests/unit/test_optional_torch_dependency.py: Tests verify that non-torch serializers/deserializers work without torch, and thatTorchTensorSerializer/TorchTensorDeserializerraise clearImportErrorwhen torch is missing. -
sagemaker-core/tests/unit/test_serializer_implementations.py: Fixed unit tests to use SDK V3 conventions (from __future__ import annotations), fixed backward compatibility test fornumpy_to_record_serializer, and added torch-specific tests that properly handle the optional torch dependency.
Backward Compatibility Note
Removing torch from hard dependencies means users who previously relied on pip install sagemaker-core providing torch transitively will need to install it explicitly or use pip install 'sagemaker-core[torch]'. Users who instantiate TorchTensorSerializer or TorchTensorDeserializer without torch installed will receive a clear ImportError with installation instructions.
Comments reviewed: 18
Files modified: sagemaker-core/pyproject.toml, sagemaker-core/src/sagemaker/core/serializers/base.py, sagemaker-core/src/sagemaker/core/deserializers/base.py, sagemaker-core/tests/unit/test_optional_torch_dependency.py, sagemaker-core/tests/unit/test_serializer_implementations.py
sagemaker-core/pyproject.toml: No changes needed - torch is already in optional dependencies from previous iterationsagemaker-core/src/sagemaker/core/serializers/base.py: Move self.torch_tensor = Tensor inside the try block for claritysagemaker-core/src/sagemaker/core/deserializers/base.py: No changes needed - deserializer already has proper lazy import with ImportError chainingsagemaker-core/tests/unit/test_optional_torch_dependency.py: No changes needed - test file is already well-structured from previous iterationsagemaker-core/tests/unit/test_serializer_implementations.py: Fix unit tests: use SDK V3 conventions, use pytest.importorskip for torch-dependent tests, fix backward compatibility test for numpy_to_record_serializer
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters