fix(fingerprint): treat TMPDIR as strict API and fail (Issue #7877) by ada-ggf25 · Pull Request #7891 · huggingface/datasets
added 4 commits
November 29, 2025 20:08
…mpCacheDir

Enhanced the _TempCacheDir.__init__ method to properly respect and handle the TMPDIR environment variable when creating temporary cache directories.

Changes:
- Add TMPDIR environment variable detection and validation
- Normalise paths to handle path resolution issues
- Auto-create TMPDIR directory if it doesn't exist to prevent silent fallback to default temporary directory
- Validate that TMPDIR is actually a directory before use
- Explicitly pass directory to mkdtemp to ensure TMPDIR is respected even if tempfile.gettempdir() was already called and cached
- Add appropriate logging for directory creation and fallback scenarios

This ensures that when TMPDIR is set, the temporary cache files are created in the specified directory rather than silently falling back to the system default temporary directory.
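The point about tempfile.gettempdir() caching can be illustrated with a small standard-library sketch. It is not code from this PR: `demo_cached_tempdir` and the dictionary keys are hypothetical names chosen for illustration, and a freshly created directory stands in for a user-supplied TMPDIR.

```python
import os
import shutil
import tempfile


def demo_cached_tempdir():
    """Show that changing TMPDIR after gettempdir() has run is not picked up
    implicitly, while passing dir= to mkdtemp is always honoured.

    Hypothetical helper for illustration only; not part of the PR.
    """
    original = os.environ.get("TMPDIR")
    cached_default = tempfile.gettempdir()  # first call caches the result
    custom = tempfile.mkdtemp()             # stand-in for a user-chosen TMPDIR
    implicit = None
    try:
        os.environ["TMPDIR"] = custom
        still_cached = tempfile.gettempdir()     # returns the cached value
        implicit = tempfile.mkdtemp()            # placed under the cached default
        explicit = tempfile.mkdtemp(dir=custom)  # placed under the new TMPDIR
        return {
            "cached_default": cached_default,
            "still_cached": still_cached,
            "custom": custom,
            "implicit_parent": os.path.dirname(implicit),
            "explicit_parent": os.path.dirname(explicit),
        }
    finally:
        # Restore the environment and remove the directories created above.
        if original is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = original
        if implicit is not None:
            shutil.rmtree(implicit, ignore_errors=True)
        shutil.rmtree(custom, ignore_errors=True)
```

Comparing `implicit_parent` and `explicit_parent` in the returned dictionary shows why relying on the implicit default is fragile once the tempfile module has already resolved its directory.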
…mpCacheDir

Add test coverage for the improved TMPDIR environment variable handling in the _TempCacheDir class. These tests verify the various scenarios for TMPDIR usage and error handling.

Changes:
- Refactor test_fingerprint_in_multiprocessing to use Pool.map for cleaner test implementation
- Add test_temp_cache_dir_with_tmpdir_nonexistent to verify TMPDIR auto-creation when directory doesn't exist
- Add test_temp_cache_dir_with_tmpdir_existing to verify correct behaviour when TMPDIR exists and is valid
- Add test_temp_cache_dir_without_tmpdir to verify fallback to default temporary directory when TMPDIR is not set
- Add test_temp_cache_dir_tmpdir_creation_failure to verify graceful error handling and fallback when TMPDIR creation fails

These tests ensure that the TMPDIR improvements work correctly across all scenarios and edge cases, including proper logging and fallback behaviour.
Refine TMPDIR-related failure tests for _TempCacheDir to assert explicit error conditions instead of fallback behaviour.

Changes:
- Update test_temp_cache_dir_tmpdir_creation_failure to use _TempCacheDir directly and assert that an OSError is raised with a clear TMPDIR context when directory creation fails
- Introduce test_temp_cache_dir_tmpdir_not_directory to verify that pointing TMPDIR at a non-directory raises an OSError with an informative error message

These tests better match the intended contract of _TempCacheDir by ensuring invalid TMPDIR configurations fail loudly with descriptive messages rather than silently falling back.
…loudly

Tighten TMPDIR handling in _TempCacheDir so that invalid configurations raise clear errors instead of silently falling back to the default temporary directory.

Changes:
- When TMPDIR points to a non-existent directory, raise an OSError with explicit guidance to create it manually or unset TMPDIR
- When TMPDIR points to a non-directory path, raise an OSError with guidance to point TMPDIR to a writable directory or unset it
- Remove previous warning-and-fallback behaviour to avoid masking configuration issues

This ensures that TMPDIR misconfigurations are surfaced early and clearly, aligning runtime behaviour with the stricter expectations codified in the new tests.
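A minimal sketch of the strict behaviour described in this commit, using only the standard library. It is not the PR's implementation: `make_strict_temp_dir`, its `prefix` default, the normalisation via `abspath`/`expanduser`, and the exact error wording are all assumptions made for illustration.

```python
import os
import tempfile


def make_strict_temp_dir(prefix="hf_datasets-"):
    """Create a temporary directory, treating TMPDIR as a strict setting.

    Hypothetical stand-in for the logic described above; the real
    _TempCacheDir.__init__ may differ in structure and messages.
    """
    tmpdir = os.environ.get("TMPDIR")
    if not tmpdir:
        # TMPDIR is not set: use the default temporary directory.
        return tempfile.mkdtemp(prefix=prefix)

    # Assumed normalisation step (the PR only says paths are normalised).
    tmpdir = os.path.abspath(os.path.expanduser(tmpdir))

    if not os.path.exists(tmpdir):
        # No auto-creation and no fallback: fail with guidance.
        raise OSError(
            f"TMPDIR is set to '{tmpdir}' but that directory does not exist. "
            "Create it manually or unset TMPDIR."
        )
    if not os.path.isdir(tmpdir):
        raise OSError(
            f"TMPDIR is set to '{tmpdir}' but it is not a directory. "
            "Point TMPDIR to a writable directory or unset it."
        )

    # Pass dir= explicitly so the result does not depend on any value
    # that tempfile.gettempdir() may have cached earlier.
    return tempfile.mkdtemp(prefix=prefix, dir=tmpdir)
```

Under these assumptions, a test in the style described in the earlier commits would set TMPDIR to a missing path or to a regular file and assert that OSError is raised, rather than checking for a logged warning and a fallback directory.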