Fix: FileNotFoundError on Context initialization with stale cache (#5712) by Pavan-Rana · Pull Request #5740 · SQLMesh/sqlmesh

Description

Fixes #5712

When FileCache.__init__() scans the cache directory, it calls file.stat() on every file returned by glob(). In environments with persistent cache directories, a file can be deleted between the glob() call and the subsequent stat() call, resulting in a FileNotFoundError that prevents Context initialization entirely.

This fix wraps the stat() call in a try/except FileNotFoundError block, allowing the cache scan to skip stale entries gracefully rather than crashing.
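A minimal sketch of the pattern, not the actual SQLMesh code: the function name `prune_cache_dir`, its parameters, and the use of `st_mtime` for the expiry check are assumptions made for illustration. Only the `glob("*")` scan, the `startswith` check, and the `try/except FileNotFoundError` around `stat()` come from this description.

```python
import time
from pathlib import Path


def prune_cache_dir(cache_dir: Path, version_prefix: str, max_age_seconds: float) -> None:
    """Hypothetical stand-in for the directory scan done at cache initialization."""
    threshold = time.time() - max_age_seconds
    for file in cache_dir.glob("*"):
        try:
            # stat() is only reached for files with the current version prefix,
            # because `or` short-circuits on files from old cache versions.
            stale = (
                not file.name.startswith(version_prefix)
                or file.stat().st_mtime < threshold
            )
        except FileNotFoundError:
            # The file was removed between glob() and stat(); skip it.
            continue
        if stale:
            # missing_ok covers the same race on the delete path.
            file.unlink(missing_ok=True)
```

Catching only `FileNotFoundError` keeps other failures, such as permission errors, visible instead of silently swallowing them.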

Note: the issue's Option 1 suggested narrowing the glob to glob(f"{self._cache_version}*"), which would skip the startswith check. This implementation keeps the original glob("*") to preserve the existing behaviour of cleaning up files from old cache versions and expired files, while still handling the race condition.

Test Plan

Added test_file_cache_init_handles_stale_file in tests/utils/test_cache.py which:

  • Creates a cache file with the correct version prefix so stat() is forced to be called
  • Monkeypatches Path.stat to raise FileNotFoundError for that specific file, simulating the race condition
  • Asserts that FileCache.__init__() completes without raising

All existing tests pass.
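The race simulation in the test plan can be sketched roughly as follows. This is an illustration under assumptions, not the test added in this PR: the helper name `vanished` is hypothetical, and it uses `unittest.mock` from the standard library in place of a pytest monkeypatch fixture so that it runs without pytest.

```python
from contextlib import contextmanager
from pathlib import Path
from unittest import mock


@contextmanager
def vanished(target: Path):
    """Make Path.stat raise FileNotFoundError for `target` only (hypothetical helper)."""
    real_stat = Path.stat

    def fake_stat(self, *args, **kwargs):
        if self == target:
            # Pretend the file was deleted after it was listed by glob().
            raise FileNotFoundError(str(self))
        return real_stat(self, *args, **kwargs)

    # patch.object restores the original Path.stat when the block exits.
    with mock.patch.object(Path, "stat", fake_stat):
        yield
```

Inside `with vanished(cache_file):`, constructing the cache against the directory containing `cache_file` exercises the code path where `stat()` fails for a file that `glob()` already returned, while every other path behaves normally.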

Checklist

  • I have run make style and fixed any issues
  • I have added tests for my changes (if applicable)
  • All existing tests pass (make fast-test)
  • My commits are signed off (git commit -s) per the DCO