Workaround for Windows being unable to remove tmp directories when installing GGUF files by lstein · Pull Request #8699 · invoke-ai/InvokeAI

Summary

Due to a core issue involving torch memory mapping of GGUF files, attempts to install GGUF models on Windows platforms fail with an error that rmtree() cannot remove the temporary directory created to hold the downloaded model. This PR addresses the issue in two ways:

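  1. It replaces torch.from_numpy(tensor.data) in the GGUF reader with torch.from_numpy(tensor.data.copy()). torch.from_numpy() shares memory with the NumPy array it is given, so the original tensor kept the memory-mapped file open. Copying the array first gives the tensor its own memory, and the file is no longer held open. This is the core issue and should fix the problem. But just in case...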
  2. It wraps the rmtree() and move() calls in retry loops that trigger garbage collection and pause briefly between attempts.
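
For reviewers who want a feel for the second change, here is a minimal sketch of a retry loop of that kind. It is not the code in this PR: the names retry_on_permission_error and remove_tmpdir, the attempt count, and the delay are all assumptions made for illustration.

```python
import gc
import shutil
import time
from pathlib import Path
from typing import Callable


def retry_on_permission_error(
    operation: Callable[[], None],
    attempts: int = 5,          # hypothetical default, not taken from the PR
    delay_seconds: float = 0.2,  # hypothetical default, not taken from the PR
) -> bool:
    """Run `operation`, retrying while the OS reports the path as in use.

    Returns True on success and False if every attempt raised PermissionError.
    """
    for attempt in range(attempts):
        try:
            operation()
            return True
        except PermissionError:
            # Collecting unreachable objects may release a lingering memory map.
            gc.collect()
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return False


def remove_tmpdir(path: Path) -> bool:
    """Try to delete a temporary directory, tolerating transient file locks."""
    return retry_on_permission_error(lambda: shutil.rmtree(path))
```

The same helper could wrap a shutil.move() call by passing a different lambda. Returning a boolean rather than raising lets the caller log a warning and defer cleanup when the directory is still locked after the last attempt.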

Related Issues / Discussions

See Discord discussion starting around message https://discord.com/channels/1020123559063990373/1049495067846524939/1453426867733401663.

QA Instructions

On a Windows system:

  1. Pull this PR and restart the server.
  2. Go to the Model Manager and install the GGUF at https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/resolve/main/Flux1-Dev-SRPO-v1-Q4_1.gguf?download=true (or any other GGUF model supported by InvokeAI)
  3. The model should install and run properly.
  4. Examine the log file. You should see a warning like this: Failed to remove temporary directory XXXXXX: Permission Denied.... It will be removed on next server start
  5. Restart the server. You should see a warning like Removing dangling temporary directory XXXXXX
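
The cleanup that steps 4 and 5 exercise can be pictured roughly as follows. This is only a sketch under stated assumptions: the directory prefix "tmpinstall_" and the function name remove_dangling_tmpdirs are hypothetical, and the actual InvokeAI implementation may locate and name its temporary directories differently.

```python
import logging
import shutil
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)

# Hypothetical prefix used to recognise leftover install directories.
TMPDIR_PREFIX = "tmpinstall_"


def remove_dangling_tmpdirs(models_root: Path, prefix: str = TMPDIR_PREFIX) -> List[Path]:
    """Delete leftover temporary directories under `models_root`.

    Returns the directories that were actually removed.
    """
    removed: List[Path] = []
    for candidate in sorted(models_root.glob(f"{prefix}*")):
        if not candidate.is_dir():
            continue  # only directories are treated as install leftovers
        logger.warning("Removing dangling temporary directory %s", candidate)
        # ignore_errors avoids aborting startup if a directory is still locked.
        shutil.rmtree(candidate, ignore_errors=True)
        if not candidate.exists():
            removed.append(candidate)
    return removed
```

Running such a sweep at startup works because the process that held the memory-mapped file open has exited by then, so Windows no longer blocks the deletion.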

Merge Plan

Small change. Should be a simple merge.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)