fix: move conditioning tensors to model device when partial loading enabled by aayushbaluni · Pull Request #8970 · invoke-ai/InvokeAI
Summary
Fixes #8850
When `enable_partial_loading` is true, conditioning embeddings from compel may remain on CPU while the UNet model is on CUDA. This causes a `RuntimeError: Expected all tensors to be on the same device` during denoising.
Root Cause
The `_apply_standard_conditioning` and `_apply_standard_conditioning_sequentially` methods pass conditioning tensors (`.embeds`, `.pooled_embeds`, `.add_time_ids`) directly to `model_forward_callback` without ensuring they are on the same device as the model input `x`. When partial loading offloads conditioning data to CPU to save VRAM, these tensors stay on CPU while the model expects CUDA tensors.
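A minimal sketch of the failure mode, using a hypothetical `FakeTensor` stand-in for `torch.Tensor` (the class, `model_forward`, and device strings are illustrative assumptions, not InvokeAI code):

```python
from dataclasses import dataclass

@dataclass
class FakeTensor:
    """Minimal stand-in for torch.Tensor: tracks only its device."""
    device: str

    def to(self, device: str) -> "FakeTensor":
        return self if self.device == device else FakeTensor(device)

def model_forward(x: FakeTensor, embeds: FakeTensor) -> str:
    # Mirrors PyTorch's behavior: ops on mixed-device tensors fail.
    if x.device != embeds.device:
        raise RuntimeError("Expected all tensors to be on the same device")
    return "ok"

x = FakeTensor("cuda:0")   # model input, on GPU
embeds = FakeTensor("cpu") # conditioning left on CPU by partial loading

try:
    model_forward(x, embeds)
except RuntimeError as e:
    print(e)  # Expected all tensors to be on the same device
```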
Fix
Before each `model_forward_callback` call, explicitly move conditioning tensors to `x.device`:
- `uncond_text.embeds` and `cond_text.embeds` via `.to(x.device)`
- `added_cond_kwargs` (SDXL `text_embeds` and `time_ids`) via `.to(x.device)`
Applied to both the batch path (_apply_standard_conditioning) and the sequential path (_apply_standard_conditioning_sequentially).
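The fix pattern above can be sketched as follows; `FakeTensor` and `model_forward_callback` are hypothetical stand-ins for the real PyTorch objects, included only so the snippet is self-contained:

```python
from dataclasses import dataclass

@dataclass
class FakeTensor:
    """Minimal stand-in for torch.Tensor: tracks only its device."""
    device: str

    def to(self, device: str) -> "FakeTensor":
        # Like torch.Tensor.to, a no-op when already on the target device.
        return self if self.device == device else FakeTensor(device)

def model_forward_callback(x: FakeTensor, embeds: FakeTensor) -> str:
    assert x.device == embeds.device, "tensors must share a device"
    return "denoised"

x = FakeTensor("cuda:0")          # model input on GPU
cond_embeds = FakeTensor("cpu")   # offloaded to CPU by partial loading

# The fix: align conditioning with the model input before the forward call.
cond_embeds = cond_embeds.to(x.device)
print(model_forward_callback(x, cond_embeds))  # denoised
```

Because `.to()` is a no-op when the tensor is already on the target device, this is safe to apply unconditionally in both the batch and sequential paths.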
Test Plan
- Enable `enable_partial_loading` in the InvokeAI config
- Generate an image using any SD/SDXL model
- Previously: crashes with `RuntimeError: Expected all tensors to be on the same device`
- Now: generation completes successfully
Made with Cursor