fix: move conditioning tensors to model device when partial loading enabled by aayushbaluni · Pull Request #8970 · invoke-ai/InvokeAI

Summary

Fixes #8850

When enable_partial_loading is true, the conditioning embeddings produced by compel may remain on the CPU while the UNet is on CUDA. This causes a RuntimeError: Expected all tensors to be on the same device during denoising.

Root Cause

The _apply_standard_conditioning and _apply_standard_conditioning_sequentially methods pass conditioning tensors (.embeds, .pooled_embeds, .add_time_ids) directly to model_forward_callback without ensuring they are on the same device as the model input x. When partial loading offloads conditioning data to CPU to save VRAM, these tensors stay on CPU while the model expects CUDA tensors.
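The failure mode can be illustrated with a minimal, self-contained sketch. MockTensor here is a stand-in for torch.Tensor (the real code uses PyTorch); the device check mirrors the one PyTorch ops perform internally:

```python
class MockTensor:
    """Stand-in for torch.Tensor: tracks only a device label."""
    def __init__(self, device):
        self.device = device

    def to(self, device):
        # Like torch.Tensor.to(): returns a copy on the target device.
        return MockTensor(device)

def model_forward(x, embeds):
    # Mirrors the device check that PyTorch ops perform internally.
    if x.device != embeds.device:
        raise RuntimeError("Expected all tensors to be on the same device")
    return "ok"

x = MockTensor("cuda:0")    # model input, on GPU
embeds = MockTensor("cpu")  # conditioning left on CPU by partial loading

try:
    model_forward(x, embeds)  # raises RuntimeError
except RuntimeError as e:
    print(e)

# Moving the conditioning tensor to x.device first succeeds:
print(model_forward(x, embeds.to(x.device)))  # prints "ok"
```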

Fix

Before each model_forward_callback call, explicitly move conditioning tensors to x.device:

  • uncond_text.embeds and cond_text.embeds via .to(x.device)
  • added_cond_kwargs (SDXL text_embeds and time_ids) via .to(x.device)

Applied to both the batch path (_apply_standard_conditioning) and the sequential path (_apply_standard_conditioning_sequentially).
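The shape of the fix can be sketched as follows. This is a simplified, hypothetical version of the change: the method and attribute names (_apply_standard_conditioning, .embeds, added_cond_kwargs, model_forward_callback) follow the PR description, but the surrounding class is omitted and Tensor is a stand-in for torch.Tensor:

```python
class Tensor:
    """Stand-in for torch.Tensor: tracks only a device label."""
    def __init__(self, device):
        self.device = device

    def to(self, device):
        return Tensor(device)

def apply_standard_conditioning(x, cond_embeds, uncond_embeds,
                                added_cond_kwargs, model_forward_callback):
    # The fix: move every conditioning tensor to the model input's device
    # before invoking the forward callback.
    cond_embeds = cond_embeds.to(x.device)
    uncond_embeds = uncond_embeds.to(x.device)
    added_cond_kwargs = {k: v.to(x.device) for k, v in added_cond_kwargs.items()}
    return model_forward_callback(x, cond_embeds, uncond_embeds, added_cond_kwargs)

def forward(x, cond, uncond, extras):
    # Like real PyTorch ops, fail if any tensor is on a different device.
    devices = {x.device, cond.device, uncond.device,
               *(t.device for t in extras.values())}
    if len(devices) != 1:
        raise RuntimeError("Expected all tensors to be on the same device")
    return "denoised"

x = Tensor("cuda:0")
out = apply_standard_conditioning(
    x, Tensor("cpu"), Tensor("cpu"),
    {"text_embeds": Tensor("cpu"), "time_ids": Tensor("cpu")},
    forward,
)
print(out)  # prints "denoised"
```

The same pattern is applied in both code paths; since .to() is a no-op copy when the tensor is already on the target device, the change is harmless when partial loading is disabled.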

Test Plan

  1. Enable enable_partial_loading in InvokeAI config
  2. Generate an image using any SD/SDXL model
  3. Previously: generation crashed with RuntimeError: Expected all tensors to be on the same device
  4. Now: generation completes successfully
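For step 1, partial loading is toggled in the InvokeAI configuration file. A minimal sketch (key placement and file layout may vary between InvokeAI versions):

```yaml
# invokeai.yaml (sketch; verify against your InvokeAI version's config schema)
enable_partial_loading: true
```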
