Feat[models]: Add support for Qwen Image by lstein · Pull Request #9003 · invoke-ai/InvokeAI
Adds full support for the Qwen Image Edit 2511 model architecture, including both the diffusers version (Qwen/Qwen-Image-Edit-2511) and GGUF quantized versions (unsloth/Qwen-Image-Edit-2511-GGUF).

Backend changes:
- Add QwenImageEdit base model type to taxonomy
- Add diffusers and GGUF model config classes with detection logic
- Add model loader for diffusers and GGUF formats
- Add 5 invocation nodes: model loader, text/vision encoder, denoise, image-to-latents, latents-to-image
- Add QwenVLEncoderField for Qwen2.5-VL vision-language encoder
- Add QwenImageEditConditioningInfo and conditioning field
- Add generation modes and step callback support
- Add 5 starter models (full diffusers + Q2_K, Q4_K_M, Q6_K, Q8_0 GGUF)

Frontend changes:
- Add graph builder for linear UI generation
- Register in canvas and generate enqueue hooks
- Update type definitions, optimal dimensions, grid sizes
- Add readiness validation, model picker grouping, clip skip config
- Regenerate OpenAPI schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: use AutoProcessor.from_pretrained to load Qwen VL processor correctly

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/4d4417be-0f61-4faa-a21c-16e9ce81fec7

chore: bump diffusers==0.37.1

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/38a76809-d9a3-40f1-b5b3-fb56342e8e90

fix: handle multiple reference images

feature: add text encoder selection to advanced section for Qwen Image Edit

feat: complete Qwen Image Edit pipeline with LoRA, GGUF, quantization, and UI support

Major additions:
- LoRA support: loader invocation, config detection, conversion utils, prefix constants, and LayerPatcher integration in denoise with sidecar patching for GGUF models
- Lightning LoRA: starter models (4-step and 8-step bf16), shift override parameter for the distilled sigma schedule
- GGUF fixes: correct base class (ModelLoader), zero_cond_t=True, correct in_channels (no /4 division)
- Denoise: use FlowMatchEulerDiscreteScheduler directly, proper CFG gating (skip negative when cfg<=1), reference latent pixel-space resize
- I2L: resize reference image to generation dimensions before VAE encoding
- Graph builder: wire LoRAs via collection loader, VAE-encode reference image as latents for spatial conditioning, pass shift/quantization params
- Frontend: shift override (checkbox+slider), LoRA graph wiring, scheduler hidden for Qwen Image Edit, model switching cleanup
- Starter model bundle for Qwen Image Edit
- LoRA config registered in discriminated union (factory.py)
- Downgrade transformers requirement back to >=4.56.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
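
For reviewers, the CFG gating mentioned under "Denoise" (skip the negative branch when cfg<=1) can be illustrated with a minimal sketch. This is not the code from this PR: `guided_prediction`, `predict`, and the plain-list `Vector` type are hypothetical names chosen for illustration, and the combine step is the standard classifier-free guidance formula, which may differ from what the denoise node actually applies.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a latent tensor; a real node would use torch.Tensor.
Vector = list[float]


def guided_prediction(
    predict: Callable[[str], Vector],
    positive: str,
    negative: Optional[str],
    cfg_scale: float,
) -> Vector:
    """Return the model prediction for one denoising step, with CFG gating."""
    cond = predict(positive)

    # Gate: with cfg_scale <= 1 (or no negative conditioning) the negative
    # forward pass is skipped, saving one transformer call per step.
    if cfg_scale <= 1.0 or negative is None:
        return cond

    uncond = predict(negative)
    # Standard classifier-free guidance: move from uncond towards cond.
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
```

At cfg_scale == 1 the guidance formula reduces to the conditional prediction, so skipping the negative pass changes nothing in the output and only removes a forward pass. For values below 1 the gate is a behaviour choice in this sketch, not an identity.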