
WanTransformer3DModel.from_single_file won't load Wan2.2 GGUF (NotImplementedError: Cannot copy out of meta tensor; no data) #12009

@luke14free

Describe the bug

I am trying to load the Wan2.2 GGUF transformers, but unfortunately they yield a cryptic error when I try to use them:

repo_id = "QuantStack/Wan2.2-I2V-A14B-GGUF"
filename = "HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf" 

gguf_path = hf_hub_download(repo_id=repo_id, filename=filename)
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
    torch_dtype=dtype,
)

yields

Exception: Failed to load transformer from [...]/HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device

I tried two different repositories of GGUF files and several quantization levels. They all work in ComfyUI, but not in diffusers. The same approach works fine with other models such as Flux (using the Flux transformer, obviously).
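
One guess: since from_single_file has to infer the model config from the checkpoint, it may be resolving the wrong config for Wan2.2 and leaving unmatched keys on the meta device. Pinning the config explicitly might work around it. Untested sketch below; I am assuming "Wan-AI/Wan2.2-I2V-A14B-Diffusers" is the matching diffusers-format repo for these weights:

# Untested workaround sketch: pass the config explicitly so the loader
# does not have to guess it from the GGUF checkpoint.
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
    config="Wan-AI/Wan2.2-I2V-A14B-Diffusers",  # assumed matching repo
    subfolder="transformer",
    torch_dtype=dtype,
)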

Reproduction

#!/usr/bin/env python3
"""
Minimal reproduction of WanTransformer3DModel GGUF loading bug
Issue: Meta tensor error when loading quantized GGUF models
"""

import os
import torch
from diffusers import WanTransformer3DModel, GGUFQuantizationConfig
from huggingface_hub import hf_hub_download

def reproduce_gguf_loading_bug():
    """Minimal code to reproduce the GGUF loading error at line 144"""
    
    print("=== GGUF Loading Bug Reproduction ===")
    
    # Setup
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    dtype = torch.float16 if device.type == "cuda" else torch.float32
    
    print(f"Device: {device}")
    print(f"Dtype: {dtype}")
    print(f"PyTorch version: {torch.__version__}")
    
    # Download GGUF file
    repo_id = "QuantStack/Wan2.2-I2V-A14B-GGUF"
    filename = "HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf"  # Smallest for testing
    
    print(f"Downloading {filename}...")
    gguf_path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(f"Downloaded to: {gguf_path}")
    print(f"File size: {os.path.getsize(gguf_path) / 1024**3:.2f} GB")
    
    # This is where the error occurs
    print("Loading WanTransformer3DModel from GGUF...")
    try:
        transformer = WanTransformer3DModel.from_single_file(
            gguf_path,
            quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
            torch_dtype=dtype,
        )
        print("✅ Success!")
        return transformer
        
    except Exception as e:
        print(f"❌ ERROR: {type(e).__name__}: {e}")
        import traceback
        traceback.print_exc()
        return None

if __name__ == "__main__":
    reproduce_gguf_loading_bug()
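
If it helps triage: the tensor names stored in the GGUF file can be dumped with the gguf package (which diffusers uses under the hood) and compared against the model's expected state-dict keys; any expected key without a matching GGUF tensor would be left on the meta device. Rough sketch, assuming the gguf package is installed:

from gguf import GGUFReader

reader = GGUFReader(gguf_path)
# Print every tensor stored in the checkpoint; keys of WanTransformer3DModel
# that never line up with one of these names would stay on "meta".
for tensor in reader.tensors:
    print(tensor.name, tuple(tensor.shape), tensor.tensor_type)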

Logs

[t+3m47s791ms] [ERROR] Traceback (most recent call last):
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/src/inference.py", line 207, in setup
[t+3m47s791ms]     transformer_high_noise = WanTransformer3DModel.from_single_file(
[t+3m47s791ms]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
[t+3m47s791ms]     return fn(*args, **kwargs)
[t+3m47s791ms]            ^^^^^^^^^^^^^^^^^^^
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/loaders/single_file_model.py", line 458, in from_single_file
[t+3m47s791ms]     dispatch_model(model, **device_map_kwargs)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/accelerate/big_modeling.py", line 502, in dispatch_model
[t+3m47s792ms]     model.to(device)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/models/modeling_utils.py", line 1446, in to
[t+3m47s792ms]     return super().to(*args, **kwargs)
[t+3m47s792ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1355, in to
[t+3m47s792ms]     return self._apply(convert)
[t+3m47s792ms]            ^^^^^^^^^^^^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 942, in _apply
[t+3m47s792ms]     param_applied = fn(param)
[t+3m47s792ms]                     ^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1348, in convert
[t+3m47s792ms]     raise NotImplementedError(
[t+3m47s792ms] NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
[t+3m47s792ms] The above exception was the direct cause of the following exception:
[t+3m47s792ms] Traceback (most recent call last):
[t+3m47s792ms]   File "/server/tasks.py", line 24, in setup_task
[t+3m47s792ms]     await context.app.setup(metadata=context.metadata)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/src/inference.py", line 242, in setup
[t+3m47s792ms]     raise Exception(f"Failed to load transformer from {high_noise_path}: {str(e)}") from e
[t+3m47s792ms] Exception: Failed to load transformer from /inferencesh/cache/huggingface/hub/models--QuantStack--Wan2.2-I2V-A14B-GGUF/snapshots/e5388177f336555785e70804d6c0b2a315993c96/HighNoise/Wan2.2-I2V-A14B-HighNoise-Q5_1.gguf: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
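
For context, the final NotImplementedError is generic PyTorch behavior rather than anything Wan-specific: a parameter still on the meta device has no backing storage, so .to() has nothing to copy. A minimal standalone illustration, independent of diffusers:

import torch
from torch import nn

# Parameters created under the meta device have shapes but no data.
with torch.device("meta"):
    layer = nn.Linear(4, 4)

try:
    layer.to("cpu")  # fails: there is no data to copy out of a meta tensor
except NotImplementedError as e:
    print(e)  # "Cannot copy out of meta tensor; no data! ..."

# to_empty() allocates fresh (uninitialized) storage instead of copying,
# which is why the error message suggests it.
layer = layer.to_empty(device="cpu")

So in this case some transformer weights were presumably never materialized from the GGUF state dict before dispatch_model tried to move the model to the GPU.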

System Info

  • 🤗 Diffusers version: 0.33.1
  • Platform: Linux-5.15.0-136-generic-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.7.1+cu126 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.33.0
  • Transformers version: 4.52.4
  • Accelerate version: 1.8.1
  • PEFT version: 0.16.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@a-r-r-o-w @yiyixuxu
