
WanTransformer3DModel.from_single_file won't load Wan2.2 GGUF (NotImplementedError: Cannot copy out of meta tensor; no data) #12009

@luke14free

Describe the bug

I am trying to load the Wan2.2 GGUF transformers, but unfortunately they yield a cryptic error when I try to use them:

repo_id = "QuantStack/Wan2.2-I2V-A14B-GGUF"
filename = "HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf" 

gguf_path = hf_hub_download(repo_id=repo_id, filename=filename)
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
    torch_dtype=dtype,
)

yields

Exception: Failed to load transformer from [...]/HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device

I tried two different repositories of GGUF files and several quantization levels. They all work in ComfyUI, but not in diffusers. The same approach works fine with other models such as Flux (using the Flux transformer, obviously).
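
One guess: since from_single_file has to infer the model config from the checkpoint, it may be resolving the wrong config for Wan2.2 and leaving unmatched keys on the meta device. Pinning the config explicitly might work around it. Untested sketch below; I am assuming "Wan-AI/Wan2.2-I2V-A14B-Diffusers" is the matching diffusers-format repo for these weights:

# Untested workaround sketch: pass the config explicitly so the loader
# does not have to guess it from the GGUF checkpoint.
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
    config="Wan-AI/Wan2.2-I2V-A14B-Diffusers",  # assumed matching repo
    subfolder="transformer",
    torch_dtype=dtype,
)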

Reproduction

#!/usr/bin/env python3
"""
Minimal reproduction of WanTransformer3DModel GGUF loading bug
Issue: Meta tensor error when loading quantized GGUF models
"""

import os
import torch
from diffusers import WanTransformer3DModel, GGUFQuantizationConfig
from huggingface_hub import hf_hub_download

def reproduce_gguf_loading_bug():
    """Minimal code to reproduce the GGUF loading error at line 144"""
    
    print("=== GGUF Loading Bug Reproduction ===")
    
    # Setup
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    dtype = torch.float16 if device.type == "cuda" else torch.float32
    
    print(f"Device: {device}")
    print(f"Dtype: {dtype}")
    print(f"PyTorch version: {torch.__version__}")
    
    # Download GGUF file
    repo_id = "QuantStack/Wan2.2-I2V-A14B-GGUF"
    filename = "HighNoise/Wan2.2-I2V-A14B-HighNoise-Q2_K.gguf"  # Smallest for testing
    
    print(f"Downloading {filename}...")
    gguf_path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(f"Downloaded to: {gguf_path}")
    print(f"File size: {os.path.getsize(gguf_path) / 1024**3:.2f} GB")
    
    # This is where the error occurs
    print("Loading WanTransformer3DModel from GGUF...")
    try:
        transformer = WanTransformer3DModel.from_single_file(
            gguf_path,
            quantization_config=GGUFQuantizationConfig(compute_dtype=dtype),
            torch_dtype=dtype,
        )
        print("✅ Success!")
        return transformer
        
    except Exception as e:
        print(f"❌ ERROR: {type(e).__name__}: {e}")
        import traceback
        traceback.print_exc()
        return None

if __name__ == "__main__":
    reproduce_gguf_loading_bug()
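
If it helps triage: the tensor names stored in the GGUF file can be dumped with the gguf package (which diffusers uses under the hood) and compared against the model's expected state-dict keys; any expected key without a matching GGUF tensor would be left on the meta device. Rough sketch, assuming the gguf package is installed:

from gguf import GGUFReader

reader = GGUFReader(gguf_path)
# Print every tensor stored in the checkpoint; keys of WanTransformer3DModel
# that never line up with one of these names would stay on "meta".
for tensor in reader.tensors:
    print(tensor.name, tuple(tensor.shape), tensor.tensor_type)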

Logs

[t+3m47s791ms] [ERROR] Traceback (most recent call last):
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/src/inference.py", line 207, in setup
[t+3m47s791ms]     transformer_high_noise = WanTransformer3DModel.from_single_file(
[t+3m47s791ms]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
[t+3m47s791ms]     return fn(*args, **kwargs)
[t+3m47s791ms]            ^^^^^^^^^^^^^^^^^^^
[t+3m47s791ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/loaders/single_file_model.py", line 458, in from_single_file
[t+3m47s791ms]     dispatch_model(model, **device_map_kwargs)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/accelerate/big_modeling.py", line 502, in dispatch_model
[t+3m47s792ms]     model.to(device)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/models/modeling_utils.py", line 1446, in to
[t+3m47s792ms]     return super().to(*args, **kwargs)
[t+3m47s792ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1355, in to
[t+3m47s792ms]     return self._apply(convert)
[t+3m47s792ms]            ^^^^^^^^^^^^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 915, in _apply
[t+3m47s792ms]     module._apply(fn)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 942, in _apply
[t+3m47s792ms]     param_applied = fn(param)
[t+3m47s792ms]                     ^^^^^^^^^
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1348, in convert
[t+3m47s792ms]     raise NotImplementedError(
[t+3m47s792ms] NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
[t+3m47s792ms] The above exception was the direct cause of the following exception:
[t+3m47s792ms] Traceback (most recent call last):
[t+3m47s792ms]   File "/server/tasks.py", line 24, in setup_task
[t+3m47s792ms]     await context.app.setup(metadata=context.metadata)
[t+3m47s792ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/src/inference.py", line 242, in setup
[t+3m47s792ms]     raise Exception(f"Failed to load transformer from {high_noise_path}: {str(e)}") from e
[t+3m47s792ms] Exception: Failed to load transformer from /inferencesh/cache/huggingface/hub/models--QuantStack--Wan2.2-I2V-A14B-GGUF/snapshots/e5388177f336555785e70804d6c0b2a315993c96/HighNoise/Wan2.2-I2V-A14B-HighNoise-Q5_1.gguf: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
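
For context, the final NotImplementedError is generic PyTorch behavior rather than anything Wan-specific: a parameter still on the meta device has no backing storage, so .to() has nothing to copy. A minimal standalone illustration, independent of diffusers:

import torch
from torch import nn

# Parameters created under the meta device have shapes but no data.
with torch.device("meta"):
    layer = nn.Linear(4, 4)

try:
    layer.to("cpu")  # fails: there is no data to copy out of a meta tensor
except NotImplementedError as e:
    print(e)  # "Cannot copy out of meta tensor; no data! ..."

# to_empty() allocates fresh (uninitialized) storage instead of copying,
# which is why the error message suggests it.
layer = layer.to_empty(device="cpu")

So in this case some transformer weights were presumably never materialized from the GGUF state dict before dispatch_model tried to move the model to the GPU.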

System Info

  • 🤗 Diffusers version: 0.33.1
  • Platform: Linux-5.15.0-136-generic-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.7.1+cu126 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.33.0
  • Transformers version: 4.52.4
  • Accelerate version: 1.8.1
  • PEFT version: 0.16.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
    NVIDIA A100-SXM4-80GB, 81920 MiB
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@a-r-r-o-w @yiyixuxu
