Wan 2.2 a14b i2v OOM #12011

@okaris

Describe the bug

@a-r-r-o-w @DN6 @asomoza @jlonge4

Sorry to tag you all, but after @yiyixuxu's merge of the Wan2.2 PR, one of your commits is causing the model to OOM. I could only narrow it down to the range between the two commits below.

Wan2.2 PR, working

commit: #a6d9f6a1a9a9ede2c64972d83ccee192b801c4a0
Image

Latest main, OOM
commit: #56d438727036b0918b30bbe3110c5fe1634ed19d

Image
[t+1m49s004ms] [INFO] [Dispatcher] Broadcasting command: run
[t+1m49s005ms] [INFO] Starting app run
[t+1m49s005ms] Downloading URL: https://cloud.inference.sh/u/4mg21r6ta37mpaz6ktzwtt8krr/01jz02fhjefhmky16f1n5bnj9p.png to /tmp/tmphbupdgg3.png
[t+1m49s168ms]   0%|          | 0.00/786k [00:00<?, ?iB/s]
[t+1m49s168ms] 100%|██████████| 786k/786k [00:00<00:00, 45.5MiB/s]
[t+1m49s171ms] Generating video with prompt: god particle
[t+1m49s171ms] Using resolution: 720p, max area: 921600
[t+1m49s235ms] Loaded image: (1024, 576)
[t+1m49s247ms] Resized image from (1024, 576) to (1280, 720) (target area: 921600)
[t+1m49s247ms] Starting video generation...
[t+3m05s640ms]   0%|          | 0/40 [00:00<?, ?it/s]
[t+4m15s733ms]   2%|▎         | 1/40 [01:10<45:47, 70.44s/it]
[t+5m25s846ms]   5%|▌         | 2/40 [02:20<44:29, 70.24s/it]
[t+5m26s640ms]   8%|▊         | 3/40 [03:30<43:16, 70.18s/it]
[t+5m26s640ms]   8%|▊         | 3/40 [03:31<43:27, 70.48s/it]
[t+5m26s643ms] [ERROR] Traceback (most recent call last):
[t+5m26s643ms]   File "/server/tasks.py", line 50, in run_task
[t+5m26s643ms]     output = await result
[t+5m26s643ms]              ^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/src/inference.py", line 338, in run
[t+5m26s643ms]     output = self.pipe(
[t+5m26s643ms]              ^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[t+5m26s643ms]     return func(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/pipelines/wan/pipeline_wan_i2v.py", line 727, in __call__
[t+5m26s643ms]     noise_pred = current_model(
[t+5m26s643ms]                  ^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
[t+5m26s643ms]     return self._call_impl(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
[t+5m26s643ms]     return forward_call(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/models/transformers/transformer_wan.py", line 660, in forward
[t+5m26s643ms]     hidden_states = block(hidden_states, encoder_hidden_states, timestep_proj, rotary_emb)
[t+5m26s643ms]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
[t+5m26s643ms]     return self._call_impl(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
[t+5m26s643ms]     return forward_call(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/models/transformers/transformer_wan.py", line 485, in forward
[t+5m26s643ms]     norm_hidden_states = (self.norm3(hidden_states.float()) * (1 + c_scale_msa) + c_shift_msa).type_as(
[t+5m26s643ms]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
[t+5m26s643ms]     return self._call_impl(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
[t+5m26s643ms]     return forward_call(*args, **kwargs)
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/diffusers/models/normalization.py", line 88, in forward
[t+5m26s643ms]     return F.layer_norm(
[t+5m26s643ms]            ^^^^^^^^^^^^^
[t+5m26s643ms]   File "/inferencesh/apps/gpu/5zg3dm3ph35ntzewf4txeszag9/venv/3.12/lib/python3.12/site-packages/torch/nn/functional.py", line 2910, in layer_norm
[t+5m26s643ms]     return torch.layer_norm(
[t+5m26s643ms]            ^^^^^^^^^^^^^^^^^
[t+5m26s643ms] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.44 GiB. GPU 0 has a total capacity of 79.14 GiB of which 1.16 GiB is free. Process 981657 has 77.96 GiB memory in use. Of the allocated memory 72.20 GiB is allocated by PyTorch, and 5.23 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
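
For anyone triaging, the figures in the error message can be pulled apart to see how much of the 80 GB card was in live use versus held by the caching allocator. The snippet below is a minimal sketch, not part of the original report: `parse_oom` and its regex patterns are hypothetical helpers written for this message format, and the message text is copied from the traceback above.

```python
import re

# Memory summary copied from the traceback above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.44 GiB. "
    "GPU 0 has a total capacity of 79.14 GiB of which 1.16 GiB is free. "
    "Process 981657 has 77.96 GiB memory in use. "
    "Of the allocated memory 72.20 GiB is allocated by PyTorch, "
    "and 5.23 GiB is reserved by PyTorch but unallocated."
)

# Hypothetical patterns for this specific message wording.
_PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) GiB",
    "capacity": r"total capacity of ([\d.]+) GiB",
    "free": r"of which ([\d.]+) GiB is free",
    "allocated": r"([\d.]+) GiB is allocated by PyTorch",
    "reserved_unallocated": r"([\d.]+) GiB is reserved by PyTorch but unallocated",
}


def parse_oom(message):
    """Return the GiB figures found in a PyTorch CUDA OOM message."""
    stats = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            stats[name] = float(match.group(1))
    return stats


stats = parse_oom(OOM_MESSAGE)

# Rough upper bound on memory that could be handed out without freeing tensors:
# driver-level free memory plus blocks the allocator holds but is not using.
headroom = stats["free"] + stats["reserved_unallocated"]
fits_if_defragmented = stats["requested"] <= headroom

print(stats)
print(f"headroom ~ {headroom:.2f} GiB, request fits if defragmented: {fits_if_defragmented}")
```

On these numbers the 1.44 GiB request is smaller than the combined free and reserved-but-unallocated memory, so the `expandable_segments:True` hint in the message may be worth trying. With 72.20 GiB already allocated out of 79.14 GiB, though, the run is close to capacity either way, so this reading is an interpretation and not a diagnosis.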

Reproduction

diffusers@56d438727036b0918b30bbe3110c5fe1634ed19d

Run Wan2.2 I2V A14B with the example code.
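
To narrow the regression further than the two commits listed in this issue, the commits between them can be bisected. The sketch below shows the idea under stated assumptions: `first_bad_commit` and `pip_spec` are hypothetical helpers, the intermediate SHAs are placeholders (the real list would come from `git log` between the two commits), and `is_bad` stands in for "install that commit, run the reproduction, and check for OOM". The repository URL is assumed to be the main diffusers repository.

```python
GOOD = "a6d9f6a1a9a9ede2c64972d83ccee192b801c4a0"  # Wan2.2 PR, working (from this issue)
BAD = "56d438727036b0918b30bbe3110c5fe1634ed19d"   # latest main, OOM (from this issue)


def pip_spec(sha):
    """Build a pip requirement that installs diffusers at a given commit.

    Assumes the upstream repository URL; adjust if testing a fork.
    """
    return f"git+https://github.com/huggingface/diffusers@{sha}"


def first_bad_commit(commits, is_bad):
    """Binary-search for the first commit where is_bad(sha) is True.

    commits must be ordered oldest to newest, with commits[0] known good
    and commits[-1] known bad. is_bad is called O(log n) times.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Placeholder history: "c1".."c3" are NOT real commits, only stand-ins.
history = [GOOD, "c1", "c2", "c3", BAD]

# Stand-in predicate: pretend the regression landed in "c2".
known_bad = {"c2", "c3", BAD}
culprit = first_bad_commit(history, lambda sha: sha in known_bad)

print("install with: pip install", pip_spec(culprit))
```

In practice `git bisect` does the same search directly on a local clone; the Python version is only meant to make the good/bad invariant explicit when each test run takes several minutes on a GPU.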

Logs

System Info

A100/H100

Who can help?

No response
