[wan2.2] fix vae patches #12041

Merged: 1 commit, Aug 1, 2025
Conversation

yiyixuxu
Collaborator

@yiyixuxu yiyixuxu commented Aug 1, 2025

Fixes #12034.

import torch
from diffusers import WanImageToVideoPipeline, AutoencoderKLWan, ModularPipeline
from diffusers.utils import export_to_video


model_id = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"
dtype = torch.bfloat16
device = "cuda:1"

vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, torch_dtype=dtype)
pipe.to(device)

# Use the default Wan image processor to resize and crop the input image.
image_processor = ModularPipeline.from_pretrained("YiYiXu/WanImageProcessor", trust_remote_code=True)
image = image_processor(
    image="https://cloud.inference.sh/u/4mg21r6ta37mpaz6ktzwtt8krr/01k1g7k73eebnrmzmc6h0bghq6.png",
    max_area=1280 * 704,
    output="processed_image",
)

height, width = image.height, image.width
print(f"height: {height}, width: {width}")
num_frames = 49
num_inference_steps = 38
guidance_scale = 5.0

prompt = "morpheus from the matrix offering the choice, include morpheus, on one hand it says \"local\" on the other it says \"cloud\""

negative_prompt = "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
generator = torch.Generator(device=device).manual_seed(42)
output = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=height,
    width=width,
    num_frames=num_frames,
    guidance_scale=guidance_scale,
    num_inference_steps=num_inference_steps,
    generator=generator,
    #latents=latents,
).frames[0]
export_to_video(output, "yiyi_test_6_2_output.mp4", fps=24)
Output video: yiyi_test_6_2_output.mp4
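
For readers who want to see what "resize and crop to `max_area`" can mean for the `height` and `width` passed to the pipeline, here is a rough sketch of the arithmetic. It is not the code of `YiYiXu/WanImageProcessor`, whose behavior may differ. The function name `fit_to_max_area` and the `mod_value=32` default are hypothetical, chosen for illustration only; the real alignment requirement depends on the VAE and transformer configuration.

```python
import math


def fit_to_max_area(width: int, height: int, max_area: int, mod_value: int = 32) -> tuple[int, int]:
    """Return (height, width) that keep the aspect ratio, fit in max_area,
    and are multiples of mod_value.

    mod_value=32 is an assumed placeholder, not a value taken from the PR.
    """
    # Integer arithmetic avoids floating-point rounding at exact boundaries.
    target_h = math.isqrt(max_area * height // width)
    target_w = math.isqrt(max_area * width // height)
    # Round each side down to the nearest multiple of mod_value.
    return target_h // mod_value * mod_value, target_w // mod_value * mod_value


if __name__ == "__main__":
    h, w = fit_to_max_area(width=1024, height=1024, max_area=1280 * 704)
    print(f"height: {h}, width: {w}")
```

Rounding down (rather than to nearest) keeps `height * width` at or below `max_area`, at the cost of a slightly smaller frame.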
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu merged commit 58d2b10 into main Aug 1, 2025
14 of 15 checks passed
@yiyixuxu yiyixuxu deleted the fix-wan-vae branch August 1, 2025 09:43