
cpu_offload VRAM consumption larger than 4 GB #1934

Closed
@Sanster

Description


Describe the bug

I am using the code from https://huggingface.co/docs/diffusers/optimization/fp16#offloading-to-cpu-with-accelerate-for-memory-savings to test cpu_offload, but the VRAM consumption is still larger than 4 GB:

| GPU       | cpu_offload enabled | VRAM cost |
|-----------|---------------------|-----------|
| 1080      | Yes                 | 4539 MB   |
| 1080      | No                  | 5101 MB   |
| TITAN RTX | Yes                 | 5134 MB   |
| TITAN RTX | No                  | 5668 MB   |
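For reference, a minimal sketch of how peak usage can be checked from inside the script (the table above was presumably read from nvidia-smi; `torch.cuda.max_memory_allocated` reports only PyTorch's own allocations, so it will come out somewhat lower than nvidia-smi, which also counts the CUDA context):

```python
import torch

# Hypothetical measurement helper, not part of the original report:
# prints PyTorch's peak allocation on the current device in MB.
def report_peak_vram(tag: str) -> None:
    peak_mb = torch.cuda.max_memory_allocated() / 1024**2
    print(f"{tag}: peak allocated = {peak_mb:.0f} MB")
```

Calling `torch.cuda.reset_peak_memory_stats()` before each pipeline run makes the offload vs. no-offload comparison cleaner.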

Reproduction

I am using the code from https://huggingface.co/docs/diffusers/optimization/fp16#offloading-to-cpu-with-accelerate-for-memory-savings

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_sequential_cpu_offload()
image = pipe(prompt).images[0]
```
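For comparison, a minimal variant that lets the offload hooks manage device placement themselves (assuming, as the diffusers documentation was later updated to recommend, that the pipeline should not be moved to CUDA before enabling sequential offload, since `pipe.to("cuda")` first loads every weight onto the GPU):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
# Note: no pipe.to("cuda") here. enable_sequential_cpu_offload() keeps the
# weights on the CPU and moves each submodule to the GPU only while it runs.
pipe.enable_sequential_cpu_offload()

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
```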

Logs

No response

System Info

Tested on a 1080 and a TITAN RTX.

  • diffusers version: 0.11.1
  • accelerate version: 0.15.0
  • Platform: Linux-4.15.0-142-generic-x86_64-with-glibc2.29
  • Python version: 3.8.10
  • PyTorch version (GPU?): 1.10.1+cu111 (True)
  • Huggingface_hub version: 0.11.1
  • Transformers version: 4.25.1
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No


Labels

bug (Something isn't working), stale (Issues that haven't received updates)
