Closed
Description
Describe the bug
This .to() cast on the text encoder:
diffusers/src/diffusers/loaders/lora_base.py
Line 421 in 9836f0e
is invalid when working with an SD1.5 / SDXL pipeline whose text encoder uses a bitsandbytes quantization config.
Perhaps something like this would fix it:
if is_bitsandbytes_available():
    quant, is_4bit, _ = _check_bnb_status(text_encoder)
else:
    quant, is_4bit = False, False

if not quant:
    text_encoder.to(device=text_encoder.device, dtype=text_encoder.dtype)
elif is_4bit:
    text_encoder.to(device=text_encoder.device)
This problem does not seem to affect Flux / SD3, so I am not sure whether it affects other pipelines.
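The guard above can be sketched in pure Python with a dummy encoder that mimics transformers' behavior (bnb-quantized models reject dtype casts). `DummyEncoder` and `safe_to` are hypothetical stand-ins for illustration only, not diffusers or transformers APIs:

```python
class DummyEncoder:
    """Hypothetical stand-in for a text encoder; records .to() calls."""

    def __init__(self, quantized=False, four_bit=False):
        self.quantized = quantized
        self.four_bit = four_bit
        self.calls = []

    def to(self, **kwargs):
        # transformers raises a ValueError like this for bnb-quantized models
        if self.quantized and "dtype" in kwargs:
            raise ValueError("You cannot cast a bitsandbytes model in a new `dtype`.")
        self.calls.append(kwargs)
        return self


def safe_to(encoder, device, dtype):
    """Apply the suggested guard: skip dtype casts for quantized encoders."""
    if not encoder.quantized:
        return encoder.to(device=device, dtype=dtype)  # full cast is safe
    if encoder.four_bit:
        return encoder.to(device=device)  # 4-bit: device move only
    return encoder  # 8-bit: leave as loaded


safe_to(DummyEncoder(), "cuda", "float16")                               # full cast
safe_to(DummyEncoder(quantized=True, four_bit=True), "cuda", "float16")  # device only
safe_to(DummyEncoder(quantized=True), "cuda", "float16")                 # no-op
```

With the guard, none of the three cases raises, whereas calling `.to(dtype=...)` directly on the quantized dummy reproduces the error.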
Reproduction
import torch
import transformers
import diffusers
import diffusers.quantizers.quantization_config as _qc

text_encoder = transformers.CLIPTextModel.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0', subfolder='text_encoder', variant='fp16',
    torch_dtype=torch.float16, quantization_config=_qc.BitsAndBytesConfig(load_in_8bit=True))

pipeline = diffusers.StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    variant='fp16',
    torch_dtype=torch.float16,
    text_encoder=text_encoder,
)

pipeline.load_lora_weights('Norod78/sdxl-emoji-lora')
pipeline.to('cuda')
pipeline(prompt='test')
Logs
REDACT\diffusers\venv\Scripts\python.exe REDACT\diffusers\test.py
WARNING:torchao.kernel.intmm:Warning: Detected no triton, on systems without Triton certain kernels will not work
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.86it/s]
Traceback (most recent call last):
  File "REDACT\diffusers\test.py", line 18, in <module>
    pipeline.load_lora_weights('Norod78/sdxl-emoji-lora')
  File "REDACT\diffusers\src\diffusers\loaders\lora_pipeline.py", line 657, in load_lora_weights
    self.load_lora_into_text_encoder(
  File "REDACT\diffusers\src\diffusers\loaders\lora_pipeline.py", line 894, in load_lora_into_text_encoder
    _load_lora_into_text_encoder(
  File "REDACT\diffusers\src\diffusers\loaders\lora_base.py", line 430, in _load_lora_into_text_encoder
    text_encoder.to(device=text_encoder.device, dtype=text_encoder.dtype)
  File "REDACT\diffusers\venv\Lib\site-packages\transformers\modeling_utils.py", line 3089, in to
    raise ValueError(
ValueError: You cannot cast a bitsandbytes model in a new `dtype`. Make sure to load the model using `from_pretrained` using the desired `dtype` by passing the correct `torch_dtype` argument.
Process finished with exit code 1
System Info
diffusers == 0.34.0.dev0