Allow passing different prompts to each `text_encoder` on `stable_diffusion_xl` pipelines #4156

apolinario · 2023-07-19T13:42:32Z

Fixes #4004 (issue)

This is a draft PR with an approach for passing `prompt_2` and `negative_prompt_2` to SDXL. If this approach sounds good, I will apply it to all SDXL pipelines and add the necessary documentation and tests.

Who can review?

@bghira @sayakpaul

bghira

looks like how i was going to do that. thank you for tackling that one.

patrickvonplaten

Design looks great, just added lots of comments as I think it's slightly more readable to just add prompts to zip instead of starting to index.
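The readability point about `zip` can be illustrated with a small sketch. Everything here (`make_encoder`, `encoders`, `encode_prompts`) is a hypothetical stand-in, not the diffusers implementation; it only shows pairing each prompt with its encoder through `zip` rather than indexing into parallel lists.

```python
from typing import Callable, List, Optional


def make_encoder(tag: str) -> Callable[[str], str]:
    """Hypothetical stand-in for a tokenizer + text-encoder pair."""
    return lambda prompt: f"{tag}({prompt})"


# Two toy encoders, mirroring the two text encoders of SDXL.
encoders = [make_encoder("enc_1"), make_encoder("enc_2")]


def encode_prompts(prompt: str, prompt_2: Optional[str] = None) -> List[str]:
    # Fall back to the first prompt when no second prompt is given.
    prompt_2 = prompt_2 or prompt
    prompts = [prompt, prompt_2]
    # zip pairs each prompt with its encoder, so no index bookkeeping is needed.
    return [encoder(p) for p, encoder in zip(prompts, encoders)]


same = encode_prompts("a cat")
different = encode_prompts("a cat", "oil painting")
```

With indexing, the loop body would need `prompts[i]` and `encoders[i]` kept in sync by hand; `zip` makes the pairing explicit in one place.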

If you run `make fix-copies`, all your changes should also be applied to the other SD-XL pipelines; then we would only have to add the names to the call function and some docstrings.

Finally could you also add a test here:

diffusers/tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl.py

Line 225 in 6b1abba

which can be very similar to this one:

diffusers/tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl.py

Line 198 in 6b1abba

def test_stable_diffusion_xl_offloads(self):

But instead of testing different offloads, we test that:
a) providing the same prompt to "prompt" and "prompt_2" and "negative_prompt" and "negative_prompt_2" gives the same results as just providing one prompt
b) that providing different prompts to each gives different results than just providing one
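As a rough sketch of the two checks (a) and (b), the contract can be written against any callable that accepts the four prompt arguments. `toy_pipe` below is a hypothetical deterministic stand-in for a pipeline call, and `check_multi_prompt_contract` is an illustrative name; a real test would presumably compare image slices from the actual pipeline with a numeric tolerance instead of exact equality.

```python
import hashlib
from typing import List, Optional


def toy_pipe(
    prompt: str,
    prompt_2: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    negative_prompt_2: Optional[str] = None,
) -> List[int]:
    """Hypothetical stand-in: maps the four prompts to a deterministic output."""
    # Second prompts default to the first ones, as described in the PR.
    prompt_2 = prompt_2 or prompt
    negative_prompt = negative_prompt or ""
    negative_prompt_2 = negative_prompt_2 or negative_prompt
    key = "|".join([prompt, prompt_2, negative_prompt, negative_prompt_2])
    return list(hashlib.sha256(key.encode()).digest()[:8])


def check_multi_prompt_contract(pipe) -> None:
    text, negative = "a painting of a squirrel", "blurry"
    baseline = pipe(text, negative_prompt=negative)

    # (a) Repeating the same prompts explicitly must not change the output.
    duplicated = pipe(
        text, prompt_2=text, negative_prompt=negative, negative_prompt_2=negative
    )
    assert baseline == duplicated

    # (b) Different second prompts must change the output.
    different = pipe(
        text,
        prompt_2="a watercolor sketch",
        negative_prompt=negative,
        negative_prompt_2="low quality",
    )
    assert baseline != different


check_multi_prompt_contract(toy_pipe)
```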

If you want we could also add a quick bullet point to the tips section here:

diffusers/docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_xl.mdx

Line 24 in 6b1abba

    
           - Stable Diffusion XL output image can be improved by making use of a refiner as shown below.

saying that one can provide two prompts

src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

sayakpaul · 2023-07-20T03:25:17Z

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_xl.mdx

+
+### Passing different prompts to each text-encoder
+
+Stable Diffusion XL as trained on two text-encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) alledge it can boost quality.


Suggested change

Stable Diffusion XL as trained on two text-encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) alledge it can boost quality.

Stable Diffusion XL was trained on two text encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) noted that it can boost quality.
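A minimal usage sketch of what this doc section describes, assuming the `prompt_2` and `negative_prompt_2` argument names introduced by this PR. The wrapper `generate_with_two_prompts`, the default checkpoint id, and the `device` default are assumptions for illustration only; the function is defined but not called here, since running it requires the model weights and a suitable GPU.

```python
from typing import Optional


def generate_with_two_prompts(
    prompt: str,
    prompt_2: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    negative_prompt_2: Optional[str] = None,
    model_id: str = "stabilityai/stable-diffusion-xl-base-1.0",  # assumed checkpoint
    device: str = "cuda",  # assumed device
):
    # Heavy imports stay local so this module can be imported without them.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    )
    pipe = pipe.to(device)

    # `prompt` goes to the first text encoder, `prompt_2` to the second;
    # leaving `prompt_2` as None reuses `prompt` for both.
    result = pipe(
        prompt=prompt,
        prompt_2=prompt_2,
        negative_prompt=negative_prompt,
        negative_prompt_2=negative_prompt_2,
    )
    return result.images[0]
```

For example, `generate_with_two_prompts("an astronaut riding a horse", prompt_2="in the style of an oil painting")` would send a different prompt to each text encoder.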

sayakpaul · 2023-07-20T03:28:27Z

src/diffusers/pipelines/alt_diffusion/pipeline_alt_diffusion.py

+            "The decode_latents method is deprecated and will be removed in a future version. Please"
+            " use VaeImageProcessor instead",


Could this be caused by a different version of black installed on the local machine? This should not have happened.

This didn't happen with black but rather on make fix-copies 🤔

I fixed it now

sayakpaul · 2023-07-20T03:28:33Z

src/diffusers/pipelines/alt_diffusion/pipeline_alt_diffusion_img2img.py

+            "The decode_latents method is deprecated and will be removed in a future version. Please"
+            " use VaeImageProcessor instead",


sayakpaul · 2023-07-20T03:35:10Z

Looking good to me. I agree with Patrick's suggestions, especially:

- Ensuring we add "# Copied from ..." in the derivative pipelines (inpainting, ControlNet, etc.) for `encode_prompt()`. I recommend fixing `encode_prompt()` only in the main SDXL pipeline code. Then if we run `make fix-copies`, everything should be taken care of. We're probably already doing this but just wanted to double check.
- Let's incorporate the sanity test proposed by Patrick and ensure the ones you added are passing.

Let's ship this :)

patrickvonplaten · 2023-07-20T15:58:35Z

Let me know if you want a final review @apolinario :-)

apolinario · 2023-07-20T17:18:38Z

Now yes :) @patrickvonplaten

Edit: tests are failing for the image2image and inpainting pipelines. One of my tests should fail if the image is the same with or without a `prompt_2`. For text2image it passes, but for inpainting and img2img it fails. However, perceptually, outside of the test setup, it works. I wonder how to make the mask/image2image tests generate enough difference.

here's the perceptual test:

(inpainting using prompt_2)

(inpainting without using prompt_2)

…fusion_xl` pipelines (huggingface#4156)

* sdxl prompt2
* Improve checks
* doc linting
* whoops
* remove cat
* Apply suggestions from code review (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Add other pipelines and tests
* Add multi-prompting to docs
* doc and copies check
* Fix copied froms
* Apply suggestions from code review (Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>)
* Bring back the original code for unrelated files
* Fix tests
* Fix img2img
* Fix all
* fix

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

multimodalart added 2 commits July 19, 2023 12:29

sdxl prompt2

b8eba79

Improve checks

06a6078

apolinario changed the title ~~Allow passing different prompts to each text_encoder on Stable Diffusion XL~~ Jul 19, 2023

multimodalart and others added 3 commits July 19, 2023 14:58

doc linting

4ed4dcf

whoops

6d92ad7

remove cat

4a28dfd

bghira approved these changes Jul 19, 2023

patrickvonplaten reviewed Jul 19, 2023

apolinario and others added 4 commits July 19, 2023 22:23

Apply suggestions from code review

bee44cc

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Add other pipelines and tests

13de8a9

Add multi-prompting to docs

d2b7aa1

doc and copies check

decc59b