Allow passing different prompts to each text_encoder on stable_diffusion_xl pipelines #4156

Merged: 18 commits, Jul 21, 2023

Conversation

apolinario (Collaborator) commented Jul 19, 2023

Fixes #4004 (issue)

This is a draft PR with an approach that allows passing prompt_2 and negative_prompt_2 to SDXL. If this approach sounds good, I will then apply it to all SDXL pipelines and add the necessary documentation and tests.

Who can review?

@bghira @sayakpaul

@apolinario apolinario changed the title Allow passing different prompts to each text_encoder on Stable Diffusion XL Jul 19, 2023
multimodalart and others added 3 commits July 19, 2023 14:58
bghira (Contributor) left a comment

looks like how i was going to do that. thank you for tackling that one.

patrickvonplaten (Contributor) left a comment

Design looks great, just added lots of comments as I think it's slightly more readable to just add prompts to zip instead of starting to index.
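
A minimal sketch of the "zip instead of index" idea, for illustration only. This is not the diffusers implementation: the tokenizers and text encoders below are hypothetical stand-ins (plain functions on strings and lists), and the name encode_prompts is chosen here for the example. It shows each prompt being paired with its own tokenizer and encoder in one loop, with prompt_2 falling back to prompt when it is not given.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in types: a tokenizer maps text to ids,
# a text encoder maps ids to a small embedding.
Tokenizer = Callable[[str], List[int]]
Encoder = Callable[[List[int]], List[float]]


def encode_prompts(
    prompt: str,
    prompt_2: Optional[str],
    tokenizers: Sequence[Tokenizer],
    text_encoders: Sequence[Encoder],
) -> List[float]:
    # Default behavior: the second encoder sees the same prompt as the first.
    prompt_2 = prompt_2 or prompt
    prompts = [prompt, prompt_2]

    embeds: List[float] = []
    # zip pairs every prompt with its own tokenizer and encoder,
    # so no positional indexing is needed inside the loop.
    for text, tokenizer, text_encoder in zip(prompts, tokenizers, text_encoders):
        embeds.extend(text_encoder(tokenizer(text)))
    return embeds


# Toy components, only here to make the sketch runnable.
def tok_chars(text: str) -> List[int]:
    return [ord(c) for c in text]


def tok_words(text: str) -> List[int]:
    return [len(word) for word in text.split()]


def enc_sum(ids: List[int]) -> List[float]:
    return [float(sum(ids))]


def enc_stats(ids: List[int]) -> List[float]:
    return [float(len(ids)), float(max(ids, default=0))]


TOKENIZERS = [tok_chars, tok_words]
TEXT_ENCODERS = [enc_sum, enc_stats]

shared = encode_prompts("a cat", None, TOKENIZERS, TEXT_ENCODERS)
split = encode_prompts("a cat", "oil painting, moody", TOKENIZERS, TEXT_ENCODERS)
```

In this toy setup, only the part of the result produced by the second encoder changes when prompt_2 is set; the first encoder's output is untouched.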

If you run `make fix-copies`, all your changes should also be applied to the other SD-XL pipelines, and then we would only have to add the names to the call function and some docstrings.

Finally could you also add a test here:

which can be very similar to this one:

def test_stable_diffusion_xl_offloads(self):

But instead of testing different offloads, we test that:
a) providing the same prompt to "prompt" and "prompt_2" and "negative_prompt" and "negative_prompt_2" gives the same results as just providing one prompt
b) that providing different prompts to each gives different results than just providing one
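
The two checks above could be sketched roughly as follows. This is an assumption-laden illustration, not the test that was added to the repository: fake_pipeline is a hypothetical stand-in that hashes its prompt arguments into a few deterministic "pixel" values, and the 1e-4 tolerance is chosen for the example. A real test would call the dummy SDXL pipeline from the test suite instead.

```python
import hashlib
from typing import List, Optional


def fake_pipeline(
    prompt: str,
    prompt_2: Optional[str] = None,
    negative_prompt: str = "",
    negative_prompt_2: Optional[str] = None,
) -> List[float]:
    """Hypothetical stand-in for a pipeline call; returns 9 deterministic values."""
    # Mirror the intended defaults: the second prompt falls back to the first.
    prompt_2 = prompt if prompt_2 is None else prompt_2
    negative_prompt_2 = negative_prompt if negative_prompt_2 is None else negative_prompt_2
    key = "\x00".join([prompt, prompt_2, negative_prompt, negative_prompt_2])
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:9]]


def max_abs_diff(a: List[float], b: List[float]) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def check_multi_prompts(pipe, tol: float = 1e-4) -> bool:
    prompt, negative = "a photo of a cat", "blurry"
    base = pipe(prompt=prompt, negative_prompt=negative)

    # a) Repeating the same text in the *_2 arguments must not change the output.
    same = pipe(
        prompt=prompt,
        prompt_2=prompt,
        negative_prompt=negative,
        negative_prompt_2=negative,
    )
    assert max_abs_diff(base, same) < tol

    # b) A different prompt_2 must change the output.
    other = pipe(prompt=prompt, prompt_2="an oil painting", negative_prompt=negative)
    assert max_abs_diff(base, other) > tol

    # b) The same holds for a different negative_prompt_2.
    other_neg = pipe(prompt=prompt, negative_prompt=negative, negative_prompt_2="low contrast")
    assert max_abs_diff(base, other_neg) > tol
    return True


result = check_multi_prompts(fake_pipeline)
```

Comparing a maximum absolute difference against a tolerance, rather than testing exact equality, leaves room for small numerical noise in a real pipeline.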

If you want we could also add a quick bullet point to the tips section here:

- Stable Diffusion XL output image can be improved by making use of a refiner as shown below.
saying that one can provide two prompts


### Passing different prompts to each text-encoder

Stable Diffusion XL as trained on two text-encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) alledge it can boost quality.
Review comment (Member):

Suggested change
- Stable Diffusion XL as trained on two text-encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) alledge it can boost quality.
+ Stable Diffusion XL was trained on two text encoders. The default behavior is to pass the same prompt to each. But it is possible to pass a different prompt for each text-encoder, as [some users](https://github.com/huggingface/diffusers/issues/4004#issuecomment-1627764201) noted that it can boost quality.
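
A hedged usage sketch of what the documented behavior might look like from the caller's side. The prompt_2 and negative_prompt_2 argument names come from this PR; the helper names sdxl_prompt_kwargs and generate, the model id, the float16 dtype, and the "cuda" device are assumptions made for the example. The heavy imports are kept inside generate so that the helper can be used without torch or diffusers installed.

```python
from typing import Dict, Optional


def sdxl_prompt_kwargs(
    prompt: str,
    prompt_2: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    negative_prompt_2: Optional[str] = None,
) -> Dict[str, str]:
    """Collect only the prompt arguments that were actually set (hypothetical helper)."""
    candidates = {
        "prompt": prompt,
        "prompt_2": prompt_2,
        "negative_prompt": negative_prompt,
        "negative_prompt_2": negative_prompt_2,
    }
    return {name: text for name, text in candidates.items() if text is not None}


def generate(kwargs: Dict[str, str], model_id: str = "stabilityai/stable-diffusion-xl-base-1.0"):
    """Run an SDXL text-to-image call; needs torch, diffusers, and a GPU (assumed setup)."""
    # Imported lazily so the rest of the sketch works without these packages.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    # prompt goes to the first text encoder, prompt_2 to the second.
    return pipe(**kwargs).images[0]


# Same prompt for both encoders (the default behavior).
single = sdxl_prompt_kwargs("an astronaut riding a horse")

# A different prompt for each text encoder.
dual = sdxl_prompt_kwargs(
    "an astronaut riding a horse",
    prompt_2="oil painting, dramatic lighting",
    negative_prompt="blurry",
    negative_prompt_2="low contrast",
)
```

Calling `generate(dual)` would then run the pipeline with a separate prompt per text encoder, while `generate(single)` leaves prompt_2 unset so the pipeline reuses prompt for both.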
Comment on lines 427 to 428
"The decode_latents method is deprecated and will be removed in a future version. Please"
" use VaeImageProcessor instead",
Review comment (Member):

Could this be happening for a different version of black installed on the local machine? This should not have happened.

Reply (Collaborator, Author):

This didn't happen with black but rather on make fix-copies 🤔

Reply (Collaborator, Author):

I fixed it now

Comment on lines 425 to 426
"The decode_latents method is deprecated and will be removed in a future version. Please"
" use VaeImageProcessor instead",
Review comment (Member):

Same.

sayakpaul (Member) commented

Looking good to me. I agree with Patrick's suggestions, especially:

  • Ensuring we add "# Copied from ..." in the derivative pipelines (inpainting, ControlNet, etc.) for encode_prompt(). I recommend fixing encode_prompt() only in the main SDXL pipeline code. Then if we run make fix-copies, everything should be taken care of. We're probably already doing this but just wanted to double check.
  • Let's incorporate the sanity test proposed by Patrick and ensure the ones you added are passing.

Let's ship this :)

patrickvonplaten (Contributor) commented

Let me know if you want a final review @apolinario :-)

@apolinario apolinario marked this pull request as ready for review July 20, 2023 17:17
apolinario (Collaborator, Author) commented Jul 20, 2023

Now yes :) @patrickvonplaten

Edit: tests are failing for the img2img and inpainting pipelines. One of my tests should fail if the image is the same with or without a prompt_2. For text2image it passes, but for inpainting and img2img it fails. Perceptually, though, it works outside the test setup. I wonder how to make the inpainting/img2img tests generate enough of a difference.

Here's the perceptual test:
[image: inpainting using prompt_2]
[image: inpainting without using prompt_2]

@patrickvonplaten patrickvonplaten merged commit aed30df into main Jul 21, 2023
@patrickvonplaten patrickvonplaten deleted the diff_prompts_to_encoders branch July 21, 2023 12:57
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
…fusion_xl` pipelines (huggingface#4156)

* sdxl prompt2

* Improve checks

* doc linting

* whoops

* remove cat

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add other pipelines and tests

* Add multi-prompting to docs

* doc and copies check

* Fix copied froms

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Bring back the original code for unrelated files

* Fix tests

* Fix img2img

* Fix all

* fix

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…fusion_xl` pipelines (huggingface#4156)

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…fusion_xl` pipelines (huggingface#4156)
