Feature/qwen eligen support #10473
|
Hey @nolan4, thank you so much for this amazing work! I am trying to test this out but am not able to run it. I am using this PR with the EliGen LoRA provided by DiffSynth Studio and this workflow; please check. |
|
Hey @nolan4, could you please reply? |
|
Hi @krigeta — here’s a screenshot of my workflow, which is based on the Qwen text-to-image template. I’m also using the same EliGen LoRA from Diffsynth Studio that you linked. Hope this helps you get it running! |
|
Hey @nolan4, thank you so much for this. If this branch is not merged, would it be possible to turn it into a custom node? And yes, I will test this and share the results for sure. |
|
Hey @nolan4, it is not working in my case; please check. |
|
Tested locally and it works; comfy will do a code review to see if anything else needs to be changed! |
|
Hey @nolan4, I guess this implementation is missing the colour-coded masks that help the lora to differentiate between the regions when they overlap. Please look into it. |
|
btw something similar was implemented in the Inspire Pack called "regional conditioning by color masks" in case you need inspiration or code |
comfy-cli --workspace ./ComfyUI_eligen install --pr "#10473"
comfy-cli --workspace ./ComfyUI_eligen launch |
|
Hey @geroldmeisinger, thank you so much for sharing. Is there any other social media where we can chat? |
|
here are the original masks https://www.modelscope.cn/datasets/DiffSynth-Studio/examples_in_diffsynth/files |
These are the masks through which I got to know EliGen. In my case, overlapped masking is not working properly: I want to create two characters in front of each other, viewed from the back. Alternatively, could I use one entity per character together with ControlNet? Does that work? And if possible, could you share your Discord or other social app?
Kosinkadink
left a comment
Apologies for the delay, I got comfy's review:
- The eligen_attention_mask is already passed as a parameter into the block as the attention_mask. Is the use of transformer_options to store that necessary? It seems redundant. If that is the case, could you remove it from being inserted into the transformer_options? Let us know if there is something that prevents this from working!
- For logging, our pattern is to import logging and then just do logging.info(...), logging.debug(...) instead of initializing a logger. Could you change the code here to do that too? You can take a look at other ComfyUI code for clarity.
- Could you restore the removed whitespace from model_base.py?
I opened up comments in the relevant parts of code!
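For reference, the logging pattern requested in the review could look roughly like this. It is a minimal sketch only: the function name and log messages are hypothetical and not taken from the PR; the point is calling the logging module directly rather than creating a logger object.

```python
import logging


def apply_entity_masks(num_entities: int) -> int:
    """Hypothetical helper; only the logging calls illustrate the pattern."""
    # Call the logging module directly instead of initializing
    # a module-level logger such as logging.getLogger(__name__).
    logging.info("EliGen: applying %d entity masks", num_entities)
    logging.debug("EliGen: mask preparation finished")
    return num_entities
```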
03976ec to
9792606
Compare
|
Latest version supports 8 entities! Below is a screenshot for the referenced DiffSynth example.
|
|
awesome! |
|
There will be a native growing input type soon. Custom JavaScript for this will not be allowed in core outside of a 'general' implementation! The native type will allow an arbitrary number of inputs into the node. |
















Pull Request: Add Entity-Level Image Generation (EliGen) for Qwen Image
Summary
This update implements Entity-Level Image Generation (EliGen) for the Qwen Image model, allowing region-specific prompts through spatial masks. The feature provides fine-grained control over image generation by applying separate attention masks for each entity.
Key Features
• Spatial attention masking with isolated entity prompts
• Automatic mask resizing to match latent dimensions
• RoPE embedding implementation aligned with DiffSynth Studio
• Support for batch_size > 1
• Backward compatible with standard Qwen Image workflows
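To make the first two features concrete, here is a rough NumPy sketch of how per-entity masks might be resized to a token grid and turned into an attention mask that keeps entity prompts isolated from each other. It is an illustration under stated assumptions, not the code in this PR: the function names, the token ordering (entity prompts first, then image tokens in row-major order), the nearest-neighbour resize, and the choice to let all image tokens attend to each other are all assumptions made for the example.

```python
import numpy as np


def resize_mask_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2D mask to (out_h, out_w)."""
    in_h, in_w = mask.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return mask[rows[:, None], cols[None, :]]


def build_entity_attention_mask(masks, prompt_lens, grid_h, grid_w):
    """Return a boolean (N, N) matrix; True means attention is allowed.

    Assumed token order: entity prompt 0, entity prompt 1, ...,
    followed by grid_h * grid_w image tokens in row-major order.
    """
    n_txt = sum(prompt_lens)
    total = n_txt + grid_h * grid_w
    allowed = np.zeros((total, total), dtype=bool)
    # Assumption: image tokens may all attend to each other.
    allowed[n_txt:, n_txt:] = True
    start = 0
    for mask, length in zip(masks, prompt_lens):
        region = resize_mask_nearest(mask > 0, grid_h, grid_w).reshape(-1)
        txt = slice(start, start + length)
        allowed[txt, txt] = True                # prompt sees only itself
        allowed[txt, n_txt:] = region[None, :]  # prompt -> its region
        allowed[n_txt:, txt] = region[:, None]  # its region -> prompt
        start += length
    return allowed


# Two hypothetical entities: left half and right half of an 8x8 canvas.
left = np.zeros((8, 8))
left[:, :4] = 1
right = np.zeros((8, 8))
right[:, 4:] = 1
attn = build_entity_attention_mask([left, right], [2, 3], 4, 4)
```

In this toy setup the tokens of the first prompt can attend to image tokens in the left half of the grid but not to the second prompt's tokens, which is the "isolated entity prompts" behaviour in miniature.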