
Commit cae9478

Merge pull request #34 from EtienneDosSantos/dev
Dev
2 parents: 42efb0f + be4c541

File tree

7 files changed: +40 -23 lines changed

Binary image assets changed (28.8 KB, 31.2 KB, 420 KB); previews not rendered here.

docs/CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -1,3 +1,14 @@
+### 20.03.2024
+
+* Set guidance_scale (decoder) to 1.9 and num_inference_steps to 54 for optimal image quality.
+* **Key finding:** Using torch.bfloat16 for the decoder significantly increased model loading speed (3.24x faster) compared to torch.float16. Other performance metrics remained virtually unchanged, and, surprisingly, there was no perceptible difference in image quality (see [Figure 1](https://github.com/EtienneDosSantos/stable-cascade-one-click-installer/blob/dev/assets/dtype_comparison_two_images.jpg)).
+* **Charts:** I've created two charts visualizing these results (see [Figure 2](https://github.com/EtienneDosSantos/stable-cascade-one-click-installer/blob/dev/assets/charts/chart_dtype_inference_and_loading_speeds_compared.png), [Figure 3](https://github.com/EtienneDosSantos/stable-cascade-one-click-installer/blob/dev/assets/charts/chart_dtype_VRAM_footprint_compared.png)).
+
+### 19.03.2024
+
+* **[PR #7381:](https://github.com/huggingface/diffusers/pull/7381)**
+  * Fixed the bug so we can generate multiple images simultaneously – thx [@DN6](https://github.com/DN6)! 🎉
+
 ### 17.03.2024
 
 * **[PR #31:](https://github.com/EtienneDosSantos/stable-cascade-one-click-installer/commit/e84010c83daa126b10cecae584cb8a4979689528)**
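
The loading-speed finding above can be sanity-checked with a small timing script. The sketch below is a hypothetical benchmark, not the script used to produce the 3.24x figure; it only reuses the model ID and from_pretrained arguments that appear in run.py, and results will vary with hardware and disk cache.

```python
# Hypothetical benchmark sketch: compare decoder loading time for bfloat16 vs. float16.
# Not the original measurement script.
import time

import torch
from diffusers import StableCascadeDecoderPipeline


def time_decoder_load(dtype: torch.dtype) -> float:
    """Load the Stable Cascade decoder with the given dtype and return wall-clock seconds."""
    start = time.perf_counter()
    StableCascadeDecoderPipeline.from_pretrained(
        "stabilityai/stable-cascade",
        variant="bf16",
        torch_dtype=dtype,
        use_safetensors=True,
    )
    return time.perf_counter() - start


for dtype in (torch.bfloat16, torch.float16):
    print(f"{dtype}: {time_decoder_load(dtype):.2f} s")
```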

docs/ROADMAP.md

Lines changed: 19 additions & 10 deletions
@@ -1,25 +1,34 @@
 ### Features to Add:
 
-**1. Image Metadata Storage**
+**3. Test Decoder Dtype Influence** ✔️
+
+* **`torch.bfloat16` vs. `torch.float16`:**
+  - [x] VRAM footprint
+  - [x] Inference speed
+  - [x] Image quality
+
+**2. Batch Size Fix (>1)** ✔️
+
+* **Goal:** Restore the ability to generate multiple images per prompt.
+  - [x] Not getting anywhere, opened [issue #7377](https://github.com/huggingface/diffusers/issues/7377) to hopefully get this resolved.
+  - [x] **Issue Review:** Test provided solution to issue ([PR #7381](https://github.com/huggingface/diffusers/pull/7381))! Amazing work, thx [@DN6](https://github.com/DN6)! 🎉
+* **Troubleshooting Steps:**
+  - [ ] **Error Analysis:** Identify the specific error or unexpected behavior.
+  - [ ] **Code Review:** Examine logic related to batch size handling.
+  - [ ] **Dependency Check:** Ensure compatibility between any updated libraries and the batching functionality.
+
+**1. Image Metadata Storage** ✔️
 
 * **Goal:** Embed essential generation parameters within generated images for reproducibility and analysis.
 * **Metadata to Include:**
   - [x] Seed
   - [x] Number of steps
   - [x] Model name
   - [x] CFG value
-  - [ ] Sampler
+  - [x] Sampler
   - [x] Prompt
 
 * **Implementation Steps:**
   - **Library Selection:** Research image metadata libraries (e.g., ExifWrite, PIL/Pillow).
   - **Integration:** Modify image generation code to write metadata.
   - **Testing:** Verify metadata is written and readable.
-
-**2. Batch Size Fix (>1)**
-
-* **Goal:** Restore the ability to generate multiple images per prompt.
-* **Troubleshooting Steps:**
-  - [ ] **Error Analysis:** Identify the specific error or unexpected behavior.
-  - [ ] **Code Review:** Examine logic related to batch size handling.
-  - [ ] **Dependency Check:** Ensure compatibility between any updated libraries and the batching functionality.
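
For the completed Image Metadata Storage item, the snippet below is a minimal sketch of the Pillow-based approach named in the roadmap's implementation steps (PNG text chunks via PngInfo). The embed_metadata helper, the JSON layout, and the output path are illustrative assumptions, not the project's actual code, though the "parameters" key mirrors run.py's metadata_embedded dict.

```python
# Illustrative sketch: write generation parameters into a PNG as a text chunk with Pillow.
# embed_metadata() is hypothetical; run.py's real implementation may differ.
import json

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def embed_metadata(image: Image.Image, params: dict, path: str) -> None:
    info = PngInfo()
    # One JSON blob keeps seed, steps, model name, CFG, sampler, and prompt machine-readable.
    info.add_text("parameters", json.dumps(params))
    image.save(path, pnginfo=info)


# Round-trip check: Image.open(path).text["parameters"] should return the JSON string.
```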

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 
 --find-links https://download.pytorch.org/whl/torch_stable.html
 accelerate>=0.25.0
-diffusers==0.27.0
+diffusers==0.27.2
 einops>=0.7.0
 gradio
 kornia>=0.7.0

run.py

Lines changed: 9 additions & 12 deletions
@@ -6,7 +6,7 @@
 # Stability AI Non-Commercial Research Community License Agreement, dated November 28, 2023.
 # For more information, see https://stability.ai/use-policy.
 
-from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline
+from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline, StableCascadeUNet
 import gradio as gr
 import json
 import os
@@ -24,9 +24,9 @@
 def load_model(model_name):
     # Load model from disk every time it's needed
     if model_name == "prior":
-        model = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=dtype).to(device)
+        model = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=dtype, use_safetensors=True).to(device)
     elif model_name == "decoder":
-        model = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16).to(device)
+        model = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", variant="bf16", torch_dtype=dtype, use_safetensors=True).to(device)
     else:
         raise ValueError(f"Unknown model name: {model_name}")
     return model
@@ -79,26 +79,23 @@ def generate_images(prompt, height, width, negative_prompt, guidance_scale, num_
         num_images_per_prompt=int(num_images_per_prompt),
         generator=generator,
     )
-    del prior  # Explicitly delete the model to help with memory management
-    torch.cuda.empty_cache()  # Clear the CUDA cache to free up unused memory
 
     # Load, use, and discard the decoder model
     decoder = load_model("decoder")
     decoder.enable_model_cpu_offload()
     decoder_output = decoder(
-        image_embeddings=prior_output.image_embeddings.to(torch.float16),
+        image_embeddings=prior_output.image_embeddings.to(dtype),
         prompt=cleaned_prompt,
        negative_prompt=negative_prompt,
-        guidance_scale=0.0,
+        guidance_scale=1.9,  # Guidance scale is enabled by setting guidance_scale > 1
        num_inference_steps=calculated_steps_decoder,
        output_type="pil",
        generator=generator,
    ).images
-    del decoder  # Explicitly delete the model to help with memory management
-    torch.cuda.empty_cache()  # Clear the CUDA cache to free up unused memory
-
+
    metadata_embedded = {
        "parameters": "Stable Cascade",
+        "scheduler": "DDPMWuerstchenScheduler",
        "prompt": cleaned_prompt,
        "negative_prompt": negative_prompt,
        "width": int(width),
@@ -190,8 +187,8 @@ def configure_ui():
             height = gr.Slider(minimum=512, maximum=2048, step=1, value=1024, label="Image Height")
         with gr.Column():
             # components in central column
-            num_inference_steps = gr.Slider(minimum=1, maximum=150, step=1, value=30, label="Steps")
-            num_images_per_prompt = gr.Number(label="Number of Images per Prompt (Currently, the system can only generate one image at a time. Please leave the 'Images per Prompt' setting at 1 until this issue is fixed.)", value=1)
+            num_inference_steps = gr.Slider(minimum=1, maximum=150, step=1, value=54, label="Steps")
+            num_images_per_prompt = gr.Number(label="Number of Images per Prompt", value=2)
         with gr.Column():
             # components in right column
             guidance_scale = gr.Slider(minimum=1, maximum=20, step=0.5, value=4.0, label="Guidance Scale")
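
Taken together, these run.py changes amount to running both stages in torch.bfloat16 from safetensors weights, with decoder guidance enabled at 1.9. The standalone sketch below shows that configuration outside the Gradio app; the prompt, seed, and step counts are placeholders, and run.py itself derives the decoder step count (calculated_steps_decoder) from the UI slider rather than hard-coding it.

```python
# Minimal sketch of the two-stage Stable Cascade call with this commit's settings
# (bfloat16, safetensors, decoder guidance_scale=1.9). Prompt, seed, and steps are placeholders.
import torch
from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

device = "cuda"
dtype = torch.bfloat16
prompt = "a photograph of a red fox in the snow"

prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=dtype, use_safetensors=True
).to(device)
prior_output = prior(
    prompt=prompt,
    guidance_scale=4.0,
    num_inference_steps=20,
    num_images_per_prompt=2,  # batching works again after PR #7381 / diffusers 0.27.2
    generator=torch.Generator(device=device).manual_seed(0),
)

decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=dtype, use_safetensors=True
).to(device)
images = decoder(
    image_embeddings=prior_output.image_embeddings.to(dtype),
    prompt=prompt,
    guidance_scale=1.9,  # values > 1 enable guidance for the decoder, per the changelog
    num_inference_steps=10,
    output_type="pil",
).images

for i, image in enumerate(images):
    image.save(f"output_{i}.png")
```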
