mp: use look-ahead actuals for stream offload VRAM calculation (fixes unwanted TE full offload) #11096

rattus128 · 2025-12-04T04:15:14Z

TIL that the WAN TE has a 2GB weight followed by 16MB as the next size down. This means that team 8GB VRAM would fully offload the TE in async offload mode, because the buffer calculation just multiplied this giant size by the number of streams.

Instead, do the more complex logic of summing up the sizes of the upcoming to-load weights, which avoids triple counting this massive weight.
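To illustrate the difference, here is a minimal sketch and not the code in this PR. `plan_load`, its parameters, the load order, and the toy sizes (one 2048 MB weight plus 250 weights of 16 MB, a 5000 MB budget, 3 streams) are all hypothetical. It compares reserving `max size * num streams` against summing the next `num_streams` weights that would actually be streamed.

```python
def plan_load(sizes_mb, budget_mb, num_streams, lookahead):
    """Return (loaded_mb, buffer_mb) for weights visited in list order.

    Hypothetical helper for illustration only.
    """
    def reserve(start):
        # Buffer needed if every weight from `start` onward is streamed.
        if start >= len(sizes_mb):
            return 0.0
        if lookahead:
            # Look-ahead actuals: sum the next `num_streams` real sizes.
            return sum(sizes_mb[start:start + num_streams])
        # Old-style worst case: biggest weight times the stream count.
        return max(sizes_mb) * num_streams

    loaded = 0.0
    for i, size in enumerate(sizes_mb):
        # Loading weight i leaves weights i+1.. to be streamed.
        if loaded + size + reserve(i + 1) > budget_mb:
            return loaded, reserve(i)
        loaded += size
    return loaded, 0.0


sizes = [2048.0] + [16.0] * 250  # toy layout, not the real WAN TE
budget = 5000.0

print(plan_load(sizes, budget, 3, lookahead=False))  # (0.0, 6144.0)
print(plan_load(sizes, budget, 3, lookahead=True))   # (4944.0, 48.0)
```

With the worst-case reserve nothing fits, so everything is offloaded. With the look-ahead sum the big weight is loaded once and only the small trailing weights need stream buffers.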

Partial unload does the converse, recording the NS (number of streams) most recent unloads as it goes.
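A rough sketch of the unload-side bookkeeping, assuming the buffer is sized from the last NS unloaded weights. `partial_unload`, the unload order, and the stopping rule are hypothetical and only meant to show the "remember the NS most recent" idea with a bounded `deque`.

```python
from collections import deque


def partial_unload(loaded_sizes_mb, need_to_free_mb, num_streams):
    """Unload weights in list order until enough is freed.

    Returns (freed_mb, buffer_mb). Hypothetical helper for illustration.
    """
    # Sizes of the `num_streams` most recent unloads; older entries fall off.
    recent = deque(maxlen=num_streams)
    freed = 0.0
    for size in loaded_sizes_mb:
        if freed >= need_to_free_mb:
            break
        freed += size
        recent.append(size)
    # The buffer is sized from what was actually unloaded last,
    # not from the largest weight in the model.
    return freed, sum(recent)


# Toy order: small weights are unloaded first, the 2048 MB one is never reached.
print(partial_unload([16.0] * 5 + [2048.0], 64.0, 3))  # (64.0, 48.0)
```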

This fixes two side reports in #11081.
(This is not the OP's issue.)

Example test conditions:
RTX3060 with --reserve-vram 5.5 (emulates 8GB with some extra non-incidental VRAM usage, matching the user's number)

Before:

Requested to load WanTEModel
loaded partially; 5334.55 MB usable, 0.00 MB loaded, 6419.09 MB offloaded, 6009.00 MB buffer reserved, lowvram patches: 0

After:

Requested to load WanTEModel
loaded partially; 5334.55 MB usable, 5283.48 MB loaded, 1136.00 MB offloaded, 48.00 MB buffer reserved, lowvram patches: 0
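A quick arithmetic read of the two log lines, as a sanity check. The stream count of 3 is an assumption inferred from "triple counting"; it is not printed in the logs. Under that assumption the reserved buffer works out to one 2003 MB slot per stream before and one 16 MB slot per stream after.

```python
NUM_STREAMS = 3  # assumption, inferred from "triple counting"

usable_mb = 5334.55
buffer_before_mb, buffer_after_mb = 6009.00, 48.00
loaded_after_mb = 5283.48

# Per-stream share of the reserved buffer.
per_stream_before = buffer_before_mb / NUM_STREAMS  # 2003.0
per_stream_after = buffer_after_mb / NUM_STREAMS    # 16.0

# VRAM left for weights once the buffer is set aside.
headroom_before = usable_mb - buffer_before_mb  # negative: nothing can load
headroom_after = usable_mb - buffer_after_mb    # enough for the loaded weights

print(per_stream_before, per_stream_after)
print(headroom_before < 0, loaded_after_mb <= headroom_after)
```

Before the change the buffer alone exceeds the usable VRAM, which is consistent with 0.00 MB loaded. After the change the loaded weights fit in what remains once the 48.00 MB buffer is set aside.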


rattus128 requested review from Kosinkadink, comfyanonymous and guill as code owners December 4, 2025 04:15

comfyanonymous merged commit 6be85c7 into comfyanonymous:master Dec 4, 2025
10 checks passed
