Commit 4086acf

Fix on-load VRAM OOM (#11144)
Slow down the CPU on model load so it does not run ahead of the GPU. This fixes a VRAM OOM on Flux 2 load. I went to debug this with the memory trace pickles, which require --disable-cuda-malloc, and that made the bug go away. So I tried this synchronize instead and it worked. This has some very complex interactions with the cuda malloc async allocator and I don't have a solid theory on this one yet. Still debugging, but this gets us past the OOM for the moment.
1 parent 50ca97e commit 4086acf

1 file changed: 2 additions, 0 deletions
comfy/model_patcher.py

Lines changed: 2 additions & 0 deletions

@@ -762,6 +762,8 @@ def load(self, device_to=None, lowvram_model_memory=0, force_patch_weights=False
                 key = "{}.{}".format(n, param)
                 self.unpin_weight(key)
                 self.patch_weight_to_device(key, device_to=device_to)
+            if comfy.model_management.is_device_cuda(device_to):
+                torch.cuda.synchronize()

             logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
             m.comfy_patched_weights = True
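
For context, here is a minimal sketch of the pattern the fix relies on. This is not ComfyUI code: the loader function and module list below are hypothetical, and the exact interaction with the cudaMallocAsync backend is still being debugged per the commit message. The idea is that non-blocking host-to-device copies return before they complete, so the CPU can queue work for many modules ahead of the GPU; a per-module torch.cuda.synchronize() keeps the host from running ahead.

import torch

def load_weights_to_gpu(modules, device="cuda"):
    # Hypothetical loader illustrating the per-module synchronize pattern.
    for module in modules:
        for name, param in module.named_parameters():
            # Async host-to-device copy; returns before the transfer finishes.
            param.data = param.data.to(device, non_blocking=True)
        if torch.device(device).type == "cuda":
            # Wait for the queued copies before starting the next module,
            # so pending transfers and allocations do not pile up on the device.
            torch.cuda.synchronize()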
