[Core] Add a level 3 sleep/wake_up that offloads tensors to disk #14678

manoelmarques · 2025-03-12T12:26:43Z

Add a level 3 sleep mode that offloads the model weights to disk and discards the KV cache.

The model weights are not backed up in CPU memory, and the contents of the KV cache are discarded.

Level 3 sleep keeps CPU memory usage to a minimum, and the weights are loaded efficiently from disk on wake-up.
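To make the intended behavior concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR and does not use vLLM's API: `ToyEngine`, `offload_dir`, the single `weights.pkl` file, and the use of `pickle` on plain Python lists are all assumptions made purely for illustration.

```python
import os
import pickle
import tempfile


class ToyEngine:
    """Hypothetical stand-in for an engine; not vLLM's API."""

    def __init__(self, weights, offload_dir):
        self.weights = weights  # tensor name -> list of floats
        self.kv_cache = {}      # request id -> cached values
        # Assumption: all tensors go into one file under offload_dir.
        self.offload_path = os.path.join(offload_dir, "weights.pkl")

    def sleep(self):
        # Write the weights to disk, then drop the in-memory reference
        # so that no copy is kept in CPU memory.
        with open(self.offload_path, "wb") as f:
            pickle.dump(self.weights, f)
        self.weights = None
        # The KV cache is discarded, not saved.
        self.kv_cache.clear()

    def wake_up(self):
        # Reload the weights from disk and delete the offload file.
        with open(self.offload_path, "rb") as f:
            self.weights = pickle.load(f)
        os.remove(self.offload_path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        engine = ToyEngine({"layer0.weight": [0.1, 0.2, 0.3]}, tmp)
        engine.kv_cache["req-1"] = [1.0, 2.0]
        engine.sleep()
        asleep = (engine.weights, dict(engine.kv_cache))
        engine.wake_up()
        return asleep, engine.weights, dict(engine.kv_cache)
```

In this sketch, `demo()` returns the state while asleep (no weights in memory, empty cache) followed by the state after waking (weights restored, cache still empty). Where the file lives and whether tensors share one file are design choices the sketch simply fixes by assumption.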

github-actions · 2025-03-12T12:26:54Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

youkaichao

I'm hesitant to do this. There are too many things to consider when it comes to disk, such as which disk location to use and whether to save all tensors in one file ...

I would recommend that you rewrite the sleep (level=1) logic for your use case and maintain it yourself. I don't think this complexity is a good fit for upstream.

mergify · 2025-04-01T06:03:42Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @manoelmarques.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-08-26T12:10:38Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @manoelmarques.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-09-12T14:56:45Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @manoelmarques.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Co-authored-by: aavarghese <[email protected]> Co-authored-by: manoelmarques <[email protected]> Signed-off-by: Manoel Marques <[email protected]>

manoelmarques requested review from DarkLight1337, WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat, simon-mo, youkaichao, ywang96 and zhuohan123 as code owners March 12, 2025 12:26

mergify bot added frontend v1 labels Mar 12, 2025

manoelmarques force-pushed the sleepwake branch 2 times, most recently from a627b18 to 348c817 Compare March 12, 2025 12:44

manoelmarques marked this pull request as draft March 12, 2025 13:29

manoelmarques force-pushed the sleepwake branch 8 times, most recently from 8d9863d to 0b7192a Compare March 18, 2025 19:12

youkaichao reviewed Mar 22, 2025

mergify bot added the needs-rebase label Apr 1, 2025

manoelmarques force-pushed the sleepwake branch from 0b7192a to 8c9e0e8 Compare April 9, 2025 13:24

mergify bot removed the needs-rebase label Apr 9, 2025

manoelmarques force-pushed the sleepwake branch from 8c9e0e8 to 46b76bf Compare April 10, 2025 13:03

mergify bot added the needs-rebase label Aug 26, 2025

manoelmarques force-pushed the sleepwake branch from 351534b to f297263 Compare August 28, 2025 13:43

mergify bot removed the needs-rebase label Aug 28, 2025

manoelmarques force-pushed the sleepwake branch 4 times, most recently from 503389d to bdca49c Compare August 29, 2025 13:35

manoelmarques force-pushed the sleepwake branch 3 times, most recently from d4d0b1b to 3e6d848 Compare September 9, 2025 16:58

mergify bot added the needs-rebase label Sep 12, 2025

manoelmarques force-pushed the sleepwake branch from 3e6d848 to bf5cc17 Compare September 15, 2025 13:01

mergify bot removed the needs-rebase label Sep 15, 2025

manoelmarques force-pushed the sleepwake branch 4 times, most recently from 851c74d to 96889b7 Compare September 23, 2025 13:03

manoelmarques force-pushed the sleepwake branch 2 times, most recently from ac52f98 to 384a568 Compare October 1, 2025 13:09

manoelmarques force-pushed the sleepwake branch 2 times, most recently from 7738af8 to a82562e Compare October 9, 2025 13:13

manoelmarques force-pushed the sleepwake branch 3 times, most recently from 2bdabae to 3eed577 Compare October 13, 2025 15:00

manoelmarques force-pushed the sleepwake branch from 3eed577 to 7a92636 Compare November 5, 2025 14:19

manoelmarques force-pushed the sleepwake branch from 7a92636 to 4f74d64 Compare November 18, 2025 13:49

Add a level 3 sleep/wake_up that offloads tensors to disk

b9e95d1

Co-authored-by: aavarghese <[email protected]> Co-authored-by: manoelmarques <[email protected]> Signed-off-by: Manoel Marques <[email protected]>

manoelmarques force-pushed the sleepwake branch from 4f74d64 to b9e95d1 Compare December 3, 2025 14:07
