Conversation

@chaunceyjiang
Collaborator

@chaunceyjiang chaunceyjiang commented Dec 1, 2025

FIX #29747

Purpose

Fix the engine crash when --scheduling-policy=priority is combined with n > 1 (#29747).

To reproduce:

vllm serve /home/qwen3-8b --scheduling-policy=priority
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

res = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    n=2,
)

print(res)

Before this PR

(EngineCore_DP0 pid=28688)   File "/home/jovyan/vllm/vllm/v1/engine/core.py", line 295, in add_request
(EngineCore_DP0 pid=28688)     self.scheduler.add_request(request)
(EngineCore_DP0 pid=28688)   File "/home/jovyan/vllm/vllm/v1/core/sched/scheduler.py", line 1263, in add_request
(EngineCore_DP0 pid=28688)     self.waiting.add_request(request)
(EngineCore_DP0 pid=28688)   File "/home/jovyan/vllm/vllm/v1/core/sched/request_queue.py", line 152, in add_request
(EngineCore_DP0 pid=28688)     heapq.heappush(self._heap, (request.priority, request.arrival_time, request))
(EngineCore_DP0 pid=28688) TypeError: '<' not supported between instances of 'Request' and 'Request'
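
The root cause is standard heapq behavior: heap entries are ordered with <, and Python compares tuples element by element, so when two entries tie on both priority and arrival_time the comparison falls through to the Request objects themselves, which previously defined no ordering. A minimal sketch of the failure mode, using a stand-in Req class rather than vLLM's actual Request:

import heapq
import time

class Req:  # stand-in for vllm.v1.request.Request, which previously defined no __lt__
    def __init__(self, request_id, priority, arrival_time):
        self.request_id = request_id
        self.priority = priority
        self.arrival_time = arrival_time

now = time.time()
heap = []
# Two child requests created from one n=2 request share priority and arrival time.
for rid in ("0_AAaabb", "1_AAaabb"):
    heapq.heappush(heap, (0, now, Req(rid, 0, now)))
# The second push raises:
# TypeError: '<' not supported between instances of 'Req' and 'Req'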

With this PR

(APIServer pid=31017) INFO:     127.0.0.1:39200 - "GET /v1/models HTTP/1.1" 200 OK
(APIServer pid=31017) INFO 12-01 03:22:20 [chat_utils.py:574] Detected the chat template content format to be 'string'. You can set `--chat-template-content-format` to override this.
(APIServer pid=31017) INFO 12-01 03:22:27 [loggers.py:236] Engine 000: Avg prompt throughput: 2.9 tokens/s, Avg generation throughput: 117.8 tokens/s, Running: 2 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.3%, Prefix cache hit rate: 0.0%
(APIServer pid=31017) INFO:     127.0.0.1:39200 - "POST /v1/chat/completions HTTP/1.1" 200 OK

Test Plan

See the end-to-end reproduction above.

Test Result

See the before/after engine output above.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the v1 label Dec 1, 2025

@chaunceyjiang chaunceyjiang added the ready (ONLY add when PR is ready to merge/full CI is needed) label Dec 1, 2025
@chaunceyjiang
Collaborator Author

/cc @njhill @ApostaC PTAL.

Member

@njhill njhill left a comment

Thanks @chaunceyjiang! I'm wondering whether it would be better to just add id(request) into the tuple in the priority request queue? I'm not sure about baking specific ordering logic into the Request type itself (and it also kind of duplicates the ordering defined by the tuples).
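
A quick sketch of that idea (illustrative only, not the actual vLLM queue code): with id(request) added to the heap entry, ties on priority and arrival time are broken by an integer, and the request objects are never compared directly.

import heapq
import time

now = time.time()
heap = []
for _ in range(2):
    request = object()  # stand-in for a vLLM Request, which defines no ordering
    # id(request) is a unique integer, so a tie on (priority, arrival_time) is
    # resolved before Python ever tries to compare the request objects.
    heapq.heappush(heap, (0, now, id(request), request))
print(len(heap))  # 2 entries pushed, no TypeError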

@njhill njhill added the bug (Something isn't working) label Dec 1, 2025
@njhill njhill added this to the v0.12.0 milestone Dec 1, 2025
@chaunceyjiang
Collaborator Author

I'm wondering whether it would be better to just add id(request) into the tuple in the priority request queue?

@njhill Actually, this problem is solved by using the request_id for sorting: when n=2, the request_id format is 0_AAaabb, 1_AAaabb, so the IDs already differ. Using id(request) is just a fallback measure.
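
For illustration (the IDs below are the example IDs from the comment above, not real server output): the child request IDs differ in their index prefix, so they compare cleanly as strings and could serve as the tie-breaker in the heap entry.

# With n=2, one API request fans out into two child requests whose IDs
# take the form "0_AAaabb" and "1_AAaabb".
child_ids = [f"{i}_AAaabb" for i in range(2)]
# Strings compare without error, so putting request_id into the heap entry
# would also keep heapq from ever comparing Request objects directly.
assert sorted(child_ids) == ["0_AAaabb", "1_AAaabb"]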

@njhill
Member

njhill commented Dec 2, 2025

@chaunceyjiang I mean including one or the other of these in the tuple in the priority queue heap rather than implementing __lt__ in the request.

@chaunceyjiang
Collaborator Author

I mean including one or the other of these in the tuple in the priority queue heap rather than implementing __lt__ in the request.

@njhill Done. PTAL.

@chaunceyjiang chaunceyjiang requested a review from njhill December 2, 2025 08:40
@njhill
Member

njhill commented Dec 2, 2025

@chaunceyjiang apologies, after seeing the changes and thinking some more, maybe your original change to make the requests comparable would be better.

Since priority and arrival time are properties of the request itself, I think this does make sense as a canonical "default" ordering after all. We can then also simplify the priority heap to contain just the requests rather than wrapping them in tuples.

If it's too late in the day for you to update this, I can make the change quickly so that we can still include this in the release. If you don't respond here soon I'll do that!
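
A sketch of that direction (illustrative; vLLM's actual Request class and scheduler code differ): give the request a default ordering derived from its own priority and arrival time, mirroring the tuple ordering in the traceback above, so the heap can hold the requests directly.

import heapq
import time

class Req:
    """Stand-in for vLLM's Request; the real class has many more fields."""

    def __init__(self, request_id: str, priority: int, arrival_time: float):
        self.request_id = request_id
        self.priority = priority
        self.arrival_time = arrival_time

    def __lt__(self, other: "Req") -> bool:
        # Canonical default ordering: lower priority value first, then earlier arrival.
        return (self.priority, self.arrival_time) < (other.priority, other.arrival_time)

now = time.time()
heap: list[Req] = []
for rid in ("0_AAaabb", "1_AAaabb"):
    heapq.heappush(heap, Req(rid, 0, now))  # no wrapping tuple needed
# Equal-priority requests no longer raise; order among exact ties is arbitrary.
print(heapq.heappop(heap).request_id)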

Member

@njhill njhill left a comment

Thanks again @chaunceyjiang

@njhill njhill enabled auto-merge (squash) December 2, 2025 20:02
@njhill njhill merged commit 0a9caca into vllm-project:main Dec 2, 2025
46 checks passed
khluu pushed a commit that referenced this pull request Dec 2, 2025
Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
(cherry picked from commit 0a9caca)
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
…project#29764)

Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Signed-off-by: Xingyu Liu <[email protected]>

Labels

bug: Something isn't working
ready: ONLY add when PR is ready to merge/full CI is needed
v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: --scheduling-policy=priority & n>1 crashes engine

2 participants