Harmony library often cannot parse refusals from gpt-oss-120b model #80

@bbrowning

Description

After digging into a user report in vLLM, I discovered that when the gpt-oss-120b model refuses to do something, it often does not follow its Harmony template properly and emits the refusal directly in the response, without the expected channel or message tokens beforehand.

Here's a simple Python script reproducing what I'm seeing; the generated_tokens below are taken from the tokens vLLM generated in response to the user's message.

from openai_harmony import (
    HarmonyEncodingName,
    Role,
    StreamableParser,
    load_harmony_encoding
)
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b")

encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
parser = StreamableParser(encoding, role=Role.ASSISTANT)

missing_tokens = [200008]  # 200008 is the <|message|> special token

# Tokens generated by vLLM for the refusal (see the CompletionOutput below)
generated_tokens = [2161, 976, 1825, 382, 57246, 15543, 316, 2338, 261, 4232, 13790, 350, 72, 2560, 4213, 187928, 11, 23802, 8110, 741, 1328, 382, 829, 32510, 3100, 25, 15543, 97471, 198937, 13, 20953, 41897, 13, 16344, 316, 7562, 11, 581, 2804, 41897, 13, 200007, 200006, 173781, 200005, 35644, 200008, 2167, 2804, 41897, 13, 200007, 200006, 173781, 200005, 17196, 200008, 40, 4572, 23045, 11, 889, 357, 665, 1573, 1652, 483, 484, 13, 200002]

# Stream tokens into the parser one at a time so we can see exactly where parsing fails
parse_failed = False
for i, token in enumerate(generated_tokens):
    decoded = tokenizer.decode(generated_tokens[:i+1])
    try:
        parser.process(token)
    except Exception as e:
        print(f"Error processing token {token}: {e}\n")
        print(f"Content up to and including the failed token:\n{decoded}\n")
        parse_failed = True
        break

if parse_failed:
    print("\n====================")
    print(f"Initial parse failed. Prepending tokens {missing_tokens}\n")
    parser = StreamableParser(encoding, role=Role.ASSISTANT)  # fresh parser for the retry
    tokens = missing_tokens + generated_tokens
    for i, token in enumerate(tokens):
        decoded = tokenizer.decode(tokens[:i+1])
        parser.process(token)

print(f"Successfully parsed content:\n{decoded}\n")

print(f"Parsed messages:\n{parser.messages}")

Here's the actual text and token IDs generated by the model, as returned by vLLM:

CompletionOutput(index=0, text='...\n\nThe user is requesting instructions to break a food truck (i.e., sabotage, illegal activity). This is disallowed content: instructions facilitating wrongdoing. Must refuse. According to policy, we must refuse.assistantanalysisWe must refuse.assistantfinalI’m sorry, but I can’t help with that.', token_ids=[2161, 976, 1825, 382, 57246, 15543, 316, 2338, 261, 4232, 13790, 350, 72, 2560, 4213, 187928, 11, 23802, 8110, 741, 1328, 382, 829, 32510, 3100, 25, 15543, 97471, 198937, 13, 20953, 41897, 13, 16344, 316, 7562, 11, 581, 2804, 41897, 13, 200007, 200006, 173781, 200005, 35644, 200008, 2167, 2804, 41897, 13, 200007, 200006, 173781, 200005, 17196, 200008, 40, 4572, 23045, 11, 889, 357, 665, 1573, 1652, 483, 484, 13, 200002], cumulative_logprob=None, logprobs=None, finish_reason=stop, stop_reason=None)

You'll see that the model fails to generate a <|message|> token before starting into the refusal chain of thought. It also doesn't generate an initial <|channel|>, but in the Harmony library that's not a fatal error the way the missing <|message|> is.
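
For reference, a quick way to see where the structure breaks is to decode the special tokens in the dump above. This is a minimal sketch reusing the tokenizer from the repro script; the comment reflects my reading of the Harmony format docs for the o200k_harmony vocabulary:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b")

# Decode each special token that appears in the token_ids above; per the
# Harmony format these should come out as <|return|>, <|channel|>, <|start|>,
# <|end|>, and <|message|> respectively.
for tid in (200002, 200005, 200006, 200007, 200008):
    print(tid, tokenizer.decode([tid]))

In the token list above, the first special token to appear is 200007 (<|end|>); everything before it is plain content with no <|channel|> or <|message|> ahead of it, which is exactly what the parser trips over.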

I considered attempting to work around this in vLLM, but it feels more like a model output and/or Harmony library issue. The script above shows one way it can be worked around: explicitly prepending token 200008 (<|message|>) before parsing the content with the Harmony library.
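
For illustration, here's a minimal sketch of that workaround as a reusable helper. parse_with_fallback is a hypothetical name, not an existing vLLM or Harmony API, and the retry simply mirrors what the repro script does by hand:

from openai_harmony import (
    HarmonyEncodingName,
    Role,
    StreamableParser,
    load_harmony_encoding
)

MESSAGE_TOKEN = 200008  # <|message|>

def parse_with_fallback(encoding, generated_tokens):
    """Parse tokens as-is; if that fails, retry once with <|message|> prepended."""
    for prefix in ([], [MESSAGE_TOKEN]):
        parser = StreamableParser(encoding, role=Role.ASSISTANT)
        try:
            for token in prefix + generated_tokens:
                parser.process(token)
            return parser.messages
        except Exception:
            continue  # fall through to the retry with the prepended token
    raise ValueError("tokens could not be parsed, even with <|message|> prepended")

encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
messages = parse_with_fallback(encoding, generated_tokens)  # generated_tokens from the script above

A proper fix would live in the model's template or in the Harmony library itself, but a fallback like this would at least let callers recover the refusal text in the meantime.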
