`app/ai-gateway/streaming.md`

The following is an example `llm/v1/completions` route streaming request:
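A minimal sketch of such a request, assuming AI Proxy listens on `127.0.0.1:8000` with an OpenAI-compatible route at `/openai/completions` (the exact host and path depend on your route configuration; both are hypothetical here):

```python
# Hypothetical sketch: stream tokens from an llm/v1/completions route.
# Assumes AI Proxy listens on 127.0.0.1:8000 and the route path is /openai/completions.
import requests

response = requests.post(
    "http://127.0.0.1:8000/openai/completions",
    json={
        "prompt": "Tell me the history of Kong Inc.",
        "stream": True,
    },
    stream=True,  # read the body incrementally instead of buffering it
)

for line in response.iter_lines():
    if line:
        # Each non-empty line is a server-sent event,
        # e.g. 'data: {...}' or the terminator 'data: [DONE]'.
        print(line.decode("utf-8"))
```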
You should receive each batch of tokens as HTTP chunks, each containing one or more server-sent events.
### Token usage in streaming responses {% new_in 3.13 %}
You can receive token usage statistics in an SSE streaming response. Set the following parameter in the request JSON:
```json
{
  "stream_options": {
    "include_usage": true
  }
}
```
When you set this parameter, the `usage` object appears in the final SSE frame, before the `[DONE]` terminator. This object contains token count statistics for the request.
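For illustration, the tail of such a stream might look like the following (the token counts are hypothetical; note the final usage frame has an empty `choices` array):

```
data: {"object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":12,"completion_tokens":87,"total_tokens":99}}

data: [DONE]
```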
The following example shows how to request and process token usage statistics in a streaming response:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/openai",
    api_key="none"
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me the history of Kong Inc."}],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    # Content deltas arrive in every frame except the final usage frame.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    # The final frame carries the token counts instead of content.
    if chunk.usage:
        print(f"\nPrompt tokens: {chunk.usage.prompt_tokens}")
        print(f"Completion tokens: {chunk.usage.completion_tokens}")
        print(f"Total tokens: {chunk.usage.total_tokens}")
```
> This feature works with any provider and model when `llm_format` is set to `openai` mode.
>
> See the [OpenAI API Documentation](https://platform.openai.com/docs/api-reference/chat/create#chat_create-stream_options) for more information on stream options.
### Response streaming configuration parameters
In the AI Proxy and AI Proxy Advanced plugin configuration, you can set an optional field `config.response_streaming` to one of three values: