`app/ai-gateway/streaming.md`

The following is an example `llm/v1/completions` route streaming request:
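A minimal sketch of such a request, assuming AI Proxy listens on `127.0.0.1:8000` with an OpenAI-compatible route at `/openai/completions` (the exact host and path depend on your route configuration; both are hypothetical here):

```python
# Hypothetical sketch: stream tokens from an llm/v1/completions route.
# Assumes AI Proxy listens on 127.0.0.1:8000 and the route path is /openai/completions.
import requests

response = requests.post(
    "http://127.0.0.1:8000/openai/completions",
    json={
        "prompt": "Tell me the history of Kong Inc.",
        "stream": True,
    },
    stream=True,  # read the body incrementally instead of buffering it
)

for line in response.iter_lines():
    if line:
        # Each non-empty line is a server-sent event,
        # e.g. 'data: {...}' or the terminator 'data: [DONE]'.
        print(line.decode("utf-8"))
```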
You should receive each batch of tokens as HTTP chunks, each containing one or more server-sent events.
### Token usage in streaming responses {% new_in 3.13 %}
You can receive token usage statistics in an SSE streaming response. Set the following parameter in the request JSON:
```json
{
  "stream_options": {
    "include_usage": true
  }
}
```
When you set this parameter, the `usage` object appears in the final SSE frame, before the `[DONE]` terminator. This object contains token count statistics for the request.
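For illustration, the tail of such a stream might look like the following (the token counts are hypothetical; note the final usage frame has an empty `choices` array):

```
data: {"object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":12,"completion_tokens":87,"total_tokens":99}}

data: [DONE]
```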
The following example shows how to request and process token usage statistics in a streaming response:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/openai",
    api_key="none"
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me the history of Kong Inc."}],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    # Content deltas arrive in every frame except the final usage frame.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    # The final frame carries the token counts instead of content.
    if chunk.usage:
        print(f"\nPrompt tokens: {chunk.usage.prompt_tokens}")
        print(f"Completion tokens: {chunk.usage.completion_tokens}")
        print(f"Total tokens: {chunk.usage.total_tokens}")
```
> This feature works with any provider and model when `llm_format` is set to `openai` mode.
>
> See the [OpenAI API Documentation](https://platform.openai.com/docs/api-reference/chat/create#chat_create-stream_options) for more information on stream options.
### Response streaming configuration parameters
In the AI Proxy and AI Proxy Advanced plugin configuration, you can set an optional field `config.response_streaming` to one of three values: