Streaming

Server-Sent Events in the OpenAI chunk format, no custom parser required.

CleverRouter streams chat completions as Server-Sent Events in the OpenAI chunk format. Any SDK that understands the OpenAI API can consume the stream.

Requesting a stream

stream.ts
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.CLEVERROUTER_API_KEY!,
  baseURL: 'https://cleverouter.eu/v1',
});

const stream = await client.chat.completions.create({
  model: 'mistral/mistral-small-3.2',
  messages: [{ role: 'user', content: 'Write a haiku about Berlin.' }],
  stream: true,
  stream_options: { include_usage: true },
});

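// In the OpenAI format, include_usage adds a final chunk that carries
// the usage totals and has an empty choices array.
let usage;
for await (const chunk of stream) {
  if (chunk.usage) usage = chunk.usage;
  // Optional chaining guards against chunks without choices.
  process.stdout.write(chunk.choices[0]?.delta.content ?? '');
}
console.log('\nusage', usage);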

Raw wire format

If you are not using the SDK (Edge, Bun, Hono):

data: {"id":"chatcmpl-x","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hallo"},"index":0}]}

data: {"id":"chatcmpl-x","object":"chat.completion.chunk","choices":[{"delta":{"content":", "},"index":0}]}

data: [DONE]

Response headers:

content-type: text/event-stream; charset=utf-8
cache-control: no-cache, no-transform
x-accel-buffering: no
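
Reading this format by hand comes down to splitting the stream on newlines and decoding the data: lines. The following is a minimal sketch, not CleverRouter's own client code. The names parseSseBuffer and ParseResult are hypothetical, and the sketch assumes each event carries a single data: line, as in the sample above.

```typescript
// Shape of the fields this sketch reads from each chunk (subset only).
interface Chunk {
  choices: { index: number; delta: { content?: string } }[];
}

// Hypothetical result type for this sketch.
interface ParseResult {
  deltas: string[]; // content fragments found in complete lines
  done: boolean;    // true once "data: [DONE]" was seen
  rest: string;     // incomplete trailing line, to prepend to the next read
}

function parseSseBuffer(buffer: string): ParseResult {
  const lines = buffer.split('\n');
  // The last element is either '' or a line that has not fully arrived yet.
  const rest = lines.pop() ?? '';
  const deltas: string[] = [];
  let done = false;

  for (const raw of lines) {
    const line = raw.trim(); // also strips a trailing \r
    if (!line.startsWith('data:')) continue; // blank separators, comments
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as Chunk;
    const content = chunk.choices[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return { deltas, done, rest };
}
```

In a fetch-based runtime you would decode each read from response.body with a TextDecoder, prepend the previous rest to the new text, and stop reading once done is true.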

Reverse proxies

CleverRouter sets X-Accel-Buffering: no so that nginx/Cloudflare do not buffer chunks. If you run your own proxies in front of it, please pass the same header through.
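
If your own proxy is a small fetch-style handler (for example on Bun or Hono), passing the header through can be as simple as copying the streaming-related headers from the upstream response. This is a sketch under that assumption; pickStreamingHeaders and STREAMING_HEADERS are hypothetical names, and the header list is taken from the response headers shown above.

```typescript
// Headers that matter for unbuffered SSE delivery (from the list above).
const STREAMING_HEADERS = ['content-type', 'cache-control', 'x-accel-buffering'];

// Accepts any iterable of [name, value] pairs, such as an array of tuples
// or a Fetch API Headers object, and keeps only the streaming headers.
function pickStreamingHeaders(
  upstream: Iterable<readonly [string, string]>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of upstream) {
    const key = name.toLowerCase(); // header names are case-insensitive
    if (STREAMING_HEADERS.indexOf(key) !== -1) out[key] = value;
  }
  return out;
}
```

The result can be passed as the headers option when the handler returns the upstream body unchanged, for instance new Response(upstream.body, { headers: pickStreamingHeaders(upstream.headers) }).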

Tool calls are also streamed as deltas; see Tool Use.
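
As a rough illustration of what consuming those deltas involves, the sketch below merges tool-call fragments by their index. It assumes the deltas follow the OpenAI chunk shape (delta.tool_calls entries with index, an optional id, and a function object whose arguments string arrives in pieces); accumulateToolCalls and the two interfaces are hypothetical names, and the Tool Use page remains the reference for the actual fields.

```typescript
// One fragment from delta.tool_calls in an OpenAI-style chunk (assumed shape).
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

// A fully assembled tool call.
interface ToolCall {
  id: string;
  name: string;
  arguments: string; // raw JSON text, complete only after the stream ends
}

function accumulateToolCalls(deltas: ToolCallDelta[]): ToolCall[] {
  const byIndex = new Map<number, ToolCall>();
  for (const delta of deltas) {
    let call = byIndex.get(delta.index);
    if (!call) {
      call = { id: '', name: '', arguments: '' };
      byIndex.set(delta.index, call);
    }
    if (delta.id) call.id = delta.id;
    if (delta.function?.name) call.name = delta.function.name;
    // Argument JSON is split across chunks, so append rather than replace.
    if (delta.function?.arguments) call.arguments += delta.function.arguments;
  }
  // Return the calls ordered by their index.
  return Array.from(byIndex.keys())
    .sort((a, b) => a - b)
    .map((index) => byIndex.get(index)!);
}
```

Parse the arguments string with JSON.parse only after the stream has finished, since intermediate states are not valid JSON.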
