Simon Willison

New TIL: How streaming LLM APIs work

I put together some notes after poking around with the OpenAI, Anthropic and Google Gemini streaming APIs

til.simonwillison.net/llms/str

6 comments
Kev_Prime

@simon nice breakdown of the requests and expected responses.

velaia

@simon Great post, Simon.

Do you have any idea why all 3 providers use POST rather than GET, which would work with the EventSource API?

Simon Willison

@velaia my guess is that OpenAI did that first because they were worried prompts would be too long to send over GET, then everyone else followed their lead
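
For context on the question above: the browser EventSource API only issues GET requests, so a client that POSTs its prompt has to read the response body itself and parse the `text/event-stream` framing by hand. A minimal sketch of that parsing in Python, assuming the lines have already been decoded to text. The `iter_sse_data` name and the `{"delta": ...}` payload shape are hypothetical, chosen for illustration; only the trailing `[DONE]` sentinel mirrors what OpenAI sends.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an event-stream body."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event not terminated by a blank line is dropped at end of stream.


# Hypothetical stream, standing in for the lines of a streamed POST response.
sample = [
    'data: {"delta": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"delta": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

payloads = list(iter_sse_data(sample))
text = "".join(json.loads(p)["delta"] for p in payloads if p != "[DONE]")
```

With a real API the `sample` list would be replaced by whatever line iterator the HTTP client exposes for the streamed response.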

Leon Brocard

@simon @velaia Nice. I wonder whether, if the QUERY method is accepted in the future, we could then run server-sent events with large request payloads. ietf.org/archive/id/draft-ietf

Stefan Eissing

@simon Nice.

Little note: on a recent curl, you can POST JSON with `curl --json <string>`, which saves setting the headers yourself.
Also, `--no-buffer` should no longer be necessary.

Update: `--no-buffer` always `fflush()`es the output in curl. So it might still be beneficial.
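
As a rough illustration of what `--json` saves: it sends the string as the POST body and sets both the `Content-Type` and `Accept` headers to `application/json`. A minimal sketch of the same request built explicitly with Python's standard library, without sending it. The `build_json_post` helper, the URL, and the payload are hypothetical placeholders, not any provider's real endpoint.

```python
import json
import urllib.request


def build_json_post(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # These are the two headers that curl's --json flag sets for you.
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder URL and payload, for illustration only.
req = build_json_post("https://api.example.com/v1/stream", {"stream": True})
```

Passing `req` to `urllib.request.urlopen` would send it; the response object can then be iterated line by line to read a streamed body.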
