Streaming API
Streaming API is a method of transmitting data from a server to a client in real time, as the data becomes available, rather than waiting for the entire response to be prepared before sending it all at once. This approach contrasts with traditional request-response APIs, where the client sends a request and waits for the full response to be delivered in a single package.
With streaming, data is sent in small chunks (sometimes called data frames) over a persistent connection. This allows the client to start processing information immediately, improving perceived speed and responsiveness.
When you select Stream from the list of data types in the API Connector in Bubble, you're telling Bubble how to handle the response it gets back from the API.
From a technical perspective, you're not choosing a different protocol (it's still typically HTTPS); you're defining how Bubble should process the data as it arrives.
Most APIs return a complete response all at once—typically in formats like JSON or XML. There's a request, and then a single response that ends the connection.
A streaming API works differently. It sends back a single response that remains open, allowing the server to deliver data gradually in chunks as it becomes available. Once all data has been sent, the server closes the connection to signal completion.
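To make this concrete, here is a simplified sketch of what a streamed response can look like on the wire, using the server-sent events format that many LLM APIs (including OpenAI's) use; the exact headers and payloads vary by provider:

```
HTTP/1.1 200 OK
Content-Type: text/event-stream
Transfer-Encoding: chunked

data: {"choices":[{"delta":{"content":"Hel"}}]}

data: {"choices":[{"delta":{"content":"lo!"}}]}

data: [DONE]
```

Each data: line is delivered as soon as the server produces it, and a final end-of-stream marker (OpenAI uses data: [DONE]) signals that the response is complete.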
Streaming APIs are used in scenarios where it's useful or necessary to keep a connection open and receive data gradually over time, rather than all at once. For Bubble developers, one of the most common use cases involves integrating with large language models (LLMs) like OpenAI's ChatGPT.
With a traditional API response—such as one using JSON—the entire response is generated on the server and then sent to the client in a single package. This means the user doesn't see anything until the full message is ready, which can result in noticeable delays for complex or long responses.
Streaming changes that behavior. When streaming is enabled, the server starts sending parts of the response as soon as they’re ready. For example, with tools like ChatGPT or Claude, you might see the reply appear word by word or sentence by sentence, allowing users to follow along in real time as the model generates its output.
This approach improves perceived responsiveness and creates a more interactive user experience. While LLMs are a common use case today, streaming APIs are also widely used in other areas—such as financial market tickers, real-time analytics dashboards, and messaging bots—where timely, incremental data updates are essential.
Before setting up a streaming API, make sure of the following:
Ensure that the API service you are connecting to (such as ChatGPT) supports a streaming API response
Install the API Connector plugin
Set up the relevant API authentication for the service you want to connect to
Add your first API call
To instruct Bubble that the streaming API response will be streamed, set the data type to Stream:
It’s important to note that many API services require an explicit instruction to enable streaming. For example, OpenAI’s ChatGPT API will return a regular JSON response by default unless you include a specific parameter (such as "stream": true) in your request.
This means that for streaming to work correctly, both sides need to be configured appropriately: your Bubble app must be set up to handle a stream, and the external service must be told to send the streaming API response. If either side is not properly configured, the streaming API initialization process may fail.
In the example below, we're sending a few parameters to ChatGPT to instruct it to return a streaming API response:

model - Specifies which version of the language model you want to use (e.g. gpt-4, gpt-3.5-turbo).
messages - The list of messages to send to the model. Each message contains two fields:
role - The role of the message sender. Can be user, assistant, or system.
content - The text itself.
stream - Set to true to tell the API to return a streaming API response.
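Put together, the JSON body of such a call might look something like this (a sketch following OpenAI's chat completions format; field names vary by provider):

```json
{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short poem about rivers."}
  ],
  "stream": true
}
```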
When you initialize an API call that returns a streaming API response, you'll receive a series of events known as chunks. Each chunk contains one or more fields, with each field representing a key-value pair of data returned from the stream.
To use this data effectively, you need to define how your app should handle each of these fields.
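For illustration, a single chunk from OpenAI's chat completions API looks roughly like this; the incremental text is the content value nested inside delta:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": {"content": " rivers"},
      "finish_reason": null
    }
  ]
}
```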
When initializing a streaming API call, the full response must be received before the initialization process can complete. As a result, initialization may take longer than with a standard (non-streaming) API call.
After initializing the call, the first step is to create a unique response field for every field you want to reference in your app. This allows you to access and work with the incoming data as it arrives in real time. Let’s continue the scenario of working with ChatGPT, and assume that you want to reference the following data:
Text stream - the incremental content generated by the model. When using streaming, this field updates continuously as each new token or word is returned by the model in real time.
Input tokens - the number of tokens used in the request payload.
Output tokens - the number of tokens generated in the response. This includes all tokens streamed or returned in the final output and helps you understand usage and billing impact.
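As a point of reference, assuming OpenAI's chat completions API: the text stream corresponds to each chunk's delta content, while the token counts typically arrive in a usage object near the end of the stream (recent API versions only include it if you request it, e.g. with "stream_options": {"include_usage": true}). A sketch of such a final chunk:

```json
{
  "object": "chat.completion.chunk",
  "choices": [],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 118,
    "total_tokens": 143
  }
}
```

In this scenario, Input tokens would map to prompt_tokens and Output tokens to completion_tokens.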
To create a new response field, follow these steps:
Initialize the call and wait for the Returned values popup to appear
Scroll down to Response fields
Click Add new field +
Give the response field a name, and select a data type
Response fields support the following types:

Text stream - incremental text content (e.g. the content being streamed). Streamed: yes.
Text - a regular text value. Streamed: no.
Number - a regular numerical value. Streamed: no.
Yes/no - a true/false value. Streamed: no.
Just like regular JSON API calls, streaming APIs can be used both as a data source and within workflows. However, due to the continuous nature of streaming APIs, there is an important difference in how data is referenced within workflows, especially when using the Result of step X data source.
When initiating a streaming API request within a workflow, Bubble behaves slightly differently on the client side versus the server side:
Client-side behavior: The workflow action will appear as "finished" as soon as it begins receiving streamed data from the external API. This allows the workflow to move forward and execute subsequent actions immediately, provided these actions do not rely on the final, non-streamed values from the API request.
Server-side behavior: On the server, the workflow action will remain active (blocking) until the streaming API has fully completed and the connection is closed.
For example, if you have a workflow set up like this:
Step 1: Send request to ChatGPT (streaming)
Step 2: Save the chat message (final text stream) in the database (result of step 1)
The following implications apply:
Client-side actions that don't depend on the final API results can proceed without waiting for the stream to fully complete.
If an action references the final non-streamed result of the streamed request, the workflow will pause until the streaming has fully completed.
Server-side actions will always wait until the streamed API has fully completed, potentially delaying subsequent server-side operations.
Common use case:
A frequent scenario is initiating a streamed API call (like ChatGPT), then immediately displaying the incoming streamed data to the user through Bubble's Display data action. This gives users a seamless and responsive experience as the streamed content arrives gradually.
Each response field you configure in the API initialization popup automatically becomes a data source in your app. If you use the field type Text stream, Bubble also creates a few additional underlying data sources to support the streaming functionality:

text so far (Text) - the text that has been generated so far. This value updates in real time as new data is received from the stream.
full text (Text) - the full text of the response. This value is only available after the streaming is done.
is done (Yes/no) - returns a yes if the stream is done.
is waiting (Yes/no) - returns a yes if Bubble is awaiting a response.
is streaming (Yes/no) - returns a yes if the stream is ongoing.
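For example, continuing the earlier workflow (and assuming a Text stream response field named message, a hypothetical name used for this sketch):

Step 1: Send request to ChatGPT (streaming)
Step 2: Display data in a group, with the data source set to Result of step 1's message's text so far

While the stream is ongoing, is streaming returns yes and text so far grows as each chunk arrives; once is done returns yes, full text becomes available and can safely be saved to the database.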