As we saw in the previous article, workflows can be scheduled to run at a certain point in the future. A useful trick to know about is that scheduled workflows can be called recursively, or in other words, a scheduled workflow can schedule itself to run in the future.
Recursion is a technical term that roughly means a chunk of code invoking itself to run again. A real-world example is a rote way of looking up a word in a dictionary. The "rule" you follow is: "Open the dictionary at the halfway point. If your word comes earlier in the alphabet than that point, throw out the later half and repeat this process with the earlier half; if your word comes later, do the opposite." If you keep rigidly applying this rule, you'll eventually find the word you're looking for. This is just a simple example of recursion, but using recursion in your Bubble app can unlock some powerful behaviors.
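For readers who prefer to see it in code, the dictionary rule can be sketched in Python. This is only an illustration of recursion in general, not something you would write inside Bubble; the function name `find_word` and the sample word list are made up for the example.

```python
def find_word(words, target, lo=0, hi=None):
    """Recursively search a sorted list of words.

    Returns the index of target, or -1 if it is not in the list.
    """
    if hi is None:
        hi = len(words)
    if lo >= hi:
        # Nothing left to search: the word is not in the dictionary.
        return -1
    mid = (lo + hi) // 2  # "open the dictionary at the halfway point"
    if words[mid] == target:
        return mid
    if target < words[mid]:
        # Word is earlier in the alphabet: repeat with the earlier half.
        return find_word(words, target, lo, mid)
    # Word is later in the alphabet: repeat with the later half.
    return find_word(words, target, mid + 1, hi)


# Hypothetical sample data; the list must already be sorted alphabetically.
dictionary = ["apple", "banana", "cherry", "grape", "mango", "peach", "plum"]
print(find_word(dictionary, "mango"))  # 4
print(find_word(dictionary, "kiwi"))   # -1
```

The important part is that `find_word` calls itself on a smaller piece of the problem each time, and stops when there is nothing left to search.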
Normal workflows can be used to process data (e.g. "Make changes to a list of things"), and there's also the option to "Schedule API Workflow on a list" if you want to process data later. However, if you're processing a lot of data, either of these options can time out if the changes take too long to apply.
A better way to approach this would be to break the data processing into small chunks and have Bubble apply the changes to one small chunk of it, then apply the changes to the next small chunk, then the next small chunk, etc. This is recursive because after Bubble applies the change to one chunk, it tells itself to apply the same change but just on the next chunk of data at a later point in time; it's a workflow calling itself.
How does the workflow know what's defined as the next "chunk" of data to work on? That is logic you'll have to supply. For example, maybe you want to update a field of data on all Cars in your database, but you have hundreds of thousands of Cars, so doing that update in one go will probably time out. You could create a field like "data updated", and create an API Workflow that applies the change to a search of Cars where "data updated" is "no" and only does it for the first 500 entries. As part of the update, it sets "data updated" to "yes", and then schedules the same API Workflow to run again 5 seconds later. When it runs again, it won't pick up the entries where "data updated" was already changed to "yes", so as the workflow runs again and again, it'll work through the list of all Cars.
(For more technical users: this feature is one way to create "for" loops over bigger lists or "do...while..." loops.)
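The Cars example can be modeled in a few lines of Python to make the loop structure visible. This is a rough simulation under stated assumptions, not Bubble code: the `scheduler` queue stands in for Bubble's scheduler, the 5 second delay is left out, and the field names and the `status` change are hypothetical.

```python
from collections import deque

CHUNK_SIZE = 500  # "only does it for the first 500 entries"


def update_cars_chunk(cars, scheduler, log):
    """One run of the workflow: update one chunk, then schedule itself."""
    # Search for Cars where "data updated" is "no", limited to one chunk.
    pending = [car for car in cars if not car["data_updated"]][:CHUNK_SIZE]
    if not pending:
        # Nothing left to process, so do not schedule another run.
        log.append("done")
        return
    for car in pending:
        car["status"] = "reviewed"   # hypothetical change being applied
        car["data_updated"] = True   # mark so the next run skips this Car
    log.append(len(pending))
    scheduler.append(update_cars_chunk)  # the workflow schedules itself


def run(cars):
    """Stand-in for the scheduler: run queued workflows one at a time."""
    scheduler = deque([update_cars_chunk])
    log = []
    while scheduler:
        task = scheduler.popleft()
        task(cars, scheduler, log)
    return log


cars = [{"id": i, "data_updated": False} for i in range(1200)]
print(run(cars))  # [500, 500, 200, 'done']
```

Each run only ever touches one chunk, and the check for an empty search result is what ends the loop once every Car has been marked.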
Here's a quick table comparing "Schedule API Workflow on a list" vs. this tactic of scheduling the same API Workflow recursively:
| Schedule API Workflow on a list | Scheduling recursively |
|---|---|
| Workflows may overlap | Workflows run one at a time |
| No way of knowing when all workflows are done | Workflow can check if there is more data to process, and do something else if there isn't |
| Limited by the time / capacity it takes to search for the list and schedule all the items (max 5 minutes) | Can continue indefinitely |
| Faster and simpler for short lists (< 50-100 items) | More reliable for long lists |
| Can burn through a lot of capacity if workflows run in parallel | Capacity usage is limited to one workflow at a time |
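To illustrate the capacity row of the comparison, here is a toy Python model of the two approaches. It is only a sketch: it does not reflect how Bubble's scheduler is actually implemented, and both function names are invented. It simply counts how many runs are waiting in the queue at the busiest moment.

```python
def schedule_on_a_list(items):
    """Queue one run per item up front; return (peak queue size, processed)."""
    queue = list(items)   # every run is scheduled at once
    peak = len(queue)
    processed = []
    while queue:
        processed.append(queue.pop(0))
    return peak, processed


def schedule_recursively(items):
    """Queue only the next run each time; return (peak queue size, processed)."""
    queue = [0] if items else []   # position of the next item to process
    peak = len(queue)
    processed = []
    while queue:
        i = queue.pop(0)
        processed.append(items[i])
        if i + 1 < len(items):
            queue.append(i + 1)    # the run schedules its own successor
        peak = max(peak, len(queue))
    return peak, processed


items = ["a", "b", "c", "d", "e"]
print(schedule_on_a_list(items)[0])    # 5
print(schedule_recursively(items)[0])  # 1
```

Both approaches process the same items, but in this model the recursive version never has more than one run outstanding, which is why its capacity usage stays bounded.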
- Start with something like a 5 second gap in between repeat instances of the scheduled workflow, so that you can test out the impact of this gap on your app's capacity (if things look OK, you can always try shorter gaps)
- Made a mistake and now you think you're stuck with an infinite loop? There are two ways out of it:
  - Manually delete the next scheduled run in Logs > Scheduler
  - Modify or delete the API Workflow itself; the next time it runs, it will use the updated (or non-existent) workflow
- You can pause all scheduled workflows for your app in Logs > Scheduler > "Pause tasks"