Concurrency limiter
Set `max_concurrent_runs` in the server config to control how many runs execute simultaneously. Additional runs queue automatically, with `queued` and `starting` states visible in the UI and API.
Previously, starting too many runs at once could overwhelm the machine. Now excess runs wait in a queue and start as capacity frees up.
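As a sketch, this is how the setting might appear in `server.toml`; the key name comes from this changelog, but the section placement and value are assumptions:

```toml
# Hypothetical server.toml fragment: cap simultaneous runs at 4.
# Runs beyond the cap enter the queue and start as slots free up.
[server]
max_concurrent_runs = 4
```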
SSH access to running sandboxes
Use `--ssh` to get SSH access into running Daytona sandboxes for live debugging while the workflow executes. When something goes wrong mid-run, you can drop into the sandbox, inspect the filesystem, and understand the problem without waiting for the run to finish.
Pass `--preserve-sandbox` to keep sandboxes alive after a run completes for post-mortem inspection.
Timer nodes
Add `wait.timer` nodes to workflows to pause execution for a configured duration. Useful for rate limiting between API calls or waiting for external processes to complete.
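A minimal sketch of how such a node might sit between two steps, assuming a YAML workflow definition; the surrounding schema, node names, and the `duration` field are illustrative assumptions, not the project's actual format:

```yaml
# Hypothetical workflow fragment: a wait.timer node between two API calls.
nodes:
  - id: fetch_page_1
    type: api.call
  - id: cooldown
    type: wait.timer
    duration: 30s        # pause 30 seconds before the next node runs
  - id: fetch_page_2
    type: api.call
```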
More
API
- All list endpoints now support offset-based pagination with `offset` and `limit` parameters
- New `GET /usage` endpoint returns aggregate token counts and estimated costs broken down by model
- Hot config reload: changes to `server.toml` are detected and applied without restarting the server
- All error responses now use a consistent JSON structure with error codes and messages
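The offset-based pagination above can be sketched as a generic client-side loop. `fetch_page` here is a hypothetical stand-in for whatever function calls a list endpoint with `offset` and `limit`; the response shape is an assumption:

```python
from typing import Any, Callable, Iterator, List

def iter_all(fetch_page: Callable[[int, int], List[Any]],
             limit: int = 100) -> Iterator[Any]:
    """Yield every item from a list endpoint that takes offset/limit.

    fetch_page(offset, limit) returns the items in that window;
    an empty or short page signals the end of the collection.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:   # short page: nothing left to fetch
            return
        offset += limit

# Usage with a fake in-memory "endpoint" standing in for the real API:
data = list(range(250))
fake_fetch = lambda offset, limit: data[offset:offset + limit]
assert list(iter_all(fake_fetch, limit=100)) == data
```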
Fixes
- Fixed HTTP 529 (Overloaded) responses from LLM providers being misclassified as non-retryable
- Fixed progress display panic when tool calls contain long whitespace sequences
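A minimal sketch of the kind of classification the 529 fix implies; the helper name and status set are illustrative, not the project's actual code:

```python
# Hypothetical helper: decide whether an HTTP status from an LLM
# provider is worth retrying. 529 ("Overloaded", used by some
# providers) is transient, so it belongs with 429 and the 5xx family.
RETRYABLE_STATUSES = {429, 500, 502, 503, 529}

def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES

assert is_retryable(529)       # previously misclassified as fatal
assert not is_retryable(400)   # client errors are not retried
```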
Improvements
- Gemini 3.1 Flash Lite added to the model catalog for ultra-fast, low-cost tasks
- Multi-provider ensemble demo workflow showing fan-out across models and result merging
- Parallel branches are now shown in the progress UI, and sandbox details with hyperlinks are displayed during runs