Concurrency limiter
Set `max_concurrent_runs` in the server config to control how many runs execute simultaneously. Additional runs queue automatically, with `queued` and `starting` states visible in the UI and API.
Previously, starting too many runs at once could overwhelm the machine. Now excess runs wait in a queue and start as capacity frees up.
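As a sketch, this is how the setting might appear in `server.toml`; the key name comes from this changelog, but the section placement and value are assumptions:

```toml
# Hypothetical server.toml fragment: cap simultaneous runs at 4.
# Runs beyond the cap enter the queue and start as slots free up.
[server]
max_concurrent_runs = 4
```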
SSH access to running sandboxes
Use `--ssh` to get SSH access into running Daytona sandboxes for live debugging while the workflow executes. When something goes wrong mid-run, you can drop into the sandbox, inspect the filesystem, and understand the problem without waiting for the run to finish.
Pass `--preserve-sandbox` to keep sandboxes alive after a run completes for post-mortem inspection.
Timer nodes
Add `wait.timer` nodes to workflows to pause execution for a configured duration. Useful for rate limiting between API calls or waiting for external processes to complete.
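A minimal sketch of how such a node might sit between two steps, assuming a YAML workflow definition; the surrounding schema, node names, and the `duration` field are illustrative assumptions, not the project's actual format:

```yaml
# Hypothetical workflow fragment: a wait.timer node between two API calls.
nodes:
  - id: fetch_page_1
    type: api.call
  - id: cooldown
    type: wait.timer
    duration: 30s        # pause 30 seconds before the next node runs
  - id: fetch_page_2
    type: api.call
```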
More
API
- All list endpoints now support offset-based pagination with `offset` and `limit` parameters
- New `GET /usage` endpoint returns aggregate token counts and estimated costs broken down by model
- Hot config reload: changes to `server.toml` are detected and applied without restarting the server
- All error responses now use a consistent JSON structure with error codes and messages
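The offset-based pagination above can be sketched as a generic client-side loop. `fetch_page` here is a hypothetical stand-in for whatever function calls a list endpoint with `offset` and `limit`; the response shape is an assumption:

```python
from typing import Any, Callable, Iterator, List

def iter_all(fetch_page: Callable[[int, int], List[Any]],
             limit: int = 100) -> Iterator[Any]:
    """Yield every item from a list endpoint that takes offset/limit.

    fetch_page(offset, limit) returns the items in that window;
    an empty or short page signals the end of the collection.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:   # short page: nothing left to fetch
            return
        offset += limit

# Usage with a fake in-memory "endpoint" standing in for the real API:
data = list(range(250))
fake_fetch = lambda offset, limit: data[offset:offset + limit]
assert list(iter_all(fake_fetch, limit=100)) == data
```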
Fixes
- Fixed HTTP 529 (Overloaded) responses from LLM providers being misclassified as non-retryable
- Fixed progress display panic when tool calls contain long whitespace sequences
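A minimal sketch of the kind of classification the 529 fix implies; the helper name and status set are illustrative, not the project's actual code:

```python
# Hypothetical helper: decide whether an HTTP status from an LLM
# provider is worth retrying. 529 ("Overloaded", used by some
# providers) is transient, so it belongs with 429 and the 5xx family.
RETRYABLE_STATUSES = {429, 500, 502, 503, 529}

def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES

assert is_retryable(529)       # previously misclassified as fatal
assert not is_retryable(400)   # client errors are not retried
```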
Improvements
- Gemini 3.1 Flash Lite added to the model catalog for ultra-fast, low-cost tasks
- Multi-provider ensemble demo workflow showing fan-out across models and result merging
- Parallel branches are now shown in the progress UI, and sandbox details with hyperlinks are displayed during runs