Experimental feature. Retros are disabled by default. Enable them by setting `retros = true` under `[run.execution]` in your project or workflow config.
After every workflow run, Fabro can generate a retro — a structured retrospective that captures what happened, what went well, and what didn’t. Retros combine deterministic metrics extracted from the run’s checkpoint with a qualitative narrative produced by an LLM agent that analyzes the full event stream. The goal is continuous improvement. Retros give you a searchable history of how your workflows perform over time, surface friction patterns that would otherwise go unnoticed, and identify follow-up work before it falls through the cracks.
*Fabro web UI: Retros list showing runs with Smooth, Bumpy, Effortless, and Struggled ratings.*

What’s in a retro

A retro has two layers: quantitative stats derived cheaply from checkpoint data, and an agent-generated narrative that interprets the run holistically.

Quantitative layer

The quantitative layer is extracted directly from the checkpoint and event stream — no LLM calls required:
| Field | Description |
| --- | --- |
| Per-stage breakdown | Duration, retry count, cost, files touched, status, and failure reason for each stage |
| Aggregate stats | Total duration, total cost, total retries, all files touched, stages completed vs. failed |
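As a rough sketch of how the aggregate stats can fall out of the per-stage breakdown with plain arithmetic, here is a minimal Python illustration. The field names and shapes are assumptions for illustration, not Fabro's actual checkpoint schema:

```python
from dataclasses import dataclass


# Hypothetical per-stage record -- the real checkpoint schema is not
# documented here, so these field names are illustrative only.
@dataclass
class StageStats:
    stage_id: str
    duration_s: float
    retries: int
    cost_usd: float
    files_touched: set[str]
    failed: bool


def aggregate(stages: list[StageStats]) -> dict:
    """Roll per-stage stats up into the aggregate layer -- purely
    deterministic, no LLM calls needed."""
    return {
        "total_duration_s": sum(s.duration_s for s in stages),
        "total_cost_usd": round(sum(s.cost_usd for s in stages), 4),
        "total_retries": sum(s.retries for s in stages),
        "files_touched": sorted(set().union(*(s.files_touched for s in stages))),
        "completed": sum(not s.failed for s in stages),
        "failed": sum(s.failed for s in stages),
    }
```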

Narrative layer

An LLM agent reads the run’s full event stream and produces a structured analysis:
| Field | Description |
| --- | --- |
| Smoothness | Overall rating on a 5-point scale (see below) |
| Intent | What the run was trying to accomplish |
| Outcome | What actually happened |
| Learnings | What was discovered about the repo, code, workflow, or tools |
| Friction points | Where things got stuck and why |
| Open items | Follow-up work, tech debt, test gaps, or investigations identified |
The agent has tool access to grep and read the event stream, so it can inspect actual tool call patterns, error messages, and approach pivots — not just pass/fail signals.
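The narrative fields above can be pictured as a typed record. This is a sketch of the table as a Python type, not Fabro's actual schema; the snake_case field names are assumptions:

```python
from typing import Literal, TypedDict

# The five ratings come from the smoothness scale documented below.
Smoothness = Literal["Effortless", "Smooth", "Bumpy", "Struggled", "Failed"]


class Narrative(TypedDict):
    """Illustrative shape for the agent-generated narrative layer."""
    smoothness: Smoothness
    intent: str
    outcome: str
    learnings: list[str]
    friction_points: list[str]
    open_items: list[str]


def check_smoothness(value: str) -> str:
    """Reject anything outside the 5-point scale."""
    allowed = {"Effortless", "Smooth", "Bumpy", "Struggled", "Failed"}
    if value not in allowed:
        raise ValueError(f"invalid smoothness rating: {value!r}")
    return value
```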

Smoothness ratings

Every retro includes a smoothness rating that grades the overall quality of the run’s execution:
| Rating | Meaning |
| --- | --- |
| Effortless | Goal achieved on the first try. No retries, no wrong approaches. Agent moved efficiently from start to finish. |
| Smooth | Goal achieved with minor hiccups — 1–2 retries or a brief wrong approach quickly corrected. No human intervention needed. |
| Bumpy | Goal achieved but with notable friction: multiple retries, at least one significant wrong approach, or substantial time on dead ends. |
| Struggled | Goal achieved only with difficulty: many retries, major approach changes, human intervention, or partial failures requiring recovery. |
| Failed | Run did not achieve its stated goal. Some stages may have completed, but the overall intent was not fulfilled. |
The rating considers the full context visible in agent events — tool call patterns, error recovery sequences, approach pivots — not just stage pass/fail counts.
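To make the rubric concrete, here is a deliberately crude, deterministic approximation of it. This is not how Fabro rates runs — the real rating is produced by an LLM reading the full event stream — and the thresholds here are invented for illustration:

```python
def rough_smoothness(goal_met: bool, retries: int,
                     wrong_approaches: int,
                     human_interventions: int) -> str:
    """Crude first-pass approximation of the rating rubric.
    It cannot tell a 'brief wrong approach quickly corrected' (Smooth)
    from a significant one (Bumpy), which is exactly the kind of
    context the event-stream-reading agent adds."""
    if not goal_met:
        return "Failed"
    if human_interventions > 0 or retries > 5 or wrong_approaches > 1:
        return "Struggled"
    if retries > 2 or wrong_approaches == 1:
        return "Bumpy"
    if retries > 0:
        return "Smooth"
    return "Effortless"
```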

Learnings, friction points, and open items

Learnings

Learnings capture what was discovered during the run, categorized by type:
| Category | Examples |
| --- | --- |
| `repo` | Repository structure, build system quirks, CI configuration |
| `code` | Bug root causes, module boundaries, API contracts |
| `workflow` | Node ordering issues, missing stages, prompt improvements |
| `tool` | Tool limitations, MCP server behavior, command output parsing |

Friction points

Friction points identify where the run got stuck and what caused the slowdown:
| Kind | Description |
| --- | --- |
| `retry` | A stage needed multiple attempts |
| `timeout` | A stage or tool call hit a time limit |
| `wrong_approach` | The agent pursued a dead end before pivoting |
| `tool_failure` | A tool or command failed unexpectedly |
| `ambiguity` | Unclear requirements or conflicting signals caused confusion |

Each friction point can optionally reference the `stage_id` where it occurred.

Open items

Open items capture follow-up work identified during the run:
| Kind | Description |
| --- | --- |
| `tech_debt` | Code quality issues worth addressing later |
| `follow_up` | Work that’s related but out of scope for this run |
| `investigation` | Unknowns that need further research |
| `test_gap` | Missing test coverage discovered during the run |

How retros are generated

Retro generation happens in two phases after a run completes:
  1. Derive — Fabro extracts stage durations from durable run events and builds a retro from the checkpoint data. This phase is deterministic, fast, and produces the quantitative layer.
  2. Narrate — An LLM agent session analyzes the run data. The agent receives `progress.jsonl`, `run.json`, `graph.fabro`, and per-stage files under `stages/{node_id}@{visit}/...` inside its sandbox, so it can grep and read the event stream, run snapshot, workflow source, and full stage payloads. The narrative fields are merged back into durable retro state.
Both phases run automatically at the end of every CLI run. The API server derives the quantitative layer but does not currently run the narrative agent.
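The two-phase control flow can be sketched as follows. Function names, payload keys, and the `narrate` callback are assumptions for illustration, not Fabro's internal API:

```python
def generate_retro(checkpoint: dict, event_stream: list[dict], narrate) -> dict:
    """Hypothetical two-phase retro pipeline."""
    # Phase 1: derive -- deterministic quantitative layer built
    # directly from checkpoint data.
    retro = {
        "stages": checkpoint["stages"],
        "total_retries": sum(s["retries"] for s in checkpoint["stages"]),
    }
    # Phase 2: narrate -- an agent analyzes the event stream; its
    # structured output (smoothness, intent, learnings, ...) is merged
    # back into the retro. The API server skips this phase.
    retro.update(narrate(event_stream))
    return retro
```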
*Fabro web UI: run retro showing Smooth rating, duration, cost, learnings, and follow-up items.*

Accessing retros

CLI

To enable retros for your project, set `retros = true` under `[run.execution]` in your `.fabro/project.toml`:

```toml
_version = 1

[run.execution]
retros = true
```
To skip retro generation for a single run when retros are enabled, pass `--no-retro`:

```shell
fabro run workflow.fabro --no-retro
```
Retros can also be enabled server-wide in `settings.toml`:

```toml
_version = 1

[run.execution]
retros = true
```

API

Retros are also available via the REST API. See the list retros and retrieve retro API reference pages.

Storage

Retros are stored in durable run state. If you need files on disk, `fabro dump` materializes retro text under `stages/retro/` alongside `run.json`, stage files, and the rest of the exported run data.
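A small sketch of locating dumped retro text from a script. Only the `stages/retro/` location comes from the docs; the exact file names inside it are not specified, so this just lists whatever is there:

```python
from pathlib import Path


def find_retro_files(dump_dir: str) -> list[Path]:
    """List retro text files in an exported run directory produced by
    `fabro dump`. Returns an empty list if no retro was generated."""
    retro_dir = Path(dump_dir) / "stages" / "retro"
    return sorted(retro_dir.glob("*")) if retro_dir.is_dir() else []
```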