How context flows
The context starts empty at the beginning of a run. As each stage completes, its outcome includes a set of context updates — key-value pairs that are merged into the shared context. Later stages see all updates from earlier stages.
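The merge behavior can be sketched in a few lines of Python (a simplified model, not Fabro's actual code):

```python
# Simplified model of context flow: each completed stage's updates are
# merged into one shared dict, with later writes overwriting earlier ones.
context: dict[str, str] = {}  # starts empty at the beginning of a run

stage_outcomes = [
    ("plan", {"last_stage": "plan", "approach": "tdd"}),
    ("test", {"last_stage": "test", "tests_passed": "false"}),
]

for node_id, updates in stage_outcomes:
    context.update(updates)  # later stages see (and may overwrite) earlier keys

# After both stages, the most recent write wins for shared keys,
# while keys set only by earlier stages remain visible.
```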
How agents access context
Agents do not have a tool to query the context store directly. Instead, context is made available through two mechanisms:
- Preamble injection — When a node starts, Fabro assembles a preamble from the current context and prepends it to the node's prompt. The fidelity setting controls how detailed this preamble is.
- Context updates — Agents can write to the context by including a `context_updates` object in a JSON response. See Transitions for the response format.
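Assuming the JSON response format described in Transitions, a response that writes context keys might look like this (every field other than `context_updates` is illustrative, not part of a confirmed schema):

```json
{
  "summary": "Ran the test suite and recorded the result.",
  "context_updates": {
    "tests_passed": "false",
    "failing_test": "test_login"
  }
}
```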
Keys set by handlers
Each handler type writes specific keys into the context after execution:

Agent and prompt nodes
| Key | Value |
|---|---|
last_stage | The node ID of the stage that just completed |
last_response | Truncated LLM response (first 200 characters) |
response.{node_id} | Full LLM response text |
Agents can also set arbitrary keys via the `context_updates` field in their response. See Transitions.
Command nodes
| Key | Value |
|---|---|
command.output | The command’s stdout |
command.stderr | The command’s stderr |
Human gates
| Key | Value |
|---|---|
human.gate.selected | The accelerator key (e.g. "A") or "freeform" |
human.gate.label | The full label of the selected edge |
human.gate.text | The user’s freeform text (if applicable) |
Parallel merge (fan-in)
| Key | Value |
|---|---|
parallel.fan_in.best_id | Node ID of the best-performing branch |
parallel.fan_in.best_outcome | Status of the best branch |
parallel.fan_in.best_head_sha | Git SHA from the best branch (if applicable) |
Engine-managed keys
The engine sets several keys automatically. Those prefixed with `internal.` are excluded from preambles:
| Key | Value |
|---|---|
internal.run_id | Unique identifier for this run |
internal.work_dir | Working directory path |
internal.fidelity | The resolved fidelity mode for the current node |
internal.thread_id | Thread ID for shared-conversation nodes (or null) |
internal.node_visit_count | How many times the current node has been visited |
internal.retry_count.{node_id} | Number of retry attempts used by a node |
outcome | Status of the last completed stage (success, fail, etc.) |
failure_class | Classification of the last failure (if any) |
failure_signature | Deduplication signature for the last failure |
preferred_label | Label selected by a human gate or agent routing directive |
current_node | ID of the node currently executing |
graph.goal | The workflow’s goal attribute |
graph.{attr} | All graph-level attributes, mirrored into context |
Using context in conditions
Edge conditions can read context values to route execution. The `context.` prefix is optional: `tests_passed=true` and `context.tests_passed=true` are equivalent.
Engine-managed keys like internal.node_visit_count work in conditions too. This is useful for fixed-count loops:
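A minimal sketch of how such conditions could be evaluated, assuming simple `key=value` conditions over string-valued context (the real condition grammar may be richer):

```python
# Assumed semantics: compare a context value against a literal, with an
# optional "context." prefix on the key, per the docs above.
def evaluate(condition: str, context: dict[str, str]) -> bool:
    key, _, expected = condition.partition("=")
    key = key.removeprefix("context.")  # the prefix is optional
    return context.get(key) == expected

ctx = {"tests_passed": "true", "internal.node_visit_count": "3"}

# Both spellings are equivalent:
assert evaluate("tests_passed=true", ctx)
assert evaluate("context.tests_passed=true", ctx)

# Engine-managed keys work too, e.g. as a fixed-count loop guard:
assert evaluate("internal.node_visit_count=3", ctx)
```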
Fidelity: Controlling agent context
When a new agent or prompt node starts, Fabro assembles a preamble — a summary of what happened in prior stages. The fidelity setting controls how detailed this preamble is.

| Fidelity | Behavior |
|---|---|
full | No preamble. The agent continues in the same conversation thread, seeing complete prior context. |
compact | Nested-bullet summary with handler-specific details (default). |
summary:high | Detailed per-stage Markdown report. |
summary:medium | Moderate detail with outcomes and notable findings (~1500 token target). |
summary:low | Brief summary with just outcomes per stage (~600 token target). |
truncate | Minimal — only the goal and run ID. |
Setting fidelity
Fidelity can be set at three levels. The first match wins:
- Edge attribute — Set `fidelity` on an edge to control the transition into a specific node.
- Node attribute — Set `fidelity` on a node to control all transitions into it.
- Graph default — Set `default_fidelity` on the graph for a run-wide default.

If no fidelity is set at any level, the default is `compact`.
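The first-match-wins precedence can be modeled as a short lookup; a sketch in Python, with plain dicts standing in for the edge, node, and graph attribute sets:

```python
# Simplified model of fidelity resolution: edge attribute beats node
# attribute, which beats the graph default, which beats the built-in
# "compact" fallback.
def resolve_fidelity(edge_attrs: dict, node_attrs: dict, graph_attrs: dict) -> str:
    return (
        edge_attrs.get("fidelity")
        or node_attrs.get("fidelity")
        or graph_attrs.get("default_fidelity")
        or "compact"  # built-in default when nothing is set
    )

# Edge-level setting wins over everything else:
assert resolve_fidelity({"fidelity": "full"}, {"fidelity": "truncate"}, {}) == "full"
# Graph default applies when edge and node are silent:
assert resolve_fidelity({}, {}, {"default_fidelity": "summary:low"}) == "summary:low"
# Nothing set anywhere falls back to compact:
assert resolve_fidelity({}, {}, {}) == "compact"
```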
Full fidelity and threads
full fidelity is typically used with thread_id to create a shared conversation across multiple nodes. Nodes with the same thread_id share a single LLM session, preserving full context continuity:
Fidelity on resume
When a run is resumed from a checkpoint, the first node after resume degrades `full` fidelity to `summary:high`. This prevents the resumed node from expecting a conversation thread that no longer exists in memory.
Preamble construction
For non-full fidelity modes, Fabro builds a preamble from runtime data and prepends it to the node’s prompt.
Example compact preamble

Given a plan-test-implement workflow where the plan and test stages have completed, the implement node would receive a preamble prepended to its prompt. The preamble includes:
- The workflow goal
- A summary of completed stages (format depends on fidelity level)
- Handler-specific details:
- Command nodes — the script that ran, stdout, and stderr
- Agent/prompt nodes — the model used, token counts, and files touched
- Non-internal context values that weren’t already rendered inline
Keys with reserved prefixes (`internal.`, `current`, `graph.`, `thread.`, `response.`) are excluded from preambles to avoid noise.
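The exclusion rule can be sketched as a simple prefix check (a simplified model of the filtering, not the engine's code):

```python
# Reserved prefixes never reach preambles; everything else is eligible.
EXCLUDED_PREFIXES = ("internal.", "current", "graph.", "thread.", "response.")

def preamble_keys(context: dict[str, str]) -> list[str]:
    return [k for k in context if not k.startswith(EXCLUDED_PREFIXES)]

ctx = {
    "tests_passed": "true",        # eligible
    "internal.run_id": "r1",       # excluded: internal.
    "current_node": "implement",   # excluded: current
    "response.plan": "long text",  # excluded: response.
}
assert preamble_keys(ctx) == ["tests_passed"]
```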
Artifact offloading
When a stage produces a large output (over 100KB of serialized JSON), Fabro automatically offloads it to the artifact store on disk rather than keeping it in the in-memory context. The context value is replaced with a `file://` pointer. Artifacts are written to `{working_directory}/.fabro/artifacts/` so agents can read them.
This keeps the context lean — large LLM responses, test output, and file listings don’t bloat checkpoint files or overwhelm preamble summaries.
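A sketch of the offloading decision, using the 100KB threshold from above; the file naming scheme and directory handling here are assumptions, not Fabro's actual layout:

```python
import hashlib
import json
import pathlib
import tempfile

THRESHOLD = 100 * 1024  # 100KB of serialized JSON, per the docs

def maybe_offload(value, artifacts_dir: pathlib.Path):
    """Return the value unchanged if small, else write it to disk and
    return a file:// pointer that replaces it in the context."""
    payload = json.dumps(value)
    if len(payload.encode()) <= THRESHOLD:
        return value  # small enough to keep in memory
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(payload.encode()).hexdigest()[:16] + ".json"
    path = artifacts_dir / name
    path.write_text(payload)
    return f"file://{path}"

art = pathlib.Path(tempfile.mkdtemp()) / ".fabro" / "artifacts"
pointer = maybe_offload("x" * (200 * 1024), art)   # large -> offloaded
assert pointer.startswith("file://")
assert maybe_offload("test", art) == "test"        # small -> kept inline
```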
Context compaction
During long-running agent sessions, the conversation history can grow large enough to exceed the LLM's context window. Context compaction automatically summarizes older turns and replaces them with a structured summary, keeping the session running without manual intervention. Compaction is always enabled and runs with hardcoded defaults; there are no user-facing configuration options.

When it triggers
After every assistant turn, Fabro estimates the total token usage of the system prompt and conversation history (using a rough heuristic of 1 token per 4 characters). If the estimate exceeds 80% of the model's context window, compaction runs.

How it works
- Split history — The conversation is divided into older turns (to be summarized) and the most recent 6 turns (preserved verbatim).
- Render old turns — Older turns are serialized to a human-readable text format. Tool call arguments and tool results are truncated to 500 characters each.
- Summarize via LLM — Fabro makes a non-streaming LLM call (using the same model and provider as the agent session) with a structured summarization prompt. The summary uses these sections:
- Goal — what the user asked for
- Progress — what was accomplished, with file paths and key decisions
- Key Decisions — important choices and their rationale
- Failed Approaches — what was tried and didn’t work
- Open Issues — remaining bugs, edge cases, or TODOs
- Next Steps — what should happen next
- File Operations — if the agent has tracked file modifications, this section is copied verbatim into the summary
- Replace history — All old turns are removed and replaced with a single [Context Summary] system message containing the structured summary. The preserved recent turns remain unchanged.
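The trigger heuristic and the history split described above can be sketched as follows; the summarization LLM call is stubbed out, and the helper names are illustrative:

```python
# Sketch of the compaction bookkeeping: the 1-token-per-4-chars estimate,
# the 80% trigger, the 6-turn preserved tail, and the 500-char truncation
# of old turns. The real summary comes from an LLM call, stubbed here.
def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic from the docs

def needs_compaction(history: list[str], system_prompt: str, window: int) -> bool:
    total = estimate_tokens(system_prompt) + sum(estimate_tokens(t) for t in history)
    return total > 0.8 * window

def split_history(history: list[str], keep: int = 6):
    """Older turns to summarize, plus the most recent `keep` kept verbatim."""
    return history[:-keep], history[-keep:]

def render_old_turn(turn: str) -> str:
    return turn[:500]  # tool args/results truncated to 500 chars each

history = [f"turn {i}: " + "x" * 1000 for i in range(10)]
assert needs_compaction(history, "system prompt", window=3000)

old, recent = split_history(history)
summary = "[Context Summary]\n" + "\n".join(render_old_turn(t) for t in old)
history = [summary] + recent  # old turns replaced by one summary message
assert len(history) == 7      # 1 summary + 6 preserved turns
```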
Failure behavior
Compaction failures are non-fatal. If the summarization LLM call fails, Fabro emits an Agent.Error event and the session continues with the original uncompacted history. The next assistant turn will trigger another compaction attempt.
Events
Compaction emits three events to the event stream:

| Event | When | Key fields |
|---|---|---|
Agent.ContextWindowWarning | Token estimate exceeds 80% of context window | estimated_tokens, context_window_size, usage_percent |
Agent.CompactionStarted | Compaction begins | estimated_tokens, context_window_size |
Agent.CompactionCompleted | Summary generated and history replaced | original_turn_count, preserved_turn_count, summary_token_estimate, tracked_file_count |