From coding to orchestrating
Most teams today have AI writing code while humans review it line by line. The hardest transition is moving beyond that: replacing ad-hoc human review with structured, repeatable verification that you actually trust. Dan Shapiro’s five-level framework describes this progression well.

What makes it work
The dark factory isn’t a single tool or practice. It’s a set of capabilities that compound:

- **Declarative workflows over imperative prompts.** When the process is a version-controlled graph, not a chat transcript, you can review, iterate, and share it like any other source file. The workflow itself becomes the specification of how work gets done.
- **Deterministic verification over human review.** Test suites, linters, type checkers, and LLM-as-judge evaluations replace line-by-line code review. Failures route back to fix loops automatically. Humans define the criteria; the system enforces them.
- **Multi-model ensembles over single-model dependence.** Using different models for implementation and verification breaks the circularity problem, where the builder and inspector share the same blind spots. Cross-critique with fresh eyes catches what self-review misses.
- **Checkpointed execution over black-box runs.** Git commits after every stage create an audit trail. When something goes wrong, you can inspect, revert, or fork from any point without having watched the run live.
- **Continuous improvement over static processes.** Automatic retrospectives after every run feed a learning loop. Workflows get better over time, not just the code they produce.

The human role in a dark factory
The dark factory doesn’t eliminate engineering judgment. It redirects it:

| Before | After |
|---|---|
| Writing code | Defining workflows and prompts |
| Reviewing diffs | Defining verification criteria |
| Debugging test failures | Designing fix loops |
| Watching agent sessions | Reviewing retrospectives |
| Manual quality checks | Tuning goal gates and evals |
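The capabilities above can be sketched in a few lines. This is a hypothetical illustration, not any real tool’s API: `Stage`, `run_workflow`, and the toy verifiers are invented names standing in for a declarative workflow graph with deterministic gates, automatic fix loops, and a per-stage checkpoint log (a stand-in for a git commit after every stage).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[str], str]        # transforms the artifact
    verify: Callable[[str], bool]    # deterministic gate, not human review
    max_fix_attempts: int = 2        # failures route back automatically

def run_workflow(stages: list[Stage], artifact: str) -> tuple[str, list[str]]:
    """Run each stage, retrying until its gate passes or attempts run out.
    Returns the final artifact plus a checkpoint log, the audit trail a
    real system would record as one git commit per stage."""
    checkpoints: list[str] = []
    for stage in stages:
        for attempt in range(1 + stage.max_fix_attempts):
            candidate = stage.run(artifact)
            if stage.verify(candidate):       # humans defined this criterion;
                artifact = candidate          # the system enforces it
                checkpoints.append(f"checkpoint: {stage.name} (attempt {attempt + 1})")
                break
        else:
            raise RuntimeError(f"stage {stage.name!r} failed its verification gate")
    return artifact, checkpoints

# Usage: a toy two-stage pipeline whose gates are simple predicates.
stages = [
    Stage("implement", run=lambda a: a + " impl", verify=lambda a: "impl" in a),
    Stage("test", run=lambda a: a + " tested", verify=lambda a: "tested" in a),
]
result, log = run_workflow(stages, "spec")
print(result)  # spec impl tested
```

Because the workflow is plain data rather than a chat transcript, it can be diffed, reviewed, and version-controlled like any other source file, and a second model could be slotted in as a `verify` callable to get the cross-critique the ensemble approach describes.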