Agent-Driven Software Factory: Replace Vibe Coding With a 12-Agent Pipeline¶

Vibe coding is not a software development process.

Fire a prompt, accept the result, ship it. Fast in the same way skipping tests is fast — until something breaks and you can't explain why. The output is plausible-looking code with no validated problem, no verified requirements, no threat model, no acceptance tests, no deployment strategy. One feature. Eleven things that were supposed to happen but didn't.

The Software Factory Agent replaces that with a 12-agent pipeline. One developer runs the chain. Specialized agents handle market research, product scoping, UX design, security, implementation, testing, validation, and deployment — each in its own clean context window with only the tools it needs.

Two modes. Discovery mode (12 agents) validates the market, defines the product, and designs the UX before any engineering starts. Feature mode (9 agents) skips straight to implementation for incremental work on an existing product.
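A minimal sketch of how the two modes relate, assuming the chain is held as an ordered list. The agent names come from this post; the list representation and the `agents_for` helper are illustrative assumptions, not the repository's code.

```python
# Agent names are taken from the post; the data structure is an assumption.
DISCOVERY_AGENTS = ["Market Researcher", "Product Manager", "UX Designer"]
ENGINEERING_AGENTS = [
    "Codebase Researcher", "Story Writer", "Spec Writer", "Security Architect",
    "Backend Builder", "Frontend Builder", "Test Verifier",
    "Implementation Validator", "DevOps Engineer",
]


def agents_for(mode: str) -> list[str]:
    """Return the ordered agent chain for 'discovery' or 'feature' mode."""
    if mode == "discovery":
        # Discovery mode runs the three discovery agents, then all engineering agents.
        return DISCOVERY_AGENTS + ENGINEERING_AGENTS
    if mode == "feature":
        # Feature mode skips discovery and starts at the Codebase Researcher.
        return list(ENGINEERING_AGENTS)
    raise ValueError(f"unknown mode: {mode!r}")
```

Feature mode is the discovery chain minus its first three agents, which is why the counts are 12 and 9.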

Two Modes, One Pipeline¶

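Discovery mode: Market Researcher → Product Manager → UX Designer → Codebase Researcher → Story Writer → Spec Writer → Security Architect → Backend Builder → Frontend Builder → Test Verifier → Implementation Validator → DevOps Engineer

Feature mode: Codebase Researcher → Story Writer → Spec Writer → Security Architect → Backend Builder → Frontend Builder → Test Verifier → Implementation Validator → DevOps Engineer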

The 12 Agents¶

Discovery Agents (discovery mode only)¶

Agent 1: Market Researcher¶

Tools: Read (read-only)

Never invents data. The Market Researcher runs before scope, before design, before code. If information is missing, it surfaces open questions — it does not guess.

Six-section Market Research Report:

Problem statement — who has this problem, when it occurs, what it costs to leave unsolved
Target customer segments — primary and secondary personas; current workaround and why it falls short
Competitor landscape — direct competitors, indirect substitutes, status quo (doing nothing)
Top 3–5 use cases — actor, trigger, goal, current friction
Market risks — nice-to-have vs. must-have, size, regulation, timing
Open questions — anything blocking the Product Manager

Every downstream agent builds on this report. Neither the PM nor the UX Designer invents customer data.

Agent 2: Product Manager¶

Tools: Read (read-only)

Must Have features are capped at 5. More than 5 means the scope is too wide — the PM cuts harder.

Six-section Product Brief:

MVP scope statement — what the MVP does and explicitly does not do
Core features (MoSCoW) — Must Have / Should Have / Could Have / Won't Have in MVP
Primary user journey — acquisition through value realisation, with drop-off risks per stage
Pricing model — one recommendation with rationale and risks
Success metrics — 3–5 outcomes; at least one measures actual user value, not vanity
Open questions and blockers — what must be resolved before UX or engineering starts

Every Must Have feature must trace to a use case in the Market Research Report. If it can't, it doesn't belong in the MVP.
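The two PM rules (the cap of 5 and traceability to a use case) are mechanical enough to check. A rough sketch, assuming features are tagged with a priority label and a use-case ID; the `Feature` type, the label strings, and the ID scheme are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MUST_HAVE_CAP = 5  # the cap stated in the post


@dataclass
class Feature:
    name: str
    priority: str              # hypothetical labels: "must", "should", "could", "wont"
    use_case: Optional[str]    # hypothetical ID of a use case in the Market Research Report


def brief_violations(features, known_use_cases):
    """Return human-readable problems with the Must Have list; empty means the brief passes."""
    problems = []
    must = [f for f in features if f.priority == "must"]
    if len(must) > MUST_HAVE_CAP:
        problems.append(
            f"{len(must)} Must Have features exceed the cap of {MUST_HAVE_CAP}"
        )
    for f in must:
        if f.use_case not in known_use_cases:
            problems.append(f"Must Have '{f.name}' traces to no use case")
    return problems
```

An empty list means the scope is narrow enough and every Must Have is anchored in the research.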

Agent 3: UX Designer¶

Tools: Read (read-only)

ASCII wireframes, not descriptions. The UX Designer produces a UX Design Document the Frontend Builder implements directly — no interpretation gap between what was designed and what was built.

Six-section UX Design Document:

Page inventory — every screen: route, purpose, who sees it; includes error and empty states
Navigation structure — top-level and contextual nav as an indented hierarchy
User flows — happy path plus the two most likely error paths per core use case
ASCII wireframes — layout and element hierarchy for the 3–5 most important screens
Dashboard and report layouts — metrics shown, card layout, empty states, filter/sort behaviour
Component inventory — every reusable component with states and interaction notes

The UX Design Document feeds the Spec Writer (source of truth for frontend scope) and the Frontend Builder (direct implementation input). The UI is designed once, not improvised.

Human checkpoint 1: discovery approval. Combined review of all three discovery outputs — market problem, MVP scope, MoSCoW list, UX wireframes. Engineering doesn't start until you approve.

Targeted revision loops — no need to restart the chain:

revise market — reruns Market Researcher → PM → UX Designer
revise product — reruns PM → UX Designer only
revise ux — reruns UX Designer only
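The revision loops follow one rule: rerun from the revised agent to the end of discovery. A small sketch of that rule, assuming the commands map to a starting agent; the function name is hypothetical.

```python
DISCOVERY_CHAIN = ["Market Researcher", "Product Manager", "UX Designer"]

# Command strings are from the post; the mapping itself is an illustrative assumption.
REVISE_START = {
    "revise market": "Market Researcher",
    "revise product": "Product Manager",
    "revise ux": "UX Designer",
}


def agents_to_rerun(command: str) -> list[str]:
    """Return the discovery agents that rerun for a revise command, in order."""
    start = REVISE_START[command.strip().lower()]
    # Everything downstream of the revised agent depends on its output, so it reruns too.
    return DISCOVERY_CHAIN[DISCOVERY_CHAIN.index(start):]
```

Revising the market invalidates everything after it; revising the UX invalidates nothing upstream.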

Engineering Agents (both modes)¶

Agent 4: Codebase Researcher¶

Tools: Read, Grep, Glob (read-only)

No guessing. Runs first in feature mode; runs after discovery approval in discovery mode. In discovery mode it uses the PM's refined scope to focus the research.

Produces:

Relevant files and patterns for the feature
Existing code that does similar things — so builders don't reinvent
Risk flags: multi-tenancy, timezone handling, auth boundaries, retry behaviour
Tests that need updating once the feature ships

Feeds directly into the Story Writer and Spec Writer.

Agent 5: Story Writer¶

Tools: Read (read-only)

Turns the Codebase Researcher's findings into a precise user story the whole chain builds against. In discovery mode, the Product Brief and UX Design Document are the scope authority — no invented requirements.

Output:

User story — As a [role], I want [capability] so that [outcome]
Acceptance criteria — each binary: pass or fail
Edge cases — empty inputs, concurrent requests, permission failures
Out-of-scope items — explicit list
Open questions — anything the Researcher couldn't answer

Human checkpoint 2 in discovery mode / checkpoint 1 in feature mode. Read the story, check the criteria, approve or revise.

Agent 6: Spec Writer¶

Tools: Read, Grep, Glob (read-only)

Translates the approved story into a technical blueprint. In discovery mode, the UX Design Document is the source of truth for frontend scope — the Spec Writer does not redesign the UI.

Output:

Data model — tables, columns, relationships, index strategy
API contract — endpoints, request/response shapes, status codes, error formats
Frontend requirements — component tree, state, API consumption (from UX Design Document)
Background jobs — trigger conditions, retry policy, failure behaviour
Complete file list — every file to create or modify, with its purpose

Human checkpoint 3 in discovery mode / checkpoint 2 in feature mode. A mistake in the data model costs five minutes here. It costs a weekend after the migration runs.

Agent 7: Security Architect¶

Tools: Read, Grep, Glob (read-only)

STRIDE threat model on the approved brief — before the first builder starts.

Category	Question
Spoofing	Can an attacker impersonate a user or system?
Tampering	Can data be modified in transit or at rest?
Repudiation	Can malicious actions be denied after the fact?
Information disclosure	Can sensitive data reach unauthorized parties?
Denial of service	Can the feature be made unavailable?
Elevation of privilege	Can a lower-privileged user gain higher access?

Output: a mandatory security controls list every builder must implement. Critical findings pause the chain — redesign the brief or accept the risk with a recorded reason.
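The pause rule can be read as a gate: a Critical finding stops the chain unless the risk was accepted with a recorded reason. A sketch under that reading; the `Finding` type and the lowercase severity strings are assumptions (the post names only "Critical").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    category: str                          # one of the six STRIDE categories
    severity: str                          # hypothetical scale; only "critical" gates the chain
    accepted_reason: Optional[str] = None  # recorded reason if the risk was accepted


def chain_may_proceed(findings) -> bool:
    """True when no Critical finding is left without a recorded acceptance reason."""
    for f in findings:
        if f.severity == "critical":
            # A blank or missing reason does not count as a recorded acceptance.
            if not (f.accepted_reason and f.accepted_reason.strip()):
                return False
    return True
```

Redesigning the brief removes the finding from the list; accepting the risk keeps it there with the reason attached.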

Agent 8: Backend Builder¶

Tools: Read, Edit, Write, Bash (backend files only)

Scope is enforced in the system prompt, not by convention. It cannot touch frontend files.
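The post puts scope enforcement in the system prompt. A path check on changed files could back that up mechanically; this is a different technique, not the repository's mechanism, and the directory names below are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical layout: adjust the roots to match your own repository.
ALLOWED_ROOTS = {
    "Backend Builder": ("server", "migrations"),
    "Frontend Builder": ("web",),
    "DevOps Engineer": (".github", "deploy"),
}


def out_of_scope(agent: str, changed_files) -> list[str]:
    """Return the changed paths that fall outside the agent's allowed top-level directories."""
    roots = ALLOWED_ROOTS[agent]
    violations = []
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # A path is in scope only if its first component is one of the allowed roots.
        if not parts or parts[0] not in roots:
            violations.append(path)
    return violations
```

A non-empty result after a builder run would indicate the prompt-level boundary was crossed.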

Builds:

API routes and server actions
Service layer
Database migrations
Background jobs
Unit tests alongside the code — not after

Produces a file-change summary and implemented API contract for the Frontend Builder.

Agent 9: Frontend Builder¶

Tools: Read, Edit, Write, Bash (frontend files only)

In discovery mode: reads the Backend Builder's API summary and the UX Design Document. Implements the wireframes directly — no invented layouts, no improvised flows.

Builds:

React components and pages matching the UX wireframes
State management hooks
Form handling and validation
Component tests

If the backend is missing something, the Frontend Builder reports the gap. It does not cross the line and fix it.

Agent 10: Test Verifier¶

Tools: Read, Edit, Write, Bash (test files only)

Writes acceptance tests against every criterion in the approved user story — not unit tests, which the builders already wrote. Tests from the outside: what a real user does, what the system returns, what the database contains after.

Output: a pass/fail report per acceptance criterion. Any failure loops back to the responsible builder. No criterion is marked "close enough."
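One way to picture the report: each criterion carries a binary result and an owner, and failures are grouped by the builder who has to fix them. The `CriterionResult` type and its `owner` field are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CriterionResult:
    criterion: str
    passed: bool   # binary: there is no partial pass
    owner: str     # hypothetical field: the builder responsible for this criterion


def loop_backs(results) -> dict:
    """Group failing criteria by responsible builder; an empty dict means every criterion passed."""
    pending = {}
    for r in results:
        if not r.passed:
            pending.setdefault(r.owner, []).append(r.criterion)
    return pending
```

The chain moves on to the Implementation Validator only when the result is empty.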

Agent 11: Implementation Validator¶

Tools: Read, Grep, Glob (read-only)

Compares finished code against the user story, technical brief, and security controls list. In discovery mode, also checks against the UX Design Document.

Gap report by severity:

Critical — blocks the PR; loop back to the relevant builder
High — fix before merging
Medium — file as follow-up
Low / Informational — noted

The Validator never fixes anything. Its job is to tell the truth about whether implementation matches what was approved.
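The severity ladder maps to two simple decisions. A sketch, assuming each gap is a `(severity, description)` pair with lowercase severity strings; both helper names are hypothetical.

```python
def blocks_pr(gaps) -> bool:
    """Critical gaps block the PR and loop back to the relevant builder."""
    return any(severity == "critical" for severity, _ in gaps)


def merge_allowed(gaps) -> bool:
    """Merging waits until no Critical or High gap remains; Medium and Low do not hold it up."""
    return not any(severity in ("critical", "high") for severity, _ in gaps)
```

Medium gaps become follow-up tickets and Low gaps are only noted, so neither changes the outcome here.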

Agent 12: DevOps Engineer¶

Tools: Read, Edit, Write, Bash (pipeline files only)

Cannot touch application source code.

Builds the CI/CD pipeline: build, test, and deploy stages across dev → test → qa → staging → prod, plus an environment-variable manifest and a rollback strategy.

Build once, promote everywhere. One artifact promoted through every environment. Code is never rebuilt per environment. Staging runs exactly what production will run.
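"Build once, promote everywhere" can be sketched as promoting a content digest rather than rebuilding. The environment order is from the post; the digest-based bookkeeping and the function names are assumptions, not the generated pipeline.

```python
import hashlib

ENVIRONMENTS = ["dev", "test", "qa", "staging", "prod"]


def artifact_digest(artifact_bytes: bytes) -> str:
    """Identify a build by the SHA-256 of its bytes, so a rebuild yields a visibly different ID."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def promote(deployed: dict, env: str, digest: str) -> dict:
    """Record `digest` in `env`, but only if the previous environment runs the same digest."""
    idx = ENVIRONMENTS.index(env)
    if idx > 0 and deployed.get(ENVIRONMENTS[idx - 1]) != digest:
        raise ValueError(f"{env}: artifact was not promoted through {ENVIRONMENTS[idx - 1]}")
    # Return a new mapping rather than mutating the caller's state.
    return {**deployed, env: digest}
```

Under this scheme, prod can only ever receive the exact bytes that staging ran.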

Human checkpoint 4 in discovery mode / checkpoint 3 in feature mode: full diff, acceptance test results, Validator gap report, secrets to provision.

Human Checkpoints¶

Discovery mode — 4 checkpoints¶

Checkpoint	Trigger	What to review
1. Discovery approval	After UX Designer	Market problem, MVP scope, MoSCoW list, wireframes
2. Story approval	After Story Writer	Right problem? Acceptance criteria correct?
3. Brief approval	After Spec Writer	Data model right? API shapes correct?
4. PR review	After DevOps Engineer	Diff, test results, validator report, secrets

Feature mode — 3 checkpoints¶

Checkpoint	Trigger	What to review
1. Story approval	After Story Writer	Right problem? Acceptance criteria correct?
2. Brief approval	After Spec Writer	Data model right? API shapes correct?
3. PR review	After DevOps Engineer	Diff, test results, validator report, secrets

Plus one conditional checkpoint in both modes: Security Architect Critical finding → redesign or accept risk with a recorded reason.

The AI executes. You decide.

The Pre-Commit Secret Scanner¶

Blocks commits containing secrets before they reach the repository.

Blocked filenames and extensions: .env, .env.local, .env.production, credentials.json, serviceAccountKey.json, and any file ending in .key, .pem, or .p12

Blocked patterns:

-----BEGIN RSA PRIVATE KEY-----
sk_live_     (Stripe live keys)
ghp_          (GitHub PATs)
AKIA          (AWS access keys)
AIza          (Google API keys)
xoxb-         (Slack bot tokens)
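A rough sketch of the checks such a scanner performs, using the filenames and patterns listed above. This is not the repository's hook; treating .key, .pem, and .p12 as extensions is an assumption, and the function names are hypothetical.

```python
import re
from pathlib import PurePosixPath

BLOCKED_NAMES = {
    ".env", ".env.local", ".env.production",
    "credentials.json", "serviceAccountKey.json",
}
BLOCKED_SUFFIXES = {".key", ".pem", ".p12"}  # assumption: these are file extensions
BLOCKED_PATTERNS = [
    re.compile(p)
    for p in (
        r"-----BEGIN RSA PRIVATE KEY-----",
        r"sk_live_",
        r"ghp_",
        r"AKIA",
        r"AIza",
        r"xoxb-",
    )
]


def is_blocked_filename(path: str) -> bool:
    """True if the file's name or extension is on the blocked list."""
    p = PurePosixPath(path)
    return p.name in BLOCKED_NAMES or p.suffix in BLOCKED_SUFFIXES


def find_secret_patterns(text: str) -> list[str]:
    """Return the blocked patterns that occur anywhere in the text."""
    return [pat.pattern for pat in BLOCKED_PATTERNS if pat.search(text)]
```

Plain prefix matching like this will flag some harmless strings; a false positive at commit time is cheaper than a leaked key.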

Install once:

git init
ln -sf ../../.claude/hooks/pre-commit .git/hooks/pre-commit

Design Principles¶

Discovery before code. Market Researcher, Product Manager, and UX Designer run before any engineering. No code is written against a vague idea.

One job per agent. Wrong assumptions surface at the point of origin — not six steps later in production.

Read-only gates. 8 of 12 agents cannot edit files. Every design decision is made on paper.

UX as a first-class input. Wireframes and component inventory feed the Spec Writer and Frontend Builder directly. The UI is designed once.

Hard scope enforcement. Backend Builder → backend only. Frontend Builder → frontend only. DevOps Engineer → pipeline only. Enforced in the system prompt.

Security at design time. STRIDE runs after the brief, before the builders. Fixing a threat model costs a paragraph. Fixing it in production costs a rewrite and a disclosure.

Getting Started¶

Step 1: Clone and configure¶

git clone https://github.com/pkhamdee/software-factory-agent.git
cd software-factory-agent

Fill in every placeholder marker in CLAUDE.md: stack, commands, directory layout, testing policy. The Codebase Researcher reads CLAUDE.md on every run. Incomplete context produces incomplete output.

Step 2: Install the git hook¶

git init
ln -sf ../../.claude/hooks/pre-commit .git/hooks/pre-commit

Step 3: Choose your mode¶

New product — full discovery chain:

/feature-factory New product: a subscription analytics dashboard for SaaS founders

Type new when prompted. Runs all 12 agents with 4 human checkpoints.

Incremental feature — skip discovery:

/feature-factory Add email notification when an invoice is overdue by 7 days

Type feature when prompted. Skips to Codebase Researcher. 9 agents, 3 checkpoints.

Revise during discovery without restarting:

revise market    → reruns Market Researcher → PM → UX Designer
revise product   → reruns PM → UX Designer only
revise ux        → reruns UX Designer only

Standalone pipeline:

/devops-pipeline
/devops-pipeline add staging environment with manual approval gate
/devops-pipeline fix broken deploy to prod

Vibe Coding vs. Software Factory¶

Vibe coding	Software Factory
Problem validation skipped	Market Researcher validates the problem before scope is defined
Scope invented during coding	PM defines MVP with MoSCoW — Must Haves capped at 5
UI improvised by the developer	UX Designer produces wireframes before the spec is written
One agent, unlimited scope	Up to 12 agents, hard scope, clean context per step
Requirements implicit in the prompt	Requirements are explicit, human-approved documents
Security is an afterthought	STRIDE threat model before any builder starts
"Done" means it compiled	"Done" means every acceptance criterion passes
Secrets in git history	Pre-commit hook blocks secrets before they are committed
Deployment manual or forgotten	CI/CD pipeline built as part of the feature
Human reviews the PR cold	Human approves market, scope, and UX before any code

The factory doesn't make you faster on the first feature. It makes you faster on the tenth — because the ninth didn't ship the wrong product or wire a UI nobody wanted.

Summary¶

Discovery mode (12 agents, 4 checkpoints) — new products and major initiatives:

Market Researcher — validates the problem, segments, and competitors. No invented data.
Product Manager — MoSCoW scope with Must Have ≤ 5, user journey, success metrics.
UX Designer — page inventory, flows, ASCII wireframes, component inventory.
Checkpoint 1 — approve problem + scope + UX before engineering starts.

Feature mode (9 agents, 3 checkpoints) — incremental work on an existing product:

Codebase Researcher — maps the codebase. No guessing.
Story Writer → Spec Writer — approved requirements and technical blueprint. Two catches before code.
Security Architect — STRIDE threat model on the design.
Backend Builder → Frontend Builder — strict scope isolation; Frontend Builder implements the wireframes directly.
Test Verifier — every acceptance criterion passes mechanically.
Implementation Validator — compares code against story, brief, security controls, and UX Design Document.
DevOps Engineer — multi-environment pipeline; build once, promote everywhere.
Checkpoint (PR review) — diff, test results, validator report, secrets manifest.

Validated before designed. Designed before built. Secured before shipped. Tested against its own requirements. Deployed through a repeatable pipeline. All from one command.

Source: pkhamdee/software-factory-agent — updated 2026-05-28 with Market Researcher, Product Manager, and UX Designer agents; two-mode pipeline (discovery / feature).
