Agent-Driven Software Factory: Replace Vibe Coding With a 12-Agent Pipeline
Vibe coding is not a software development process.
Fire a prompt, accept the result, ship it. Fast in the same way skipping tests is fast — until something breaks and you can't explain why. The output is plausible-looking code with no validated problem, no verified requirements, no threat model, no acceptance tests, no deployment strategy. One feature. Eleven things that were supposed to happen but didn't.
The Software Factory Agent replaces that with a 12-agent pipeline. One developer runs the chain. Specialized agents handle market research, product scoping, UX design, security, implementation, testing, validation, and deployment — each in its own clean context window with only the tools it needs.
Two modes. Discovery mode (12 agents) validates the market, defines the product, and designs the UX before any engineering starts. Feature mode (9 agents) skips straight to implementation for incremental work on an existing product.
Two Modes, One Pipeline
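DISCOVERY MODE: Market Researcher → Product Manager → UX Designer → Codebase Researcher → Story Writer → Spec Writer → Security Architect → Backend Builder → Frontend Builder → Test Verifier → Implementation Validator → DevOps Engineer
FEATURE MODE: Codebase Researcher → Story Writer → Spec Writer → Security Architect → Backend Builder → Frontend Builder → Test Verifier → Implementation Validator → DevOps Engineer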
The 12 Agents
Discovery Agents (discovery mode only)
Agent 1: Market Researcher
Tools: Read (read-only)
Never invents data. The Market Researcher runs before scope, before design, before code. If information is missing, it surfaces open questions — it does not guess.
Six-section Market Research Report:
- Problem statement — who has this problem, when it occurs, what it costs to leave unsolved
- Target customer segments — primary and secondary personas; current workaround and why it falls short
- Competitor landscape — direct competitors, indirect substitutes, status quo (doing nothing)
- Top 3–5 use cases — actor, trigger, goal, current friction
- Market risks — nice-to-have vs. must-have, size, regulation, timing
- Open questions — anything blocking the Product Manager
Every downstream agent builds on this report. Neither the PM nor the UX Designer invents customer data.
Agent 2: Product Manager
Tools: Read (read-only)
Must Have features are capped at 5. More than 5 means the scope is too wide — the PM cuts harder.
Six-section Product Brief:
- MVP scope statement — what the MVP does and explicitly does not do
- Core features (MoSCoW) — Must Have / Should Have / Could Have / Won't Have in MVP
- Primary user journey — acquisition through value realisation, with drop-off risks per stage
- Pricing model — one recommendation with rationale and risks
- Success metrics — 3–5 outcomes; at least one measures actual user value, not vanity
- Open questions and blockers — what must be resolved before UX or engineering starts
Every Must Have feature must trace to a use case in the Market Research Report. If it can't, it doesn't belong in the MVP.
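The two rules above (at most five Must Haves, each traced to a researched use case) are mechanical enough to check in code. The following is a minimal sketch, not part of the Software Factory Agent itself: the `Feature` class, the `check_brief` helper, and the use-case ids are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_MUST_HAVES = 5  # cap stated in the post


@dataclass
class Feature:
    name: str
    priority: str                   # "must", "should", "could", or "wont"
    use_case: Optional[str] = None  # id of the use case it traces to (hypothetical field)


def check_brief(features, known_use_cases):
    """Return a list of problems; an empty list means the brief passes both rules."""
    problems = []
    must_haves = [f for f in features if f.priority == "must"]
    if len(must_haves) > MAX_MUST_HAVES:
        problems.append(
            f"{len(must_haves)} Must Have features; cap is {MAX_MUST_HAVES}"
        )
    for f in must_haves:
        if f.use_case not in known_use_cases:
            problems.append(f"'{f.name}' does not trace to a researched use case")
    return problems
```

A brief with six Must Haves, or with a Must Have whose `use_case` is missing from the Market Research Report, would come back with one problem each.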
Agent 3: UX Designer
Tools: Read (read-only)
ASCII wireframes, not descriptions. The UX Designer produces a UX Design Document the Frontend Builder implements directly — no interpretation gap between what was designed and what was built.
Six-section UX Design Document:
- Page inventory — every screen: route, purpose, who sees it; includes error and empty states
- Navigation structure — top-level and contextual nav as an indented hierarchy
- User flows — happy path plus the two most likely error paths per core use case
- ASCII wireframes — layout and element hierarchy for the 3–5 most important screens
- Dashboard and report layouts — metrics shown, card layout, empty states, filter/sort behaviour
- Component inventory — every reusable component with states and interaction notes
The UX Design Document feeds the Spec Writer (source of truth for frontend scope) and the Frontend Builder (direct implementation input). The UI is designed once, not improvised.
Human checkpoint 1: discovery approval. Combined review of all three discovery outputs — market problem, MVP scope, MoSCoW list, UX wireframes. Engineering doesn't start until you approve.
Targeted revision loops — no need to restart the chain:
- revise market → reruns Market Researcher → PM → UX Designer
- revise product → reruns PM → UX Designer only
- revise ux → reruns UX Designer only
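Each revise command reruns the named agent and everything downstream of it in the discovery chain. One way to model that, as a rough sketch (the `REVISE_START` table and `agents_to_rerun` function are illustrative names, not the project's implementation):

```python
DISCOVERY_CHAIN = ["Market Researcher", "Product Manager", "UX Designer"]

# Each revise command names the first agent that has to run again.
REVISE_START = {
    "revise market": "Market Researcher",
    "revise product": "Product Manager",
    "revise ux": "UX Designer",
}


def agents_to_rerun(command: str) -> list[str]:
    """Return the starting agent plus every discovery agent after it."""
    start = REVISE_START[command]
    return DISCOVERY_CHAIN[DISCOVERY_CHAIN.index(start):]
```

Because downstream agents build on upstream reports, rerunning from the point of change keeps the later documents consistent without restarting the whole chain.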
Engineering Agents (both modes)
Agent 4: Codebase Researcher
Tools: Read, Grep, Glob (read-only)
No guessing. Runs first in feature mode; runs after discovery approval in discovery mode. In discovery mode it uses the PM's refined scope to focus the research.
Produces:
- Relevant files and patterns for the feature
- Existing code that does similar things — so builders don't reinvent
- Risk flags: multi-tenancy, timezone handling, auth boundaries, retry behaviour
- Tests that need updating once the feature ships
Feeds directly into the Story Writer and Spec Writer.
Agent 5: Story Writer
Tools: Read (read-only)
Turns the Codebase Researcher's findings into a precise user story the whole chain builds against. In discovery mode, the Product Brief and UX Design Document are the scope authority — no invented requirements.
Output:
- User story — As a [role], I want [capability] so that [outcome]
- Acceptance criteria — each binary: pass or fail
- Edge cases — empty inputs, concurrent requests, permission failures
- Out-of-scope items — explicit list
- Open questions — anything the Researcher couldn't answer
Human checkpoint 2 in discovery mode / checkpoint 1 in feature mode. Read the story, check the criteria, approve or revise.
Agent 6: Spec Writer
Tools: Read, Grep, Glob (read-only)
Translates the approved story into a technical blueprint. In discovery mode, the UX Design Document is the source of truth for frontend scope — the Spec Writer does not redesign the UI.
Output:
- Data model — tables, columns, relationships, index strategy
- API contract — endpoints, request/response shapes, status codes, error formats
- Frontend requirements — component tree, state, API consumption (from UX Design Document)
- Background jobs — trigger conditions, retry policy, failure behaviour
- Complete file list — every file to create or modify, with its purpose
Human checkpoint 3 in discovery mode / checkpoint 2 in feature mode. A mistake in the data model costs five minutes here. It costs a weekend after migration runs.
Agent 7: Security Architect
Tools: Read, Grep, Glob (read-only)
STRIDE threat model on the approved brief — before the first builder starts.
| Category | Question |
|---|---|
| Spoofing | Can an attacker impersonate a user or system? |
| Tampering | Can data be modified in transit or at rest? |
| Repudiation | Can malicious actions be denied after the fact? |
| Information disclosure | Can sensitive data reach unauthorized parties? |
| Denial of service | Can the feature be made unavailable? |
| Elevation of privilege | Can a lower-privileged user gain higher access? |
Output: a mandatory security controls list every builder must implement. Critical findings pause the chain — redesign the brief or accept the risk with a recorded reason.
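The gating rule can be sketched as follows. This is an assumption-laden illustration, not the agent's actual output format: the `Finding` fields, the severity labels other than "critical", and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

STRIDE = (
    "Spoofing", "Tampering", "Repudiation",
    "Information disclosure", "Denial of service", "Elevation of privilege",
)


@dataclass
class Finding:
    category: str                          # one of STRIDE
    severity: str                          # "critical" or a lower, assumed label
    control: str                           # mandatory control for the builders
    accepted_reason: Optional[str] = None  # recorded reason if the risk is accepted


def chain_may_proceed(findings) -> bool:
    """A Critical finding pauses the chain unless its risk was accepted with a reason."""
    for f in findings:
        if f.category not in STRIDE:
            raise ValueError(f"not a STRIDE category: {f.category}")
        if f.severity == "critical" and not f.accepted_reason:
            return False
    return True


def controls_list(findings) -> list[str]:
    """The security controls every builder must implement."""
    return [f.control for f in findings]
```

Redesigning the brief corresponds to rerunning the threat model until no unaccepted Critical finding remains.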
Agent 8: Backend Builder
Tools: Read, Edit, Write, Bash (backend files only)
Scope is enforced in the system prompt, not by convention. It cannot touch frontend files.
Builds:
- API routes and server actions
- Service layer
- Database migrations
- Background jobs
- Unit tests alongside the code — not after
Produces a file-change summary and implemented API contract for the Frontend Builder.
Agent 9: Frontend Builder
Tools: Read, Edit, Write, Bash (frontend files only)
Reads the Backend Builder's API summary. In discovery mode it also reads the UX Design Document and implements the wireframes directly: no invented layouts, no improvised flows.
Builds:
- React components and pages matching the UX wireframes
- State management hooks
- Form handling and validation
- Component tests
If the backend is missing something, the Frontend Builder reports the gap. It does not cross the line and fix it.
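The post says builder scope is enforced in the system prompt. As a supplementary idea only, a builder's file-change summary could also be checked mechanically. The sketch below assumes a hypothetical directory layout (`api/`, `services/`, `migrations/`, `jobs/`, `web/`) and hypothetical agent ids; a real project would take its layout from CLAUDE.md.

```python
import posixpath

# Hypothetical layout and agent ids, for illustration only.
ALLOWED_PREFIXES = {
    "backend-builder": ("api/", "services/", "migrations/", "jobs/"),
    "frontend-builder": ("web/",),
}


def out_of_scope(agent: str, changed_files: list[str]) -> list[str]:
    """Return the changed paths that fall outside the agent's allowed directories."""
    allowed = ALLOWED_PREFIXES[agent]
    violations = []
    for path in changed_files:
        # Normalise first so "api/../web/x" cannot pass as a backend path.
        if not posixpath.normpath(path).startswith(allowed):
            violations.append(path)
    return violations
```

An empty result means the summary stayed in scope; anything else is a gap to report rather than fix across the line.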
Agent 10: Test Verifier
Tools: Read, Edit, Write, Bash (test files only)
Writes acceptance tests against every criterion in the approved user story — not unit tests, which the builders already wrote. Tests from the outside: what a real user does, what the system returns, what the database contains after.
Output: a pass/fail report per acceptance criterion. Any failure loops back to the responsible builder. No criterion is marked "close enough".
Agent 11: Implementation Validator
Tools: Read, Grep, Glob (read-only)
Compares finished code against the user story, technical brief, and security controls list. In discovery mode, also checks against the UX Design Document.
Gap report by severity:
- Critical — blocks the PR; loop back to the relevant builder
- High — fix before merging
- Medium — file as follow-up
- Low / Informational — noted
The Validator never fixes anything. Its job is to tell the truth about whether implementation matches what was approved.
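The severity ladder maps each gap to exactly one action. A small sketch of that triage, where the `ACTIONS` table mirrors the list above and the `triage` function and its input shape are assumptions:

```python
# Severity -> action, as listed in the gap report description.
ACTIONS = {
    "critical": "block PR; loop back to builder",
    "high": "fix before merging",
    "medium": "file as follow-up",
    "low": "noted",
}


def triage(gaps):
    """Group (severity, description) pairs by action.

    Returns (pr_blocked, plan), where pr_blocked is True if any gap is critical.
    """
    plan = {action: [] for action in ACTIONS.values()}
    for severity, description in gaps:
        plan[ACTIONS[severity]].append(description)
    pr_blocked = bool(plan[ACTIONS["critical"]])
    return pr_blocked, plan
```

Note that the function only classifies. Like the Validator, it changes nothing in the code under review.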
Agent 12: DevOps Engineer
Tools: Read, Edit, Write, Bash (pipeline files only)
Cannot touch application source code.
Builds the CI/CD pipeline: build, test, and deploy stages across dev → test → qa → staging → prod, plus an environment-variable manifest and a rollback strategy.
Build once, promote everywhere. One artifact promoted through every environment. Code is never rebuilt per environment. Staging runs exactly what production will run.
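"Build once, promote everywhere" can be pictured as a single artifact digest moving through the environment list. The sketch below is a toy model under stated assumptions: `build` stands in for a real build step by hashing its input, and `promote` and the `deployed` mapping are hypothetical names rather than anything the DevOps Engineer generates.

```python
import hashlib

ENVIRONMENTS = ["dev", "test", "qa", "staging", "prod"]


def build(source: bytes) -> str:
    """Stand-in for a real build: the artifact is identified by a content digest."""
    return hashlib.sha256(source).hexdigest()


def promote(artifact_digest: str, deployed: dict[str, str], target: str) -> None:
    """Deploy to target only if the same artifact already runs one environment earlier."""
    idx = ENVIRONMENTS.index(target)
    if idx > 0 and deployed.get(ENVIRONMENTS[idx - 1]) != artifact_digest:
        raise RuntimeError(
            f"{target}: artifact has not passed {ENVIRONMENTS[idx - 1]}"
        )
    deployed[target] = artifact_digest
```

Because `promote` never calls `build`, staging and prod can only ever receive the digest that was produced once and verified in the earlier environments.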
Human checkpoint 4 in discovery mode / checkpoint 3 in feature mode: full diff, acceptance test results, Validator gap report, secrets to provision.
Human Checkpoints
Discovery mode: 4 checkpoints
| Checkpoint | Trigger | What to review |
|---|---|---|
| 1. Discovery approval | After UX Designer | Market problem, MVP scope, MoSCoW list, wireframes |
| 2. Story approval | After Story Writer | Right problem? Acceptance criteria correct? |
| 3. Brief approval | After Spec Writer | Data model right? API shapes correct? |
| 4. PR review | After DevOps Engineer | Diff, test results, validator report, secrets |
Feature mode: 3 checkpoints
| Checkpoint | Trigger | What to review |
|---|---|---|
| 1. Story approval | After Story Writer | Right problem? Acceptance criteria correct? |
| 2. Brief approval | After Spec Writer | Data model right? API shapes correct? |
| 3. PR review | After DevOps Engineer | Diff, test results, validator report, secrets |
Plus one conditional checkpoint in both modes: Security Architect Critical finding → redesign or accept risk with a recorded reason.
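The checkpoint numbering in the two tables follows from where each mode's chain starts. A minimal sketch that derives it (the `chain` and `checkpoints` helpers are hypothetical; the agent order and triggers are taken from this post):

```python
DISCOVERY_ONLY = ["Market Researcher", "Product Manager", "UX Designer"]
ENGINEERING = [
    "Codebase Researcher", "Story Writer", "Spec Writer", "Security Architect",
    "Backend Builder", "Frontend Builder", "Test Verifier",
    "Implementation Validator", "DevOps Engineer",
]

# Agent after which the chain pauses, and what the human approves there.
CHECKPOINT_AFTER = {
    "UX Designer": "Discovery approval",
    "Story Writer": "Story approval",
    "Spec Writer": "Brief approval",
    "DevOps Engineer": "PR review",
}


def chain(mode: str) -> list[str]:
    """Ordered agents for 'discovery' or 'feature' mode."""
    if mode == "discovery":
        return DISCOVERY_ONLY + ENGINEERING
    if mode == "feature":
        return list(ENGINEERING)
    raise ValueError(f"unknown mode: {mode!r}")


def checkpoints(mode: str) -> list[tuple[int, str]]:
    """Numbered checkpoints in the order the chain reaches them."""
    names = [CHECKPOINT_AFTER[a] for a in chain(mode) if a in CHECKPOINT_AFTER]
    return list(enumerate(names, start=1))
```

Feature mode never runs the UX Designer, so its numbering starts at story approval. The conditional Security Architect pause is left out because it only fires on a Critical finding.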
The AI executes. You decide.
The Pre-Commit Secret Scanner
Blocks commits containing secrets before they reach the repository.
Blocked filenames and extensions: .env, .env.local, .env.production, *.key, *.pem, *.p12, credentials.json, serviceAccountKey.json
Blocked patterns:
-----BEGIN RSA PRIVATE KEY-----
sk_live_ (Stripe live keys)
ghp_ (GitHub PATs)
AKIA (AWS access keys)
AIza (Google API keys)
xoxb- (Slack bot tokens)
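To make the two checks concrete, here is a rough sketch of the matching logic. It is not the project's actual hook: the `violations` function is a hypothetical name, `.key`, `.pem`, and `.p12` are treated as filename endings (an assumption), and the patterns are matched as plain substrings, which will produce false positives in real code.

```python
import re
from pathlib import PurePosixPath

BLOCKED_NAMES = {
    ".env", ".env.local", ".env.production",
    "credentials.json", "serviceAccountKey.json",
}
BLOCKED_ENDINGS = (".key", ".pem", ".p12")  # assumed to mean "any file ending in"
BLOCKED_PATTERNS = [
    re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
    re.compile(r"sk_live_"),  # Stripe live keys
    re.compile(r"ghp_"),      # GitHub PATs
    re.compile(r"AKIA"),      # AWS access keys
    re.compile(r"AIza"),      # Google API keys
    re.compile(r"xoxb-"),     # Slack bot tokens
]


def violations(path: str, content: str) -> list[str]:
    """Return the reasons a staged file should block the commit, if any."""
    found = []
    name = PurePosixPath(path).name
    if name in BLOCKED_NAMES or name.endswith(BLOCKED_ENDINGS):
        found.append(f"blocked filename: {name}")
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(content):
            found.append(f"blocked pattern: {pattern.pattern}")
    return found
```

A pre-commit hook built on this idea would run the check over every staged file and exit non-zero if any list comes back non-empty.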
Install once:
Design Principles
Discovery before code. Market Researcher, Product Manager, and UX Designer run before any engineering. No code is written against a vague idea.
One job per agent. Wrong assumptions surface at the point of origin — not six steps later in production.
Read-only gates. 8 of 12 agents cannot edit files. Every design decision is made on paper.
UX as a first-class input. Wireframes and component inventory feed the Spec Writer and Frontend Builder directly. The UI is designed once.
Hard scope enforcement. Backend Builder → backend only. Frontend Builder → frontend only. DevOps Engineer → pipeline only. Enforced in the system prompt.
Security at design time. STRIDE runs after the brief, before the builders. Fixing a threat model costs a paragraph. Fixing it in production costs a rewrite and a disclosure.
Getting Started
Step 1: Clone and configure
Fill in every <!-- TODO: --> marker in CLAUDE.md — stack, commands, directory layout, testing policy. The Codebase Researcher reads CLAUDE.md on every run. Incomplete context produces incomplete output.
Step 2: Install the git hook
Step 3: Choose your mode
New product — full discovery chain:
Type new when prompted. Runs all 12 agents with 4 human checkpoints.
Incremental feature — skip discovery:
Type feature when prompted. Skips to Codebase Researcher. 9 agents, 3 checkpoints.
Revise during discovery without restarting:
revise market → reruns Market Researcher → PM → UX Designer
revise product → reruns PM → UX Designer only
revise ux → reruns UX Designer only
Standalone pipeline:
/devops-pipeline
/devops-pipeline add staging environment with manual approval gate
/devops-pipeline fix broken deploy to prod
Vibe Coding vs. Software Factory
| Vibe coding | Software Factory |
|---|---|
| Problem validation skipped | Market Researcher validates the problem before scope is defined |
| Scope invented during coding | PM defines MVP with MoSCoW — Must Haves capped at 5 |
| UI improvised by the developer | UX Designer produces wireframes before the spec is written |
| One agent, unlimited scope | Up to 12 agents, hard scope, clean context per step |
| Requirements implicit in the prompt | Requirements are explicit, human-approved documents |
| Security is an afterthought | STRIDE threat model before any builder starts |
| "Done" means it compiled | "Done" means every acceptance criterion passes |
| Secrets in git history | Pre-commit hook blocks secrets before they reach the repository |
| Deployment manual or forgotten | CI/CD pipeline built as part of the feature |
| Human reviews the PR cold | Human approves market, scope, and UX before any code |
The factory doesn't make you faster on the first feature. It makes you faster on the tenth — because the ninth didn't ship the wrong product or wire a UI nobody wanted.
Summary
Discovery mode (12 agents, 4 checkpoints) — new products and major initiatives:
- Market Researcher — validates the problem, segments, and competitors. No invented data.
- Product Manager — MoSCoW scope with Must Have ≤ 5, user journey, success metrics.
- UX Designer — page inventory, flows, ASCII wireframes, component inventory.
- Checkpoint 1 — approve problem + scope + UX before engineering starts.
Feature mode (9 agents, 3 checkpoints) — incremental work on an existing product:
- Codebase Researcher — maps the codebase. No guessing.
- Story Writer → Spec Writer — approved requirements and technical blueprint. Two catches before code.
- Security Architect — STRIDE threat model on the design.
- Backend Builder → Frontend Builder — strict scope isolation; in discovery mode the Frontend Builder implements the wireframes directly.
- Test Verifier — every acceptance criterion passes mechanically.
- Implementation Validator — compares code against story, brief, and security controls, plus the UX Design Document in discovery mode.
- DevOps Engineer — multi-environment pipeline; build once, promote everywhere.
- Checkpoint (PR review) — diff, test results, validator report, secrets manifest.
Validated before designed. Designed before built. Secured before shipped. Tested against its own requirements. Deployed through a repeatable pipeline. All from one command.
Source: pkhamdee/software-factory-agent — updated 2026-05-28 with Market Researcher, Product Manager, and UX Designer agents; two-mode pipeline (discovery / feature).