AI Agent SDLC Orchestration
This guide defines how AI agents collaborate across the complete software development lifecycle to deliver production-ready software with structured human checkpoints. It connects every role guide, prompt template, and production standard in the AEEF into a single, executable pipeline.
Canonical location note: This is the canonical orchestration model used by the Transformation track, the Production tier reference implementation, and the AEEF CLI wrapper. It lives under /transformation/ because it defines the operating model progression, but it applies through production and operations.
The Problem This Solves
Your AEEF framework has 12 role guides, 17 production standards, 11 role-specific prompts, and a 6-stage operating model, but no single document that answers: "How do all the agents actually work together, end-to-end, from a business idea to running production code?"
This guide is that document.
What You Will Get
| Deliverable | Description |
|---|---|
| Agent Registry | 11 purpose-built agents mapped to SDLC stages with defined contracts |
| 7-Stage Pipeline | Complete SDLC from requirements to operations with agent ownership at each stage |
| Stage Gates | Pass/fail criteria at every transition with automated and human checkpoints |
| Trust Model | Progressive automation levels that reduce human review as agents earn trust |
| Environment Promotion | Dev to Staging to Production with agent responsibilities at each environment |
| Orchestration Rules | State machine, failure routing, escalation paths, and deadlock resolution |
| Skill Registry & Multi-Agent Gate Patterns | Skill catalog enforcement, skill approval checks, and gate binding in orchestrated agent workflows |
The 11-Agent Pipeline
Every agent maps to an existing AEEF role guide and prompt template. No agent exists without a human owner.
┌─────────────────────────────────────────────────────────────────────────────────┐
│                      AI AGENT SDLC ORCHESTRATION PIPELINE                       │
│                                                                                 │
│     STAGE 1         STAGE 2         STAGE 3         STAGE 4         STAGE 5     │
│  REQUIREMENTS       DESIGN       IMPLEMENTATION     TESTING        SECURITY     │
│                                                                                 │
│  ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐  │
│  │ product-  │   │ architect-│   │ developer-│   │ qa-agent  │   │ security- │  │
│  │   agent   │──>│   agent   │──>│   agent   │──>│           │──>│   agent   │  │
│  └───────────┘   └───────────┘   └───────────┘   └───────────┘   └───────────┘  │
│        │                                               │               │        │
│        v                                               v               v        │
│  ┌───────────┐                                   ┌───────────┐   ┌───────────┐  │
│  │ scrum-    │                                   │ devmgr-   │   │compliance-│  │
│  │   agent   │                                   │   agent   │   │   agent   │  │
│  └───────────┘                                   └───────────┘   └───────────┘  │
│                                                                                 │
│     STAGE 6         STAGE 7                                                     │
│   DEPLOYMENT      OPERATIONS                                                    │
│                                                                                 │
│  ┌───────────┐   ┌───────────┐   ┌───────────┐                                  │
│  │ platform- │──>│ ops-      │──>│ executive-│                                  │
│  │   agent   │   │   agent   │   │   agent   │                                  │
│  └───────────┘   └───────────┘   └───────────┘                                  │
│        │               │               │                                        │
│        │               └───────┬───────┘                                        │
│        │                       │                                                │
│        │                  FEEDBACK TO                                           │
│        │                    STAGE 1                                             │
│        └─── HUMAN APPROVAL GATE ───┘                                            │
└─────────────────────────────────────────────────────────────────────────────────┘
Agent Registry
| # | Agent ID | SDLC Stage | Human Owner | AEEF Role Guide | Prompt Template | Cannot Do |
|---|---|---|---|---|---|---|
| 1 | product-agent | Requirements | Product Manager | PM Guide | Story Hardening | Edit code, approve architecture |
| 2 | scrum-agent | Planning | Scrum Master | SM Guide | Sprint Capacity & Risk Calibration | Override capacity, skip risk flags |
| 3 | architect-agent | Design | Solution Architect | SA Guide | Architecture Conformance & Agent Handoffs | Approve its own designs, bypass governance |
| 4 | developer-agent | Implementation | Senior Developer | Dev Guide | Feature Implementation | Merge to main, disable CI, introduce secrets |
| 5 | qa-agent | Testing | QA Lead | QA Guide | Release Readiness Risk Decision | Approve its own test changes, skip test categories |
| 6 | security-agent | Security | Security Engineer | SecEng Guide | Security Review & Remediation | Bypass scan failures, waive critical findings |
| 7 | compliance-agent | Compliance | Compliance Officer | CO Guide | Audit Evidence Request | Grant waivers without human approval |
| 8 | platform-agent | Deployment | Platform Engineer | PE Guide | CI Quality Gates | Deploy to production without approval |
| 9 | devmgr-agent | Quality Oversight | Dev Manager | DM Guide | Quality, Risk & Enablement Plan | Override security findings, skip audit evidence |
| 10 | ops-agent | Operations | Platform Engineer | PE Guide | — | Rollback without incident record |
| 11 | executive-agent | Reporting | CTO / Executive | CTO Guide | Architecture Governance Decision Pack | Make implementation decisions |
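The "Cannot Do" column can be read as a per-agent deny list. The following is a minimal sketch of that idea, not the AEEF implementation: the `Agent` dataclass, the action labels such as `merge_to_main`, and the `is_forbidden` helper are all hypothetical names chosen for illustration, and only three registry rows are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    agent_id: str
    stage: str
    human_owner: str
    cannot_do: frozenset  # hypothetical action labels derived from "Cannot Do"


# Three rows from the registry table, encoded with illustrative action labels.
REGISTRY = {
    agent.agent_id: agent
    for agent in (
        Agent("developer-agent", "Implementation", "Senior Developer",
              frozenset({"merge_to_main", "disable_ci", "introduce_secrets"})),
        Agent("platform-agent", "Deployment", "Platform Engineer",
              frozenset({"deploy_to_production_without_approval"})),
        Agent("ops-agent", "Operations", "Platform Engineer",
              frozenset({"rollback_without_incident_record"})),
    )
}


def is_forbidden(agent_id: str, action: str) -> bool:
    """Return True if the action is denied. Unregistered agents are denied everything."""
    agent = REGISTRY.get(agent_id)
    return agent is None or action in agent.cannot_do
```

Denying unregistered agents by default mirrors the rule that no agent exists without a human owner.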
The 7-Stage Pipeline
Each stage has a defined owner agent, supporting agents, inputs, outputs, and gate criteria. No stage can be skipped. The rigor scales with the work item size per the Operating Model.
Stage 1: Requirements and Planning
Owner: product-agent + scrum-agent
| Aspect | Details |
|---|---|
| Trigger | Business requirement, user story, or technical initiative |
| product-agent actions | Refine story with acceptance criteria, identify non-functional requirements, assign risk tier (Tier 1-4), classify data sensitivity |
| scrum-agent actions | Estimate complexity using two-dimensional model (complexity + AI acceleration factor), assess sprint capacity impact, flag impediments |
| Inputs | Raw requirement, backlog context, team velocity history, architecture constraints |
| Outputs | Hardened user story, risk tier classification, sprint capacity assessment, acceptance criteria |
| Human checkpoint | Product Owner approves scope and risk tier |
| Standards enforced | PRD-STD-001 (prompt structure), PRD-STD-009 (agent contracts) |
Gate 1 criteria:
- Story has measurable acceptance criteria
- Risk tier assigned (Tier 1-4)
- Data classification applied (Public/Internal/Confidential/Restricted)
- Sprint capacity assessed with confidence range
- Product Owner sign-off recorded
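The Gate 1 criteria above could be checked mechanically along these lines. This is a sketch under stated assumptions: the `Story` fields and the `gate1_failures` function are hypothetical, and the check only confirms that acceptance criteria exist, since whether they are measurable still needs human judgment.

```python
from dataclasses import dataclass, field
from typing import Optional

RISK_TIERS = {1, 2, 3, 4}
DATA_CLASSES = {"Public", "Internal", "Confidential", "Restricted"}


@dataclass
class Story:
    # All field names are illustrative, not an AEEF schema.
    acceptance_criteria: list = field(default_factory=list)
    risk_tier: Optional[int] = None
    data_classification: Optional[str] = None
    capacity_range: Optional[tuple] = None  # (low, high) confidence range
    po_signoff: bool = False


def gate1_failures(story: Story) -> list:
    """Return the unmet Gate 1 criteria; an empty list means the gate passes."""
    failures = []
    if not story.acceptance_criteria:
        failures.append("acceptance criteria missing")
    if story.risk_tier not in RISK_TIERS:
        failures.append("risk tier not assigned")
    if story.data_classification not in DATA_CLASSES:
        failures.append("data classification not applied")
    if story.capacity_range is None:
        failures.append("sprint capacity not assessed")
    if not story.po_signoff:
        failures.append("Product Owner sign-off not recorded")
    return failures
```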
Stage 2: Architecture and Design
Owner: architect-agent
| Aspect | Details |
|---|---|
| Trigger | Gate 1 passed — approved story with risk tier |
| architect-agent actions | Validate against reference architecture, check boundary constraints, identify integration points, assess agent handoff compatibility, propose design approach |
| Inputs | Hardened story, existing architecture documentation, technology constraints, PRD-STD-009 agent contracts |
| Outputs | Architecture conformance assessment, design proposal, constraint list, integration map |
| Human checkpoint | Solution Architect approves for Tier 2+ work; auto-approve for Tier 1 within established patterns |
| Standards enforced | PRD-STD-007 (quality gates), PRD-STD-009 (agent governance) |
Gate 2 criteria:
- Design conforms to reference architecture
- No unauthorized boundary crossings
- Integration points identified and documented
- Human architect approval for Tier 2+ changes
Stage 3: Implementation
Owner: developer-agent
| Aspect | Details |
|---|---|
| Trigger | Gate 2 passed — approved design with constraints |
| developer-agent actions | Generate implementation following language/framework conventions, produce unit tests, create implementation notes, flag assumptions and risks |
| Inputs | Approved design, architecture constraints, language-specific prompt template (Python, TypeScript, Go, Java), framework template (Next.js, React, Express, FastAPI, Django, Spring Boot) — see prompt-library/by-language/ and prompt-library/by-framework/ |
| Outputs | Code patch, unit tests, implementation notes, AI attribution metadata |
| Human checkpoint | Developer reviews generated code for logic correctness (required at all trust levels) |
| Standards enforced | PRD-STD-001 (prompt engineering), PRD-STD-002 (code review), PRD-STD-003 (testing) |
Gate 3 criteria:
- Code compiles and passes lint
- Unit tests written and passing
- AI attribution metadata present (AI-Usage, AI-Prompt-Ref, Agent-IDs)
- Implementation notes document assumptions and risks
- No Restricted data in prompts or outputs
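The attribution check in Gate 3 lends itself to automation. The sketch below assumes the three keys appear as `Key: value` lines in a commit message, which this guide does not specify; `missing_attribution` is a hypothetical helper.

```python
REQUIRED_KEYS = ("AI-Usage", "AI-Prompt-Ref", "Agent-IDs")


def missing_attribution(message: str) -> list:
    """Return the required attribution keys that have no non-empty value."""
    present = set()
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        # Only count lines of the form "Key: value" with a non-empty value.
        if sep and value.strip():
            present.add(key.strip())
    return [key for key in REQUIRED_KEYS if key not in present]
```

A CI step could fail the gate whenever the returned list is non-empty.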
Stage 4: Testing and Quality Assurance
Owner: qa-agent + devmgr-agent
| Aspect | Details |
|---|---|
| Trigger | Gate 3 passed — code with passing unit tests |
| qa-agent actions | Generate risk-based test matrix, execute integration/E2E tests, validate acceptance criteria coverage, identify regression risk, produce release readiness recommendation (PASS/CONDITIONAL/FAIL) |
| devmgr-agent actions | Assess quality metrics against team baselines, flag trend deviations, validate evidence completeness |
| Inputs | Code patch, unit test results, acceptance criteria, defect history, risk-based test matrix template (see prompt-library/by-use-case/test-generation/risk-based-test-matrix.md) |
| Outputs | Test results, coverage report, regression analysis, release readiness decision, quality metrics assessment |
| Human checkpoint | QA Lead reviews CONDITIONAL or FAIL decisions; PASS auto-proceeds at Trust Level 3+ |
| Standards enforced | PRD-STD-003 (testing), PRD-STD-007 (quality gates) |
Gate 4 criteria:
- All acceptance criteria have corresponding tests
- Test coverage meets minimum threshold (per project config)
- No critical or high-severity test failures
- Regression analysis completed
- Release readiness recommendation issued
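The Stage 4 human checkpoint rule can be sketched as a small routing function. One assumption is made explicit here: the guide says PASS auto-proceeds at Trust Level 3+, and this sketch treats a PASS below Level 3 as requiring QA Lead review. The function name and return labels are hypothetical.

```python
RECOMMENDATIONS = {"PASS", "CONDITIONAL", "FAIL"}


def gate4_next_step(recommendation: str, trust_level: int) -> str:
    """Decide whether a release readiness recommendation needs QA Lead review."""
    if recommendation not in RECOMMENDATIONS:
        raise ValueError(f"unknown recommendation: {recommendation!r}")
    if recommendation == "PASS" and trust_level >= 3:
        return "auto-proceed"
    # CONDITIONAL and FAIL always go to the QA Lead.
    # Assumption: PASS below Trust Level 3 does too.
    return "qa-lead-review"
```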
Stage 5: Security and Compliance
Owner: security-agent + compliance-agent
| Aspect | Details |
|---|---|
| Trigger | Gate 4 passed — tested code with release readiness recommendation |
| security-agent actions | Run SAST/DAST/SCA scans, threat model AI-specific attack surfaces, classify findings by severity, produce merge-blocking decision |
| compliance-agent actions | Verify audit trail completeness, check license compliance, validate data classification adherence, collect governance evidence |
| Inputs | Code patch, test results, dependency manifest, secure coding review template (see prompt-library/templates/system-prompts/secure-coding-review.md), dependency risk check template (see prompt-library/by-use-case/dependency-compliance/dependency-risk-check.md) |
| Outputs | Security scan results, threat model, finding classifications, compliance evidence pack, merge decision |
| Human checkpoint | Security Engineer reviews critical/high findings; Compliance Officer reviews Tier 3+ governance evidence |
| Standards enforced | PRD-STD-004 (security), PRD-STD-008 (dependencies), PRD-STD-005 (documentation) |
Gate 5 criteria:
- SAST scan completed with no unresolved critical findings
- Dependency scan passed (no known critical CVEs, license compliant)
- Threat model reviewed for AI-specific attack surfaces
- Audit trail complete (agent run records, handoff artifacts)
- Compliance evidence pack assembled
- Human security sign-off for critical/high findings
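The merge-blocking decision in Stage 5 might be derived from scan output roughly as follows. This is an illustrative sketch: the function name and the three return labels are hypothetical, and it assumes findings have already been reduced to a list of severity strings.

```python
def gate5_decision(severities: list, dependency_scan_clean: bool) -> str:
    """Map finding severities and dependency scan status to a Gate 5 outcome."""
    found = {severity.lower() for severity in severities}
    if found & {"critical", "high"}:
        # Critical/high findings always require the Security Engineer.
        return "security-engineer-review"
    if not dependency_scan_clean:
        # Assumption: a failed dependency scan routes back for remediation.
        return "developer-agent-remediation"
    return "auto-pass"
```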
Stage 6: Deployment and Release
Owner: platform-agent
| Aspect | Details |
|---|---|
| Trigger | Gate 5 passed — security-cleared, compliance-approved code |
| platform-agent actions | Validate CI/CD pipeline compatibility, prepare deployment manifest, configure canary/feature flags, activate monitoring, execute deployment |
| Inputs | Approved code, deployment configuration, monitoring thresholds, rollback plan |
| Outputs | Deployment manifest, canary configuration, monitoring dashboard, health check results |
| Human checkpoint | Mandatory human approval before production deployment (all trust levels) |
| Standards enforced | PRD-STD-007 (quality gates), PRD-STD-009 (agent governance) |
Gate 6 criteria:
- All prior gates (1-5) passed and evidence recorded
- Deployment plan reviewed
- Rollback procedure documented and tested
- Canary/feature flag configuration verified
- Monitoring and alerting active
- Human deployment approval recorded
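Because Gate 6 depends on every earlier gate plus a mandatory human approval, it is convenient to collect all blockers rather than stop at the first one. A minimal sketch, with hypothetical check names and function signature:

```python
# Illustrative names for the Gate 6 criteria that are not prior-gate checks.
GATE6_CHECKS = (
    "deployment_plan_reviewed",
    "rollback_tested",
    "canary_config_verified",
    "monitoring_active",
)


def gate6_blockers(prior_gates: dict, checks: dict, human_approved: bool) -> list:
    """Return everything that blocks production deployment; empty means go."""
    blockers = [
        f"gate {gate} not passed"
        for gate in range(1, 6)
        if not prior_gates.get(gate, False)
    ]
    blockers += [name for name in GATE6_CHECKS if not checks.get(name, False)]
    if not human_approved:
        # Human approval is required at all trust levels.
        blockers.append("human deployment approval missing")
    return blockers
```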
Stage 7: Operations, Monitoring, and Feedback
Owner: ops-agent + executive-agent
| Aspect | Details |
|---|---|
| Trigger | Deployment complete — code running in production |
| ops-agent actions | Monitor health metrics, detect anomalies, trigger alerts, generate incident triage data, propose rollback if thresholds breached |
| executive-agent actions | Aggregate delivery metrics, produce board-ready summaries, calculate ROI impact, flag strategic risks |
| Inputs | Production metrics, error rates, performance data, business KPIs |
| Outputs | Health reports, incident triage data, rollback recommendations, executive dashboard, feedback for Stage 1 |
| Human checkpoint | Platform Engineer approves rollback decisions; CTO reviews executive reports |
| Standards enforced | PRD-STD-006 (technical debt tracking), PRD-STD-007 (quality gates) |
Gate 7 criteria (feedback loop):
- Post-deployment health check passed (15min, 1hr, 24hr windows)
- No critical incidents within monitoring window
- Business metrics tracking against success criteria
- Lessons learned captured and fed back to Stage 1
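The three post-deployment windows can be tracked with a small helper. The sketch below is an assumption-laden illustration: the window labels, `due_checks`, and `gate7_health_passed` are hypothetical, and it covers only the health-check criterion, not incidents or business metrics.

```python
from datetime import timedelta

# The three windows named in the Gate 7 criteria.
HEALTH_WINDOWS = {
    "15min": timedelta(minutes=15),
    "1hr": timedelta(hours=1),
    "24hr": timedelta(hours=24),
}


def due_checks(elapsed: timedelta) -> list:
    """Windows whose health check should have run, given time since deployment."""
    return [name for name, window in HEALTH_WINDOWS.items() if elapsed >= window]


def gate7_health_passed(results: dict) -> bool:
    """True only when every window has reported and passed."""
    return all(results.get(name) is True for name in HEALTH_WINDOWS)
```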
Stage Gate Matrix
Summary of all gates, who decides, and what happens on failure:
| Gate | Decision Maker | Auto-Pass Condition | Failure Routing | Max Resolution Time |
|---|---|---|---|---|
| Gate 1 (Requirements) | Product Owner | Never — always requires human | Back to product-agent for refinement | 1 business day |
| Gate 2 (Design) | Architect (human) for Tier 2+ | Tier 1 within established patterns | Back to architect-agent with feedback | 2 business days |
| Gate 3 (Implementation) | Automated CI + Developer review | Lint + tests pass + attribution present | Back to developer-agent with failure details | Same sprint |
| Gate 4 (Testing) | QA Lead for CONDITIONAL/FAIL | PASS at Trust Level 3+ | Back to developer-agent (code fix) or qa-agent (test gap) | Same sprint |
| Gate 5 (Security) | Security Engineer for critical/high | No critical/high findings + clean dependency scan | Back to developer-agent (remediation) | Per vulnerability SLAs |
| Gate 6 (Deployment) | Human approver (always) | Never — always requires human | Back to platform-agent (config fix) or prior stage | 1 business day |
| Gate 7 (Operations) | Platform Engineer | Health checks pass within all windows | Rollback + incident triage | Per incident severity |
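The Failure Routing column can be encoded as a lookup table. In this sketch the reason labels (`default`, `code_fix`, `test_gap`) and both function names are hypothetical. Gates 6 and 7 are left out of the routing table because their failure routes ("or prior stage", "Rollback + incident triage") do not name a single target agent.

```python
# Gate number -> {failure reason: agent that receives the work item}.
FAILURE_ROUTES = {
    1: {"default": "product-agent"},
    2: {"default": "architect-agent"},
    3: {"default": "developer-agent"},
    4: {"code_fix": "developer-agent", "test_gap": "qa-agent"},
    5: {"default": "developer-agent"},
}

# Gates whose Auto-Pass Condition reads "Never".
ALWAYS_HUMAN_GATES = frozenset({1, 6})


def route_failure(gate: int, reason: str = "default") -> str:
    """Return the agent that should receive a failed work item."""
    try:
        return FAILURE_ROUTES[gate][reason]
    except KeyError:
        raise ValueError(f"no route for gate {gate} with reason {reason!r}") from None


def auto_pass_allowed(gate: int) -> bool:
    """False for gates that always require a human decision."""
    return gate not in ALWAYS_HUMAN_GATES
```

Requiring an explicit reason for Gate 4 forces the caller to distinguish a code fix from a test gap, as the matrix does.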
The Handoff Protocol
Every agent-to-agent transition uses the same structured handoff artifact. This is mandatory per PRD-STD-009 REQ-009-06.
handoff:
  id: "HO-{source-agent}-{target-agent}-{timestamp}"
  source_agent: "{agent-id}"
  target_agent: "{agent-id}"
  stage_from: "{stage number}"
  stage_to: "{stage number}"
  artifacts:
    - type: "{artifact type}"
      path: "{file or URL reference}"
      hash: "{SHA-256}"
  summary: "{one-paragraph description of what was done}"
  assumptions:
    - "{assumption 1}"
    - "{assumption 2}"
  risks:
    - severity: "{critical|high|medium|low}"
      description: "{risk description}"
  decision_request: "{what the target agent/human needs to decide}"
  metadata:
    prompt_ref: "{prompt template used}"
    model_version: "{AI model version}"
    run_duration: "{seconds}"
    iteration_count: "{number of agent loops}"
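A handoff record with this shape could be assembled programmatically. The following is a sketch, not the reference implementation: `build_handoff` is a hypothetical helper, the timestamp format is an assumption (the template only names a `{timestamp}` placeholder), and the `metadata` block is omitted for brevity.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def build_handoff(source: str, target: str, stage_from: int, stage_to: int,
                  artifacts: list, summary: str, decision_request: str,
                  now: Optional[datetime] = None) -> dict:
    """Assemble a handoff record; each artifact is (type, path, raw bytes)."""
    now = now or datetime.now(timezone.utc)
    # Assumed timestamp format, chosen to be sortable and filename-safe.
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return {
        "handoff": {
            "id": f"HO-{source}-{target}-{stamp}",
            "source_agent": source,
            "target_agent": target,
            "stage_from": str(stage_from),
            "stage_to": str(stage_to),
            "artifacts": [
                {"type": kind, "path": path,
                 "hash": hashlib.sha256(content).hexdigest()}
                for kind, path, content in artifacts
            ],
            "summary": summary,
            "assumptions": [],
            "risks": [],
            "decision_request": decision_request,
        }
    }
```

Hashing the artifact content at handoff time lets the receiving agent confirm it is working on exactly what the source agent produced.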
Complete Handoff Chain
product-agent ──[HO-01]──> scrum-agent ──[HO-02]──> architect-agent
                                                           │
                                                        [HO-03]
                                                           │
                                                           v
                                                    developer-agent
                                                           │
                                                 ┌──────[HO-04]──────┐
                                                 │                   │
                                                 v                   v
                                             qa-agent         security-agent
                                                 │                   │
                                              [HO-05]             [HO-06]
                                                 │                   │
                                                 v                   v
                                           devmgr-agent      compliance-agent
                                                 │                   │
                                                 └─────────┬─────────┘
                                                           │
                                                        [HO-07]
                                                           │
                                                           v
                                                    platform-agent
                                                           │
                                                  ┌── HUMAN GATE ──┐
                                                  │                │
                                                  v                v
                                              APPROVED         REJECTED
                                                  │          (routes back)
                                                  v
                                              ops-agent ──[HO-08]──> executive-agent
                                                                            │
                                                                    [HO-09: FEEDBACK]
                                                                            │
                                                                            v
                                                                 product-agent (next cycle)
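The chain can also be treated as a directed graph, which makes it possible to verify that every agent is reachable and to find which handoffs an agent must wait for. This sketch makes two assumptions: HO-09 is modelled as originating from executive-agent, and the human gate is modelled as an edge from platform-agent to ops-agent. The `EDGES` list and both helper functions are hypothetical.

```python
from collections import deque

# (label, source, target) for every transition in the chain.
EDGES = [
    ("HO-01", "product-agent", "scrum-agent"),
    ("HO-02", "scrum-agent", "architect-agent"),
    ("HO-03", "architect-agent", "developer-agent"),
    ("HO-04", "developer-agent", "qa-agent"),
    ("HO-04", "developer-agent", "security-agent"),
    ("HO-05", "qa-agent", "devmgr-agent"),
    ("HO-06", "security-agent", "compliance-agent"),
    ("HO-07", "devmgr-agent", "platform-agent"),
    ("HO-07", "compliance-agent", "platform-agent"),
    ("HUMAN GATE", "platform-agent", "ops-agent"),
    ("HO-08", "ops-agent", "executive-agent"),
    ("HO-09", "executive-agent", "product-agent"),  # assumed source of feedback
]


def reachable(start: str) -> set:
    """Breadth-first search over the handoff graph, including the start node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, source, target in EDGES:
            if source == node and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def predecessors(agent: str) -> set:
    """Agents whose handoffs must arrive before this agent can start."""
    return {source for _, source, target in EDGES if target == agent}
```

For example, `predecessors("platform-agent")` shows that deployment waits on both the quality and the compliance branches.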
Minimum Human Review Points
The pipeline is designed to minimize human review while maintaining safety. These are the non-negotiable human checkpoints:
| Checkpoint | Who | Why | Can It Be Automated? |
|---|---|---|---|
| Requirements approval | Product Owner | Business intent must be human-owned | Never |
| Architecture approval (Tier 2+) | Solution Architect | Architecture decisions are irreversible at scale | Only for Tier 1 within established patterns |
| Code logic review | Developer | AI hallucinations in business logic are undetectable by scans | Reducible with trust model (see below) |
| Critical security findings | Security Engineer | False negatives in security have outsized impact | Never for critical/high severity |
| Production deployment | Release approver | Production is the last line of defense | Never |
Everything else is automatable. The Trust Model defines how to progressively automate the remaining checkpoints.
How This Connects to Your Existing Framework
| Existing AEEF Asset | How This Guide Uses It |
|---|---|
| Operating Model Lifecycle | Stages 1-7 map to the 6-stage operating model with expanded agent ownership |
| 12 Role Guides | Each agent's behavior is defined by its corresponding role guide |
| 11 Role Prompts (prompt-library/by-role/) | Each agent executes using its role-specific prompt template |
| 17 PRD-STD Standards | Gate criteria enforce specific standards at each transition |
| Agent Contracts (PRD-STD-009) | Every agent operates under a formal contract |
| Skills Catalog Governance (PRD-STD-017) | Skill registry, role/environment skill gates, and community skill attribution controls |
| OpenClaw for AEEF Agent Orchestration | Gateway-style execution surface for multi-agent workflows with AEEF overlays for skills, gates, and approvals |
| Small-Team Multi-Agent Starter | This guide scales the starter from 4 agents to 11 for full lifecycle coverage |
| 5 Pillars | Pipeline enforces all 5 pillars through stage-specific gates |
Quick Start
- Read the Trust Model to understand your starting automation level
- Deploy the 4-agent starter from Small-Team Multi-Agent Starter
- Expand to 7 agents (add architect-agent, compliance-agent, devmgr-agent) after 2 weeks of stable metrics
- Scale to full 11-agent pipeline after demonstrating consistent Gate 1-5 pass rates above 80%
- Implement Environment Promotion rules for multi-environment deployments
- Configure Skill Registry & Multi-Agent Gate Patterns before enabling broad skill usage in developer/agent workflows
- Pilot OpenClaw for AEEF Agent Orchestration as an execution surface (sandboxed, gated, non-production first)
- Configure the Orchestration Rules state machine for automated pipeline execution
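The 80% threshold for scaling to the full pipeline can be checked against recorded gate outcomes. This is a rough sketch: `pass_rate` and `ready_for_full_pipeline` are hypothetical helpers, and "consistent" is interpreted here simply as every gate from 1 to 5 exceeding the threshold over whatever history is supplied.

```python
def pass_rate(outcomes: list) -> float:
    """Fraction of True outcomes; an empty history counts as 0.0."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def ready_for_full_pipeline(history: dict, threshold: float = 0.80) -> bool:
    """True when Gates 1-5 each have a pass rate strictly above the threshold."""
    return all(pass_rate(history.get(gate, [])) > threshold for gate in range(1, 6))
```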
This guide builds on top of all existing AEEF content. It does not replace any existing standard, role guide, or prompt template. It orchestrates them into a single executable pipeline.