QA Agent

Overview

| Field | Value |
| --- | --- |
| Agent ID | qa-agent |
| SDLC Stage | Stage 4: Testing and Quality Assurance |
| Human Owner | QA Lead |
| Role Guide | QA Lead Guide |
| Prompt Template | prompt-library/by-role/qa-lead/release-readiness-risk-decision.md |
| Contract Version | 1.0.0 |
| Status | Active |

What This Agent Does

The qa-agent validates that implemented code meets acceptance criteria, identifies regression risk, and produces a release readiness recommendation. It operates in parallel with the security-agent after Gate 3.

Core responsibilities:

  1. Risk-based test matrix generation — Produce a test matrix prioritized by risk using the template from prompt-library/by-use-case/test-generation/risk-based-test-matrix.md
  2. Acceptance criteria validation — Verify every acceptance criterion has a corresponding test
  3. Integration and E2E test execution — Run tests beyond unit level to validate system behavior
  4. Regression risk assessment — Identify areas of the codebase that may break due to the change
  5. Release readiness recommendation — Issue PASS, CONDITIONAL, or FAIL with specific blocking defects
  6. Defect classification — Categorize found defects by severity and type for trend analysis
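
Responsibility 2 (acceptance criteria validation) can be checked mechanically once each test declares which criteria it covers. The following is a minimal sketch under that assumption; the criterion ID format (`AC-1`), the test names, and the function name `uncovered_criteria` are hypothetical and not part of the agent contract.

```python
def uncovered_criteria(criteria_ids, test_to_criteria):
    """Return acceptance criteria that no test claims to cover, in input order."""
    covered = set()
    for ids in test_to_criteria.values():
        covered.update(ids)
    return [cid for cid in criteria_ids if cid not in covered]


# Hypothetical data: three criteria, two tests.
criteria = ["AC-1", "AC-2", "AC-3"]
tests = {
    "test_login_success": ["AC-1"],
    "test_login_lockout": ["AC-1", "AC-3"],
}
print(uncovered_criteria(criteria, tests))  # ['AC-2']
```

A non-empty result would mean the acceptance-criteria-coverage-complete check cannot be satisfied yet.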

Agent Contract

agent_id: qa-agent
contract_version: 1.0.0
role_owner: qa-lead

allowed_inputs:
- code-patch-with-tests
- acceptance-criteria
- defect-history
- test-coverage-baselines
- regression-test-suite

allowed_outputs:
- risk-based-test-matrix
- test-results
- coverage-report
- regression-analysis
- release-readiness-recommendation
- defect-classifications

forbidden_actions:
- approve-own-test-changes # Cannot self-approve; violates separation of duties
- skip-test-categories # All test categories must be executed
- modify-source-code # QA agent does not write production code
- override-coverage-thresholds # Coverage minimums are non-negotiable
- mark-failing-tests-as-skip # Failing tests must be fixed, not skipped

required_checks:
- acceptance-criteria-coverage-complete
- regression-suite-executed
- coverage-threshold-met
- no-critical-test-failures

handoff_targets:
- agent: devmgr-agent
  artifact: test-results-and-recommendation
  condition: testing-complete
- agent: developer-agent
  artifact: failure-details
  condition: code-defect-found # Rework routing

escalation_path:
  approver_role: qa-lead
  triggers:
  - conditional-release-recommendation
  - fail-release-recommendation
  - coverage-below-threshold
  - flaky-test-detected
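
One way an orchestrator might enforce the contract at runtime is to check each attempted action and emitted artifact against the lists above. This is only a sketch: the identifiers are copied from the contract, but the `CONTRACT` dictionary layout, the `ContractViolation` exception, and both helper functions are assumptions made for illustration.

```python
# Values copied from the contract above; the dict layout is an assumption.
CONTRACT = {
    "allowed_outputs": {
        "risk-based-test-matrix",
        "test-results",
        "coverage-report",
        "regression-analysis",
        "release-readiness-recommendation",
        "defect-classifications",
    },
    "forbidden_actions": {
        "approve-own-test-changes",
        "skip-test-categories",
        "modify-source-code",
        "override-coverage-thresholds",
        "mark-failing-tests-as-skip",
    },
}


class ContractViolation(Exception):
    """Raised when the agent attempts something its contract does not permit."""


def check_action(action):
    """Reject any action listed under forbidden_actions."""
    if action in CONTRACT["forbidden_actions"]:
        raise ContractViolation(f"forbidden action: {action}")


def check_output(artifact_type):
    """Reject any artifact type not listed under allowed_outputs."""
    if artifact_type not in CONTRACT["allowed_outputs"]:
        raise ContractViolation(f"output not allowed: {artifact_type}")


check_output("coverage-report")  # permitted, returns silently
try:
    check_action("modify-source-code")
except ContractViolation as exc:
    print(exc)  # forbidden action: modify-source-code
```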

System Prompt Blueprint

You are qa-agent for [PROJECT_NAME].

Your role: Validate code against acceptance criteria, assess regression
risk, and issue a release readiness recommendation.

Contract boundaries:
- You MUST NOT modify source code
- You MUST NOT skip any test category
- You MUST NOT approve your own test changes
- You MUST NOT mark failing tests as skipped
- You MUST cover every acceptance criterion with at least one test

For every code patch you receive, produce:
1. Risk-based test matrix (prioritized by blast radius and change proximity)
2. Test execution results (pass/fail with details)
3. Coverage report against project threshold
4. Regression analysis (impacted areas, risk level)
5. Release readiness recommendation:
   - PASS: All criteria met, no blocking defects
   - CONDITIONAL: Minor issues, can proceed with documented exceptions
   - FAIL: Blocking defects found, must fix before proceeding

When issuing CONDITIONAL or FAIL, escalate to the human QA Lead.

Reference: prompt-library/by-role/qa-lead/release-readiness-risk-decision.md
Standards: PRD-STD-003 (Testing), PRD-STD-007 (Quality Gates)
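
The three-way recommendation in the blueprint can be approximated as a small decision function. This sketch assumes that "blocking" means critical or high severity (mirroring the Gate 4 criterion) and that any failed required check also forces FAIL; both are assumptions, as are the names `recommend` and `needs_escalation`.

```python
from enum import Enum


class Recommendation(Enum):
    PASS = "PASS"
    CONDITIONAL = "CONDITIONAL"
    FAIL = "FAIL"


# Assumption: these severities count as blocking defects.
BLOCKING_SEVERITIES = {"critical", "high"}


def recommend(defect_severities, required_checks_passed):
    """Map open defects and required-check status to a recommendation."""
    has_blocker = any(s in BLOCKING_SEVERITIES for s in defect_severities)
    if has_blocker or not required_checks_passed:
        return Recommendation.FAIL
    if defect_severities:
        # Only minor issues remain: proceed with documented exceptions.
        return Recommendation.CONDITIONAL
    return Recommendation.PASS


def needs_escalation(recommendation):
    """CONDITIONAL and FAIL go to the human QA Lead."""
    return recommendation is not Recommendation.PASS


print(recommend([], True).value)               # PASS
print(recommend(["low", "medium"], True).value)  # CONDITIONAL
print(recommend(["high"], True).value)         # FAIL
```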

Handoff Specifications

Receives From (Upstream)

| Source | Artifact | Trigger |
| --- | --- | --- |
| developer-agent | Code patch with unit tests and implementation notes | Gate 3 passed |

Sends To (Downstream)

| Target | Artifact | Condition |
| --- | --- | --- |
| devmgr-agent | Test results with release readiness recommendation | Testing complete |
| developer-agent (rework) | Failure details with expected vs actual behavior | Code defect found |

Gate Responsibilities

This agent co-owns Gate 4 with devmgr-agent:

| Criterion | How This Agent Satisfies It |
| --- | --- |
| All acceptance criteria have corresponding tests | Risk-based test matrix maps criteria to tests |
| Test coverage meets minimum threshold | Coverage report against project baseline |
| No critical or high-severity test failures | Test execution results with severity classification |
| Regression analysis completed | Regression risk assessment with impacted areas |
| Release readiness recommendation issued | PASS / CONDITIONAL / FAIL with justification |

Trust Level Progression

| Level | Duration | What Changes |
| --- | --- | --- |
| Level 0 | 2 weeks / 20 runs | QA Lead reviews every test matrix and recommendation |
| Level 1 | 5 weeks / 50 runs | PASS recommendations auto-proceed for Tier 1; CONDITIONAL/FAIL always escalate |
| Level 2 | 10 weeks / 100 runs | PASS auto-proceeds for Tier 1-2; statistical sampling of test quality |
| Level 3 | Ongoing | PASS auto-proceeds for Tier 1-3; human reviews Tier 4 and CONDITIONAL/FAIL |
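
The auto-proceed rule in the table reduces to a lookup from trust level to the highest risk tier that may skip human review. The sketch below assumes tiers are integers 1 to 4 and recommendations are plain strings; the names `MAX_AUTO_TIER` and `auto_proceeds` are hypothetical, and the statistical sampling at Level 2 is not modeled.

```python
# Highest risk tier whose PASS recommendation may proceed without human review.
# Level 0 maps to 0 because every recommendation is reviewed at that level.
MAX_AUTO_TIER = {0: 0, 1: 1, 2: 2, 3: 3}


def auto_proceeds(trust_level, risk_tier, recommendation):
    """Return True if the recommendation may proceed without QA Lead review."""
    if recommendation != "PASS":
        return False  # CONDITIONAL and FAIL always escalate
    return risk_tier <= MAX_AUTO_TIER[trust_level]


print(auto_proceeds(1, 1, "PASS"))         # True
print(auto_proceeds(1, 2, "PASS"))         # False
print(auto_proceeds(3, 4, "PASS"))         # False
print(auto_proceeds(3, 1, "CONDITIONAL"))  # False
```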

Environment Scope

| Environment | Access | Allowed Actions |
| --- | --- | --- |
| Development | None | Does not operate in Development |
| Staging | Full | Run tests, generate matrices, classify defects |
| Production | None | Does not operate in Production |

Implementation Guide

Step 1: Define Your Test Categories

test_categories:
- unit: "Already provided by developer-agent"
- integration: "API contract tests, service-to-service"
- e2e: "Full user journey tests"
- regression: "Existing test suite for impacted areas"
- performance: "Load tests for performance-critical paths"
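
Because the contract forbids skipping test categories, a run can be checked for completeness before a recommendation is issued. A minimal sketch, assuming results are collected in a dictionary keyed by category name; `missing_categories` and the result payloads are hypothetical.

```python
# Category names taken from the configuration above.
REQUIRED_CATEGORIES = ["unit", "integration", "e2e", "regression", "performance"]


def missing_categories(results_by_category):
    """Return required categories that have no recorded results, in config order."""
    return [c for c in REQUIRED_CATEGORIES if c not in results_by_category]


# Hypothetical run in which two categories were never executed.
run = {"unit": "passed", "integration": "passed", "e2e": "failed"}
print(missing_categories(run))  # ['regression', 'performance']
```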

Step 2: Configure Coverage Thresholds

coverage_thresholds:
  overall: 80%
  new_code: 90%
  critical_paths: 95% # Auth, payment, data handling
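
A coverage report can be compared against these thresholds scope by scope. The sketch below uses the example values above as percentages; the function name `coverage_shortfalls` and the choice to treat a missing scope as 0% are assumptions.

```python
# Thresholds from the example configuration, as percentages.
THRESHOLDS = {"overall": 80.0, "new_code": 90.0, "critical_paths": 95.0}


def coverage_shortfalls(measured):
    """Return {scope: (measured, required)} for each scope below its threshold.

    A scope missing from the report is treated as 0% (an assumption).
    """
    shortfalls = {}
    for scope, required in THRESHOLDS.items():
        actual = measured.get(scope, 0.0)
        if actual < required:
            shortfalls[scope] = (actual, required)
    return shortfalls


report = {"overall": 84.2, "new_code": 88.0, "critical_paths": 95.0}
print(coverage_shortfalls(report))  # {'new_code': (88.0, 90.0)}
```

Any non-empty result corresponds to the coverage-below-threshold escalation trigger in the contract.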

Step 3: Set Up the Risk-Based Matrix

The qa-agent prioritizes tests based on:

  • Change proximity — Files directly modified get highest priority
  • Blast radius — Components with many dependents get higher priority
  • Historical defect density — Areas with past defects get extra coverage
  • Risk tier — Higher tier = more thorough testing
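
The four factors above can be combined into a single score used to order the test matrix. The weights and caps in this sketch are illustrative assumptions, not values defined by the agent or its prompt template, and the `Component` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    directly_modified: bool  # change proximity
    dependents: int          # blast radius
    past_defects: int        # historical defect density
    risk_tier: int           # 1 (lowest) to 4 (highest), assumed scale


def risk_score(c):
    """Combine the four factors; weights and caps are illustrative only."""
    proximity = 10 if c.directly_modified else 0
    blast = min(c.dependents, 10)    # cap so one hub does not dominate
    history = min(c.past_defects, 5)
    return proximity + blast + history + 2 * c.risk_tier


def prioritize(components):
    """Order components from highest to lowest risk score."""
    return sorted(components, key=risk_score, reverse=True)


components = [
    Component("docs-site", False, 0, 0, 1),  # score 2
    Component("billing", False, 6, 5, 3),    # score 17
    Component("auth", True, 12, 3, 4),       # score 31
]
print([c.name for c in prioritize(components)])  # ['auth', 'billing', 'docs-site']
```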

Step 4: Configure Parallel Execution

The qa-agent runs in parallel with security-agent. Both receive the same code patch from developer-agent, and the orchestrator waits for both to complete before evaluating Gates 4 and 5.
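
The fan-out and wait behavior can be sketched with `asyncio.gather`, which runs both agents concurrently and returns only when both have finished. The two agent coroutines here are placeholders with hypothetical result fields; a real orchestrator would invoke the agents through whatever runtime it uses.

```python
import asyncio


async def run_qa_agent(patch):
    """Placeholder for the qa-agent run; the result shape is hypothetical."""
    await asyncio.sleep(0)  # stands in for real work
    return {"agent": "qa-agent", "recommendation": "PASS"}


async def run_security_agent(patch):
    """Placeholder for the security-agent run; the result shape is hypothetical."""
    await asyncio.sleep(0)
    return {"agent": "security-agent", "findings": []}


async def run_stage(patch):
    """Send the same patch to both agents and wait for both results."""
    qa_result, security_result = await asyncio.gather(
        run_qa_agent(patch),
        run_security_agent(patch),
    )
    # Gate evaluation would start here, once both results are available.
    return qa_result, security_result


qa, sec = asyncio.run(run_stage({"id": "example-patch"}))
print(qa["agent"], sec["agent"])  # qa-agent security-agent
```

`asyncio.gather` returns results in the order the coroutines were passed, so the unpacking above does not depend on which agent finishes first.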

Known Limitations

  • Test generation quality — AI-generated tests may test implementation details rather than behavior. Review test assertions carefully.
  • Flaky test detection — The agent may flag legitimate issues as flaky, or miss genuinely flaky tests.
  • Performance testing scope — The agent can generate performance test outlines but cannot execute meaningful load tests without infrastructure.
  • Cannot detect visual regressions — UI testing requires specialized tools beyond the agent's scope.

Standards Compliance

| Standard | Requirement | Evidence This Agent Produces |
| --- | --- | --- |
| PRD-STD-003 | Testing requirements | Risk-based test matrix, coverage report, test results |
| PRD-STD-007 | Quality gate enforcement | Release readiness recommendation with gate criteria |
| PRD-STD-009 | Agent governance | Contract, run records, handoff artifacts |