# Metrics Pipeline Setup

Clone the repository to get started:

```bash
git clone https://github.com/AEEF-AI/aeef-transform.git
```

The metrics pipeline automates the collection and validation of AEEF Key Performance Indicators (KPIs). It captures data from CI runs, code reviews, and security scans, then structures that data into schema-validated records suitable for dashboard visualization and trend analysis.
For the full list of metric definitions, see the KPI Framework.
## KPI Schema Overview

All metrics records conform to the `kpi-record.schema.json` schema:
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["recordId", "timestamp", "period", "category", "metrics"],
  "properties": {
    "recordId": { "type": "string", "format": "uuid" },
    "timestamp": { "type": "string", "format": "date-time" },
    "period": {
      "type": "object",
      "properties": {
        "start": { "type": "string", "format": "date" },
        "end": { "type": "string", "format": "date" },
        "cadence": { "enum": ["daily", "weekly", "sprint", "monthly"] }
      }
    },
    "category": {
      "enum": ["productivity", "risk", "financial"]
    },
    "metrics": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["name", "value", "unit"],
        "properties": {
          "name": { "type": "string" },
          "value": { "type": "number" },
          "unit": { "type": "string" },
          "target": { "type": "number" },
          "threshold": { "type": "number" }
        }
      }
    },
    "source": {
      "type": "object",
      "properties": {
        "repository": { "type": "string" },
        "pipeline": { "type": "string" },
        "collector": { "type": "string" }
      }
    }
  }
}
```
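As an illustration, a weekly productivity record conforming to this schema might look like the following. All values and metric names here are hypothetical, not actual pipeline output:

```json
{
  "recordId": "3f2b8c1e-9a4d-4e7b-b1c2-6d5e8f9a0b1c",
  "timestamp": "2024-04-08T09:00:00Z",
  "period": {
    "start": "2024-04-01",
    "end": "2024-04-07",
    "cadence": "weekly"
  },
  "category": "productivity",
  "metrics": [
    { "name": "ai_contribution_ratio", "value": 34.2, "unit": "percent", "target": 40 },
    { "name": "pr_cycle_time", "value": 18.5, "unit": "hours", "threshold": 48 }
  ],
  "source": {
    "repository": "AEEF-AI/aeef-transform",
    "pipeline": "ci-main",
    "collector": "collect-kpis.sh"
  }
}
```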
## Provenance Record Generation
Every CI pipeline run generates a provenance record that captures the full build context. This record is the foundational data source for KPI aggregation.
### What a Provenance Record Contains
- **Build metadata** -- Repository, branch, commit SHA, build ID, timestamp
- **Stage outcomes** -- Pass/fail status and duration for each pipeline stage
- **Coverage data** -- Line coverage, branch coverage, mutation score
- **Security findings** -- SAST finding count, SCA vulnerability count
- **AI contribution** -- Whether AI tools were disclosed, which tools were used
- **Dependency snapshot** -- Package count, license distribution
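The fields above can be sketched as a TypeScript type. This is a hypothetical shape for illustration; the actual property names are defined by the generator scripts:

```typescript
// Hypothetical shape of a provenance record; field names are
// assumptions, not the generator's actual output format.
interface ProvenanceRecord {
  build: {
    repository: string;
    branch: string;
    commitSha: string;
    buildId: string;
    timestamp: string; // ISO 8601
  };
  stages: Array<{ name: string; passed: boolean; durationMs: number }>;
  coverage: { line: number; branch: number; mutationScore: number };
  security: { sastFindings: number; scaVulnerabilities: number };
  ai: { disclosed: boolean; tools: string[] };
  dependencies: { packageCount: number; licenses: Record<string, number> };
}

// Minimal example instance with made-up values.
const record: ProvenanceRecord = {
  build: {
    repository: "AEEF-AI/aeef-transform",
    branch: "main",
    commitSha: "abc1234",
    buildId: "run-42",
    timestamp: new Date().toISOString(),
  },
  stages: [{ name: "test", passed: true, durationMs: 92_000 }],
  coverage: { line: 87.5, branch: 80.1, mutationScore: 71.0 },
  security: { sastFindings: 0, scaVulnerabilities: 2 },
  ai: { disclosed: true, tools: ["copilot"] },
  dependencies: { packageCount: 143, licenses: { MIT: 120, "Apache-2.0": 23 } },
};
```

Typing the record up front means aggregation code fails at compile time rather than at dashboard time when a field is renamed.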
### Generation Script
The provenance generator runs as a post-pipeline step:
```bash
# TypeScript
node scripts/generate-provenance.js --output provenance/

# Python
uv run python scripts/generate_provenance.py --output provenance/

# Go
go run scripts/generate_provenance.go --output provenance/
```
Generated records are stored in the `provenance/` directory and validated against the schema in CI.
## Data Collection Scripts
Weekly collection scripts aggregate provenance records into KPI measurements:
`scripts/collect-kpis.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

PERIOD_START="${1:?Usage: collect-kpis.sh <start-date> <end-date>}"
PERIOD_END="${2:?Usage: collect-kpis.sh <start-date> <end-date>}"
OUTPUT_DIR="metrics"

mkdir -p "$OUTPUT_DIR"

# Collect productivity metrics
node scripts/aggregate-productivity.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/productivity-${PERIOD_START}.json"

# Collect risk metrics
node scripts/aggregate-risk.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/risk-${PERIOD_START}.json"

# Validate all outputs against the KPI schema
for metric_file in "$OUTPUT_DIR"/*.json; do
  npx ajv validate -s schemas/kpi-record.schema.json -d "$metric_file"
done

echo "KPI collection complete for period $PERIOD_START to $PERIOD_END"
```
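To make the aggregation step concrete, here is a simplified sketch of what an aggregator like `aggregate-productivity.js` might do internally: load provenance records, filter them to the requested period, and compute the AI contribution ratio. The record fields and the metric definition (share of builds carrying an AI disclosure) are assumptions for illustration, not the script's actual implementation:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Minimal slice of a provenance record; real records carry more fields.
interface Provenance {
  timestamp: string;          // ISO 8601 build time
  ai: { disclosed: boolean }; // assumed AI-disclosure flag
}

// Load every *.json record from a provenance directory.
function loadRecords(dir: string): Provenance[] {
  return fs
    .readdirSync(dir)
    .filter((f) => f.endsWith(".json"))
    .map((f) => JSON.parse(fs.readFileSync(path.join(dir, f), "utf8")));
}

// Percentage of builds in [start, end] (YYYY-MM-DD, inclusive) that disclosed AI use.
function aiContributionRatio(records: Provenance[], start: string, end: string): number {
  const inPeriod = records.filter((r) => {
    const day = r.timestamp.slice(0, 10); // ISO dates compare lexicographically
    return day >= start && day <= end;
  });
  if (inPeriod.length === 0) return 0;
  const disclosed = inPeriod.filter((r) => r.ai.disclosed).length;
  return (disclosed / inPeriod.length) * 100;
}
```

Comparing the date prefix rather than the full timestamp keeps the end date inclusive, matching how the collection script passes whole-day boundaries.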
## Key Metrics Collected
| Category | Metric | Source | Unit |
|---|---|---|---|
| Productivity | AI contribution ratio | Provenance records (AI disclosure flags) | Percentage |
| Productivity | PR cycle time | Git/GitHub API | Hours |
| Productivity | Deployment frequency | CI pipeline runs to main | Per week |
| Risk | Defect density | Test failure rate across provenance records | Defects per KLOC |
| Risk | Security scan pass rate | SAST/SCA stage outcomes | Percentage |
| Risk | Mutation score trend | Mutation testing stage outcomes | Percentage |
| Financial | AI tool cost per developer | Billing API integration | USD per month |
| Financial | Time saved estimate | PR cycle time delta vs baseline | Hours per sprint |
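As a worked example of one row in the table, defect density divides defect count by thousands of lines of code. A minimal sketch; the input numbers are hypothetical, and in the pipeline they would come from provenance records plus a line-count step:

```typescript
// Defect density: defects per thousand lines of code (KLOC).
// Inputs are illustrative placeholders, not real pipeline data.
function defectDensity(defects: number, linesOfCode: number): number {
  if (linesOfCode <= 0) throw new Error("linesOfCode must be positive");
  return defects / (linesOfCode / 1000);
}

// e.g. 12 defects across 48,000 lines -> 0.25 defects per KLOC
```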
## Dashboard Integration Options
The metrics pipeline produces JSON records that can feed into various visualization tools:
### Option 1: Grafana (Recommended for Production Tier)
Export KPI records to a time-series database (InfluxDB or Prometheus) and build Grafana dashboards. The Production tier includes pre-built dashboard definitions.
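For the InfluxDB route, each metric entry maps naturally onto one line-protocol point. A sketch under stated assumptions: the measurement name `kpi` and the tag keys are illustrative choices, not names the pipeline prescribes:

```typescript
// Convert one KPI metric entry into InfluxDB line protocol:
//   measurement,tag=value field=value timestamp
// Measurement ("kpi") and tag names are illustrative assumptions.
function toLineProtocol(
  category: string,
  name: string,
  value: number,
  timestampNs: bigint
): string {
  return `kpi,category=${category},metric=${name} value=${value} ${timestampNs}`;
}
```

Lines in this format can be batched and POSTed to InfluxDB's write endpoint, after which Grafana queries them like any other time series.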
### Option 2: GitHub Actions Summary

For simpler setups, render KPI summaries directly in GitHub Actions job summaries:

```yaml
- name: KPI Summary
  run: |
    node scripts/render-kpi-summary.js --input metrics/ >> $GITHUB_STEP_SUMMARY
```
### Option 3: Static Site

Generate a static HTML report from KPI records and publish it as a GitHub Pages site or artifact:

```bash
node scripts/render-kpi-report.js --input metrics/ --output reports/kpi-report.html
```
### Option 4: CSV Export

Export KPI records as CSV for import into spreadsheets or BI tools:

```bash
node scripts/export-csv.js --input metrics/ --output reports/kpis.csv
```
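A CSV exporter typically flattens each metric entry into one row. A minimal sketch of the idea, assuming the KPI record shape from the schema above (this is not the actual `export-csv.js` implementation):

```typescript
interface KpiMetric { name: string; value: number; unit: string }
interface KpiRecord {
  recordId: string;
  category: string;
  period: { start: string; end: string };
  metrics: KpiMetric[];
}

// Flatten KPI records into CSV: one row per metric entry.
// Assumes values contain no commas; a real exporter would quote fields.
function toCsv(records: KpiRecord[]): string {
  const header = "recordId,category,periodStart,periodEnd,metric,value,unit";
  const rows = records.flatMap((r) =>
    r.metrics.map((m) =>
      [r.recordId, r.category, r.period.start, r.period.end, m.name, m.value, m.unit].join(",")
    )
  );
  return [header, ...rows].join("\n");
}
```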
## Scheduling Collection

Set up a weekly cron job in GitHub Actions to run the collection automatically:

```yaml
name: Weekly KPI Collection

on:
  schedule:
    - cron: '0 9 * * 1'  # Every Monday at 9am UTC

jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/collect-kpis.sh "$(date -d '7 days ago' +%Y-%m-%d)" "$(date +%Y-%m-%d)"
      - uses: actions/upload-artifact@v4
        with:
          name: kpi-records
          path: metrics/
```
## Next Steps
- KPI definitions: KPI Framework for the full metric catalog
- Productivity metrics: Productivity Metrics for detailed measurement guidance
- Production monitoring: Monitoring Setup for Grafana dashboards and alerting