Metrics Pipeline Setup

git clone https://github.com/AEEF-AI/aeef-transform.git

The metrics pipeline automates the collection and validation of AEEF Key Performance Indicators (KPIs). It captures data from CI runs, code reviews, and security scans, then structures that data into schema-validated records suitable for dashboard visualization and trend analysis.

For the full list of metric definitions, see the KPI Framework.

KPI Schema Overview

All metrics records conform to the kpi-record.schema.json schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["recordId", "timestamp", "period", "category", "metrics"],
  "properties": {
    "recordId": { "type": "string", "format": "uuid" },
    "timestamp": { "type": "string", "format": "date-time" },
    "period": {
      "type": "object",
      "properties": {
        "start": { "type": "string", "format": "date" },
        "end": { "type": "string", "format": "date" },
        "cadence": { "enum": ["daily", "weekly", "sprint", "monthly"] }
      }
    },
    "category": {
      "enum": ["productivity", "risk", "financial"]
    },
    "metrics": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["name", "value", "unit"],
        "properties": {
          "name": { "type": "string" },
          "value": { "type": "number" },
          "unit": { "type": "string" },
          "target": { "type": "number" },
          "threshold": { "type": "number" }
        }
      }
    },
    "source": {
      "type": "object",
      "properties": {
        "repository": { "type": "string" },
        "pipeline": { "type": "string" },
        "collector": { "type": "string" }
      }
    }
  }
}
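A record conforming to this schema might look like the following. The values here are invented for illustration; the structural check at the end simply mirrors the schema's top-level `required` list.

```typescript
// Hypothetical record illustrating the schema above; all values are invented.
const sampleRecord = {
  recordId: "3f1c2a9e-8b4d-4f6a-9c1e-2d5b7a8c9e0f",
  timestamp: "2024-06-10T09:00:00Z",
  period: { start: "2024-06-03", end: "2024-06-09", cadence: "weekly" },
  category: "productivity",
  metrics: [
    { name: "pr_cycle_time", value: 18.4, unit: "hours", target: 12 },
  ],
  source: {
    repository: "AEEF-AI/aeef-transform",
    pipeline: "ci",
    collector: "collect-kpis.sh",
  },
};

// Lightweight structural check mirroring the schema's top-level `required` list.
const requiredKeys = ["recordId", "timestamp", "period", "category", "metrics"];
const missing = requiredKeys.filter((key) => !(key in sampleRecord));
console.log(missing.length === 0 ? "all required keys present" : `missing: ${missing.join(", ")}`);
```

In CI, full validation is done against the schema itself (via ajv, as shown later); a check like this is only useful for quick smoke tests.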

Provenance Record Generation

Every CI pipeline run generates a provenance record that captures the full build context. This record is the foundational data source for KPI aggregation.

What a Provenance Record Contains

  • Build metadata -- Repository, branch, commit SHA, build ID, timestamp
  • Stage outcomes -- Pass/fail status and duration for each pipeline stage
  • Coverage data -- Line coverage, branch coverage, mutation score
  • Security findings -- SAST finding count, SCA vulnerability count
  • AI contribution -- Whether AI tools were disclosed, which tools were used
  • Dependency snapshot -- Package count, license distribution
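A provenance record covering those bullets might take a shape like the following. The field names here are illustrative assumptions; the authoritative structure is whatever scripts/generate-provenance.js actually emits.

```typescript
// Illustrative shape for a provenance record, derived from the bullet list
// above. Field names are assumptions, not the tool's actual output format.
interface ProvenanceRecord {
  build: { repository: string; branch: string; commitSha: string; buildId: string; timestamp: string };
  stages: { name: string; status: "pass" | "fail"; durationSeconds: number }[];
  coverage: { line: number; branch: number; mutationScore: number };
  security: { sastFindings: number; scaVulnerabilities: number };
  aiContribution: { disclosed: boolean; tools: string[] };
  dependencies: { packageCount: number; licenses: Record<string, number> };
}

const record: ProvenanceRecord = {
  build: { repository: "AEEF-AI/aeef-transform", branch: "main", commitSha: "abc1234", buildId: "run-42", timestamp: "2024-06-10T08:55:00Z" },
  stages: [{ name: "test", status: "pass", durationSeconds: 312 }],
  coverage: { line: 87.5, branch: 79.1, mutationScore: 64.0 },
  security: { sastFindings: 0, scaVulnerabilities: 2 },
  aiContribution: { disclosed: true, tools: ["copilot"] },
  dependencies: { packageCount: 148, licenses: { "MIT": 120, "Apache-2.0": 28 } },
};
console.log(record.stages.every((s) => s.status === "pass") ? "all stages passed" : "failures present");
```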

Generation Script

The provenance generator runs as a post-pipeline step:

# TypeScript
node scripts/generate-provenance.js --output provenance/

# Python
uv run python scripts/generate_provenance.py --output provenance/

# Go
go run scripts/generate_provenance.go --output provenance/

Generated records are stored in the provenance/ directory and validated against the schema in CI.

Data Collection Scripts

Weekly collection scripts aggregate provenance records into KPI measurements:

scripts/collect-kpis.sh

#!/usr/bin/env bash
set -euo pipefail

PERIOD_START="${1:?Usage: collect-kpis.sh <start-date> <end-date>}"
PERIOD_END="${2:?Usage: collect-kpis.sh <start-date> <end-date>}"
OUTPUT_DIR="metrics"

mkdir -p "$OUTPUT_DIR"

# Collect productivity metrics
node scripts/aggregate-productivity.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/productivity-${PERIOD_START}.json"

# Collect risk metrics
node scripts/aggregate-risk.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/risk-${PERIOD_START}.json"

# Validate all outputs
for metric_file in "$OUTPUT_DIR"/*.json; do
  npx ajv validate -s schemas/kpi-record.schema.json -d "$metric_file"
done

echo "KPI collection complete for period $PERIOD_START to $PERIOD_END"
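Internally, each aggregate-* script first has to select the provenance records that fall inside the reporting window. A minimal sketch of that filtering step, assuming ISO-8601 timestamps and an invented record shape (ISO-8601 strings compare correctly as plain strings, so no date parsing is needed):

```typescript
// Window filtering as the aggregation scripts might do it.
// Record shape and field names are assumptions for illustration.
type Provenance = { buildId: string; timestamp: string; stagesPassed: boolean };

function inWindow(records: Provenance[], start: string, end: string): Provenance[] {
  // ISO-8601 date/time strings sort lexicographically, so string
  // comparison is sufficient; extend `end` to cover the whole final day.
  return records.filter((r) => r.timestamp >= start && r.timestamp <= end + "T23:59:59Z");
}

const records: Provenance[] = [
  { buildId: "run-40", timestamp: "2024-06-02T10:00:00Z", stagesPassed: true },
  { buildId: "run-41", timestamp: "2024-06-05T10:00:00Z", stagesPassed: false },
  { buildId: "run-42", timestamp: "2024-06-08T10:00:00Z", stagesPassed: true },
];
const window = inWindow(records, "2024-06-03", "2024-06-09");
console.log(window.map((r) => r.buildId)); // run-41 and run-42 fall inside the week
```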

Key Metrics Collected

| Category     | Metric                     | Source                                      | Unit             |
|--------------|----------------------------|---------------------------------------------|------------------|
| Productivity | AI contribution ratio      | Provenance records (AI disclosure flags)    | Percentage       |
| Productivity | PR cycle time              | Git/GitHub API                              | Hours            |
| Productivity | Deployment frequency       | CI pipeline runs to main                    | Per week         |
| Risk         | Defect density             | Test failure rate across provenance records | Defects per KLOC |
| Risk         | Security scan pass rate    | SAST/SCA stage outcomes                     | Percentage       |
| Risk         | Mutation score trend       | Mutation testing stage outcomes             | Percentage       |
| Financial    | AI tool cost per developer | Billing API integration                     | USD per month    |
| Financial    | Time saved estimate        | PR cycle time delta vs baseline             | Hours per sprint |
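As one concrete example, the AI contribution ratio in the table above can be computed directly from the disclosure flags in provenance records. A sketch, assuming a minimal record shape:

```typescript
// Compute the AI contribution ratio from provenance disclosure flags.
// The record shape is an assumption for illustration.
type Run = { aiDisclosed: boolean };

function aiContributionRatio(runs: Run[]): number {
  if (runs.length === 0) return 0; // avoid division by zero for empty windows
  const disclosed = runs.filter((r) => r.aiDisclosed).length;
  return (disclosed / runs.length) * 100;
}

const runs: Run[] = [
  { aiDisclosed: true },
  { aiDisclosed: true },
  { aiDisclosed: false },
  { aiDisclosed: true },
];
console.log(aiContributionRatio(runs)); // 75
```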

Dashboard Integration Options

The metrics pipeline produces JSON records that can feed into various visualization tools:

Option 1: Grafana Dashboards

Export KPI records to a time-series database (InfluxDB or Prometheus) and build Grafana dashboards. The Production tier includes pre-built dashboard definitions.
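For the InfluxDB route, each KPI record has to be rewritten as line protocol points. A sketch of that conversion; the measurement name and tag set here are my own choices, not a defined AEEF convention:

```typescript
// Convert a KPI record into InfluxDB line protocol (one point per metric).
// Measurement/tag naming is an assumption for illustration.
type Metric = { name: string; value: number; unit: string };
type KpiRecord = { category: string; timestamp: string; metrics: Metric[] };

function toLineProtocol(record: KpiRecord): string[] {
  // Line protocol expects a nanosecond epoch timestamp; use BigInt to
  // avoid precision loss beyond Number.MAX_SAFE_INTEGER.
  const ns = BigInt(Date.parse(record.timestamp)) * 1_000_000n;
  return record.metrics.map(
    (m) => `kpi,category=${record.category},unit=${m.unit} ${m.name}=${m.value} ${ns}`
  );
}

const lines = toLineProtocol({
  category: "risk",
  timestamp: "2024-06-10T09:00:00Z",
  metrics: [{ name: "security_scan_pass_rate", value: 98.5, unit: "percent" }],
});
console.log(lines[0]);
```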

Option 2: GitHub Actions Summary

For simpler setups, render KPI summaries directly in GitHub Actions job summaries:

- name: KPI Summary
  run: |
    node scripts/render-kpi-summary.js --input metrics/ >> $GITHUB_STEP_SUMMARY

Option 3: Static Site

Generate a static HTML report from KPI records and publish it as a GitHub Pages site or artifact:

node scripts/render-kpi-report.js --input metrics/ --output reports/kpi-report.html

Option 4: CSV Export

Export KPI records as CSV for import into spreadsheets or BI tools:

node scripts/export-csv.js --input metrics/ --output reports/kpis.csv
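The CSV export amounts to flattening each record into one row per metric. A sketch of what such an export script might do; the column set is an assumption, not the actual export-csv.js format:

```typescript
// Flatten KPI records into CSV, one row per metric.
// Column set is an illustrative assumption.
type CsvMetric = { name: string; value: number; unit: string };
type CsvRecord = { category: string; period: { start: string; end: string }; metrics: CsvMetric[] };

function toCsv(records: CsvRecord[]): string {
  const header = "category,period_start,period_end,metric,value,unit";
  const rows = records.flatMap((r) =>
    r.metrics.map((m) =>
      [r.category, r.period.start, r.period.end, m.name, m.value, m.unit].join(",")
    )
  );
  return [header, ...rows].join("\n");
}

const csv = toCsv([
  {
    category: "financial",
    period: { start: "2024-06-03", end: "2024-06-09" },
    metrics: [{ name: "ai_tool_cost", value: 19, unit: "usd_per_developer_month" }],
  },
]);
console.log(csv);
```

Note that this naive join does no quoting; if metric names could ever contain commas, a proper CSV writer should be used instead.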

Scheduling Collection

Set up a weekly cron job in GitHub Actions to run the collection automatically:

name: Weekly KPI Collection
on:
  schedule:
    - cron: '0 9 * * 1' # Every Monday at 9am UTC

jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/collect-kpis.sh "$(date -d '7 days ago' +%Y-%m-%d)" "$(date +%Y-%m-%d)"
      - uses: actions/upload-artifact@v4
        with:
          name: kpi-records
          path: metrics/

Next Steps