Metrics Pipeline Setup

git clone https://github.com/AEEF-AI/aeef-transform.git

The metrics pipeline automates the collection and validation of AEEF Key Performance Indicators (KPIs). It captures data from CI runs, code reviews, and security scans, then structures that data into schema-validated records suitable for dashboard visualization and trend analysis.

For the full list of metric definitions, see the KPI Framework.

KPI Schema Overview

All metrics records conform to the kpi-record.schema.json schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["recordId", "timestamp", "period", "category", "metrics"],
  "properties": {
    "recordId": { "type": "string", "format": "uuid" },
    "timestamp": { "type": "string", "format": "date-time" },
    "period": {
      "type": "object",
      "properties": {
        "start": { "type": "string", "format": "date" },
        "end": { "type": "string", "format": "date" },
        "cadence": { "enum": ["daily", "weekly", "sprint", "monthly"] }
      }
    },
    "category": {
      "enum": ["productivity", "risk", "financial"]
    },
    "metrics": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["name", "value", "unit"],
        "properties": {
          "name": { "type": "string" },
          "value": { "type": "number" },
          "unit": { "type": "string" },
          "target": { "type": "number" },
          "threshold": { "type": "number" }
        }
      }
    },
    "source": {
      "type": "object",
      "properties": {
        "repository": { "type": "string" },
        "pipeline": { "type": "string" },
        "collector": { "type": "string" }
      }
    }
  }
}
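A record conforming to this schema might look like the following. The values here are invented for illustration; the structural check at the end simply mirrors the schema's top-level `required` list.

```typescript
// Hypothetical record illustrating the schema above; all values are invented.
const sampleRecord = {
  recordId: "3f1c2a9e-8b4d-4f6a-9c1e-2d5b7a8c9e0f",
  timestamp: "2024-06-10T09:00:00Z",
  period: { start: "2024-06-03", end: "2024-06-09", cadence: "weekly" },
  category: "productivity",
  metrics: [
    { name: "pr_cycle_time", value: 18.4, unit: "hours", target: 12 },
  ],
  source: {
    repository: "AEEF-AI/aeef-transform",
    pipeline: "ci",
    collector: "collect-kpis.sh",
  },
};

// Lightweight structural check mirroring the schema's top-level `required` list.
const requiredKeys = ["recordId", "timestamp", "period", "category", "metrics"];
const missing = requiredKeys.filter((key) => !(key in sampleRecord));
console.log(missing.length === 0 ? "all required keys present" : `missing: ${missing.join(", ")}`);
```

In CI, full validation is done against the schema itself (via ajv, as shown later); a check like this is only useful for quick smoke tests.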

Provenance Record Generation

Every CI pipeline run generates a provenance record that captures the full build context. This record is the foundational data source for KPI aggregation.

What a Provenance Record Contains

  • Build metadata -- Repository, branch, commit SHA, build ID, timestamp
  • Stage outcomes -- Pass/fail status and duration for each pipeline stage
  • Coverage data -- Line coverage, branch coverage, mutation score
  • Security findings -- SAST finding count, SCA vulnerability count
  • AI contribution -- Whether AI tools were disclosed, which tools were used
  • Dependency snapshot -- Package count, license distribution
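A provenance record covering those bullets might take a shape like the following. The field names here are illustrative assumptions; the authoritative structure is whatever scripts/generate-provenance.js actually emits.

```typescript
// Illustrative shape for a provenance record, derived from the bullet list
// above. Field names are assumptions, not the tool's actual output format.
interface ProvenanceRecord {
  build: { repository: string; branch: string; commitSha: string; buildId: string; timestamp: string };
  stages: { name: string; status: "pass" | "fail"; durationSeconds: number }[];
  coverage: { line: number; branch: number; mutationScore: number };
  security: { sastFindings: number; scaVulnerabilities: number };
  aiContribution: { disclosed: boolean; tools: string[] };
  dependencies: { packageCount: number; licenses: Record<string, number> };
}

const record: ProvenanceRecord = {
  build: { repository: "AEEF-AI/aeef-transform", branch: "main", commitSha: "abc1234", buildId: "run-42", timestamp: "2024-06-10T08:55:00Z" },
  stages: [{ name: "test", status: "pass", durationSeconds: 312 }],
  coverage: { line: 87.5, branch: 79.1, mutationScore: 64.0 },
  security: { sastFindings: 0, scaVulnerabilities: 2 },
  aiContribution: { disclosed: true, tools: ["copilot"] },
  dependencies: { packageCount: 148, licenses: { "MIT": 120, "Apache-2.0": 28 } },
};
console.log(record.stages.every((s) => s.status === "pass") ? "all stages passed" : "failures present");
```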

Generation Script

The provenance generator runs as a post-pipeline step:

# TypeScript
node scripts/generate-provenance.js --output provenance/

# Python
uv run python scripts/generate_provenance.py --output provenance/

# Go
go run scripts/generate_provenance.go --output provenance/

Generated records are stored in the provenance/ directory and validated against the schema in CI.

Data Collection Scripts

Weekly collection scripts aggregate provenance records into KPI measurements:

scripts/collect-kpis.sh

#!/usr/bin/env bash
set -euo pipefail

PERIOD_START="${1:?Usage: collect-kpis.sh <start-date> <end-date>}"
PERIOD_END="${2:?Usage: collect-kpis.sh <start-date> <end-date>}"
OUTPUT_DIR="metrics"

mkdir -p "$OUTPUT_DIR"

# Collect productivity metrics
node scripts/aggregate-productivity.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/productivity-${PERIOD_START}.json"

# Collect risk metrics
node scripts/aggregate-risk.js \
  --start "$PERIOD_START" \
  --end "$PERIOD_END" \
  --provenance-dir provenance/ \
  --output "$OUTPUT_DIR/risk-${PERIOD_START}.json"

# Validate all outputs
for metric_file in "$OUTPUT_DIR"/*.json; do
  npx ajv validate -s schemas/kpi-record.schema.json -d "$metric_file"
done

echo "KPI collection complete for period $PERIOD_START to $PERIOD_END"
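Internally, each aggregate-* script first has to select the provenance records that fall inside the reporting window. A minimal sketch of that filtering step, assuming ISO-8601 timestamps and an invented record shape (ISO-8601 strings compare correctly as plain strings, so no date parsing is needed):

```typescript
// Window filtering as the aggregation scripts might do it.
// Record shape and field names are assumptions for illustration.
type Provenance = { buildId: string; timestamp: string; stagesPassed: boolean };

function inWindow(records: Provenance[], start: string, end: string): Provenance[] {
  // ISO-8601 date/time strings sort lexicographically, so string
  // comparison is sufficient; extend `end` to cover the whole final day.
  return records.filter((r) => r.timestamp >= start && r.timestamp <= end + "T23:59:59Z");
}

const records: Provenance[] = [
  { buildId: "run-40", timestamp: "2024-06-02T10:00:00Z", stagesPassed: true },
  { buildId: "run-41", timestamp: "2024-06-05T10:00:00Z", stagesPassed: false },
  { buildId: "run-42", timestamp: "2024-06-08T10:00:00Z", stagesPassed: true },
];
const window = inWindow(records, "2024-06-03", "2024-06-09");
console.log(window.map((r) => r.buildId)); // run-41 and run-42 fall inside the week
```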

Key Metrics Collected

| Category     | Metric                     | Source                                      | Unit             |
|--------------|----------------------------|---------------------------------------------|------------------|
| Productivity | AI contribution ratio      | Provenance records (AI disclosure flags)    | Percentage       |
| Productivity | PR cycle time              | Git/GitHub API                              | Hours            |
| Productivity | Deployment frequency       | CI pipeline runs to main                    | Per week         |
| Risk         | Defect density             | Test failure rate across provenance records | Defects per KLOC |
| Risk         | Security scan pass rate    | SAST/SCA stage outcomes                     | Percentage       |
| Risk         | Mutation score trend       | Mutation testing stage outcomes             | Percentage       |
| Financial    | AI tool cost per developer | Billing API integration                     | USD per month    |
| Financial    | Time saved estimate        | PR cycle time delta vs baseline             | Hours per sprint |
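As one concrete example, the AI contribution ratio in the table above can be computed directly from the disclosure flags in provenance records. A sketch, assuming a minimal record shape:

```typescript
// Compute the AI contribution ratio from provenance disclosure flags.
// The record shape is an assumption for illustration.
type Run = { aiDisclosed: boolean };

function aiContributionRatio(runs: Run[]): number {
  if (runs.length === 0) return 0; // avoid division by zero for empty windows
  const disclosed = runs.filter((r) => r.aiDisclosed).length;
  return (disclosed / runs.length) * 100;
}

const runs: Run[] = [
  { aiDisclosed: true },
  { aiDisclosed: true },
  { aiDisclosed: false },
  { aiDisclosed: true },
];
console.log(aiContributionRatio(runs)); // 75
```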

Dashboard Integration Options

The metrics pipeline produces JSON records that can feed into various visualization tools:

Option 1: Grafana Dashboards

Export KPI records to a time-series database (InfluxDB or Prometheus) and build Grafana dashboards. The Production tier includes pre-built dashboard definitions.
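For the InfluxDB route, each KPI record has to be rewritten as line protocol points. A sketch of that conversion; the measurement name and tag set here are my own choices, not a defined AEEF convention:

```typescript
// Convert a KPI record into InfluxDB line protocol (one point per metric).
// Measurement/tag naming is an assumption for illustration.
type Metric = { name: string; value: number; unit: string };
type KpiRecord = { category: string; timestamp: string; metrics: Metric[] };

function toLineProtocol(record: KpiRecord): string[] {
  // Line protocol expects a nanosecond epoch timestamp; use BigInt to
  // avoid precision loss beyond Number.MAX_SAFE_INTEGER.
  const ns = BigInt(Date.parse(record.timestamp)) * 1_000_000n;
  return record.metrics.map(
    (m) => `kpi,category=${record.category},unit=${m.unit} ${m.name}=${m.value} ${ns}`
  );
}

const lines = toLineProtocol({
  category: "risk",
  timestamp: "2024-06-10T09:00:00Z",
  metrics: [{ name: "security_scan_pass_rate", value: 98.5, unit: "percent" }],
});
console.log(lines[0]);
```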

Option 2: GitHub Actions Summary

For simpler setups, render KPI summaries directly in GitHub Actions job summaries:

- name: KPI Summary
  run: |
    node scripts/render-kpi-summary.js --input metrics/ >> $GITHUB_STEP_SUMMARY

Option 3: Static Site

Generate a static HTML report from KPI records and publish it as a GitHub Pages site or artifact:

node scripts/render-kpi-report.js --input metrics/ --output reports/kpi-report.html

Option 4: CSV Export

Export KPI records as CSV for import into spreadsheets or BI tools:

node scripts/export-csv.js --input metrics/ --output reports/kpis.csv
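The CSV export amounts to flattening each record into one row per metric. A sketch of what such an export script might do; the column set is an assumption, not the actual export-csv.js format:

```typescript
// Flatten KPI records into CSV, one row per metric.
// Column set is an illustrative assumption.
type CsvMetric = { name: string; value: number; unit: string };
type CsvRecord = { category: string; period: { start: string; end: string }; metrics: CsvMetric[] };

function toCsv(records: CsvRecord[]): string {
  const header = "category,period_start,period_end,metric,value,unit";
  const rows = records.flatMap((r) =>
    r.metrics.map((m) =>
      [r.category, r.period.start, r.period.end, m.name, m.value, m.unit].join(",")
    )
  );
  return [header, ...rows].join("\n");
}

const csv = toCsv([
  {
    category: "financial",
    period: { start: "2024-06-03", end: "2024-06-09" },
    metrics: [{ name: "ai_tool_cost", value: 19, unit: "usd_per_developer_month" }],
  },
]);
console.log(csv);
```

Note that this naive join does no quoting; if metric names could ever contain commas, a proper CSV writer should be used instead.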

Scheduling Collection

Set up a weekly cron job in GitHub Actions to run the collection automatically:

name: Weekly KPI Collection
on:
  schedule:
    - cron: '0 9 * * 1' # Every Monday at 9am UTC

jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/collect-kpis.sh "$(date -d '7 days ago' +%Y-%m-%d)" "$(date +%Y-%m-%d)"
      - uses: actions/upload-artifact@v4
        with:
          name: kpi-records
          path: metrics/

Next Steps