Metrics, Thresholds, Audits and Monitoring
Establish measures (e.g. the Adverse Impact Ratio), set thresholds, baseline, then monitor over time. Audits assess performance, reliability and safety; in production, keep an inventory with risk scores and test a challenger model against the champion. Watch the four AI-specific threats - inversion, extraction, poisoning, evasion.
Numbers first, then the assurance machinery wrapped around them.
- Establish measures → e.g., the Adverse Impact Ratio (AIR) to assess outputs for bias
- Set technical or legal thresholds → aligned with industry standards and regulation
- Create baselines and benchmarks → performance lands over, at or under threshold
- Monitor over time → automated tools track deviations, alerts on significant changes
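The four steps above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the names `GroupOutcomes`, `threshold_status` and `deviation_alerts`, the sample counts, and the 0.05 drift tolerance are all hypothetical. The 0.8 default mirrors the four-fifths rule of thumb from US employment selection guidance; treat it as an assumption, since the applicable threshold depends on jurisdiction and use case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupOutcomes:
    """Favourable outcomes for one group (hypothetical container)."""
    selected: int
    total: int

    @property
    def rate(self) -> float:
        if self.total <= 0:
            raise ValueError("total must be positive")
        return self.selected / self.total


def adverse_impact_ratio(protected: GroupOutcomes, reference: GroupOutcomes) -> float:
    """Step 1, the measure: protected-group rate divided by reference-group rate."""
    if reference.rate == 0:
        raise ValueError("reference group has a zero selection rate")
    return protected.rate / reference.rate


def threshold_status(air: float, threshold: float = 0.8) -> str:
    """Steps 2-3: say whether the measure lands over, at or under the threshold.

    The 0.8 default is an assumption (four-fifths rule of thumb).
    """
    if abs(air - threshold) < 1e-9:
        return "at"
    return "over" if air > threshold else "under"


def deviation_alerts(baseline: float, observed: list[float],
                     tolerance: float = 0.05) -> list[int]:
    """Step 4: indices of monitoring periods that drift beyond tolerance.

    The tolerance value is illustrative only.
    """
    return [i for i, air in enumerate(observed) if abs(air - baseline) > tolerance]


# Baseline from made-up counts: 45 of 100 selected vs 50 of 100 selected.
baseline_air = adverse_impact_ratio(GroupOutcomes(45, 100), GroupOutcomes(50, 100))
status = threshold_status(baseline_air)

# Made-up AIR readings from four later monitoring periods.
alerts = deviation_alerts(baseline_air, [0.91, 0.88, 0.79, 0.93])
```

Here the third reading is the only one that moves more than the tolerance away from the baseline, so it is the only period flagged for review.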
"Audit" can mean assessing computational performance OR a comprehensive assessment of the whole governance framework → policy plus technical controls. Aims → reduce risk, build trust, improve performance, ensure compliance. Governments are beginning to require accountability mechanisms aligned to use case and risk level.
| Lens | Question |
|---|---|
| Performance | Does it work? → does the system achieve its intended goals |
| Reliability | Does it work in real-world conditions, over time? → consistency and robustness |
| Safety | Does it work without causing undue harm? → and how the operational context affects safety |
Internal policies should call for audits including → algorithmic impact assessments · bias and fairness testing · explainability and interpretability evaluations · data governance and quality review · regulatory compliance verification · accountability and human oversight confirmation. Auditors may be internal or external → with no widely adopted AI precedents yet, adapt frameworks from security or financial systems.
Manage & monitor in production.
- Inventory all AI systems and attach a risk score to each → drives resource allocation, review frequency, drift checks and audit allocation
- Retrain with new data and human feedback
- Keep a clear deactivation or localisation procedure
- Develop a "challenger model" to test against the existing "champion model" → assess for drift and unexpected results
- Security-wise, follow NIST RMF basics but add AI-specific threats → model inversion, extraction, poisoning and evasion
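Two of the production practices above lend themselves to a short sketch: a risk-scored inventory that drives review frequency, and a champion vs challenger comparison on the same labelled sample. Everything here is an assumption made for illustration: the 1 to 5 risk scale, the review intervals, the system names, the `min_gain` promotion margin, and the choice of accuracy as the comparison metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AISystem:
    name: str
    risk_score: int  # hypothetical scale: 1 (low) to 5 (high)


# Hypothetical mapping: the higher the risk, the more frequent the review.
REVIEW_INTERVAL_DAYS = {1: 365, 2: 180, 3: 90, 4: 30, 5: 7}


def review_schedule(inventory: list[AISystem]) -> list[tuple[str, int]]:
    """Return (name, review interval in days), highest-risk systems first."""
    ordered = sorted(inventory, key=lambda s: s.risk_score, reverse=True)
    return [(s.name, REVIEW_INTERVAL_DAYS[s.risk_score]) for s in ordered]


def accuracy(predictions: list[int], labels: list[int]) -> float:
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def compare_models(champion: list[int], challenger: list[int],
                   labels: list[int], min_gain: float = 0.02) -> dict:
    """Score both models on the same sample and report how often they disagree.

    A high disagreement rate is a prompt to look for drift or unexpected
    results; min_gain is an illustrative promotion margin.
    """
    champ_acc = accuracy(champion, labels)
    chall_acc = accuracy(challenger, labels)
    disagreements = sum(a != b for a, b in zip(champion, challenger))
    return {
        "champion_accuracy": champ_acc,
        "challenger_accuracy": chall_acc,
        "disagreement_rate": disagreements / len(labels),
        "challenger_wins": chall_acc - champ_acc >= min_gain,
    }


# Made-up inventory and predictions.
schedule = review_schedule([
    AISystem("chat-assistant", 2),
    AISystem("credit-scoring", 5),
    AISystem("cv-screening", 4),
])

labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
champion_preds = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
challenger_preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
report = compare_models(champion_preds, challenger_preds, labels)
```

In practice the comparison would run on fresh production data at the cadence the risk score sets, and a challenger would normally need to clear fairness and safety checks as well as the accuracy margin before replacing the champion.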
Two recall sets → AIR = Adverse Impact Ratio, the bias metric. And the four AI-specific security threats existing protocols miss → inversion, extraction, poisoning, evasion.