Module 6: Governing AI Development · BoK III.A

Metrics, Thresholds, Audits and Monitoring

Establish measures (e.g. the Adverse Impact Ratio), set thresholds, baseline, then monitor over time. Audits assess performance, reliability and safety; in production, run an inventory with risk scores and a champion vs challenger model. Watch the four AI-specific threats - inversion, extraction, poisoning, evasion.

Numbers first, then the assurance machinery wrapped around them.

Establish measures → e.g., the Adverse Impact Ratio (AIR) to assess outputs for bias
Set technical or legal thresholds → aligned with industry standards and regulation
Create baselines and benchmarks → record whether measured performance lands over, at or under the threshold
Monitor over time → automated tools track deviations, alerts on significant changes
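The four steps above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: it assumes AIR is computed as the favourable-outcome rate of the group of interest divided by that of a reference group, uses the common four-fifths (0.8) rule of thumb as the threshold, and invents a 0.05 drift tolerance for alerting. All function names and constants are hypothetical.

```python
import math

AIR_THRESHOLD = 0.8     # assumption: "four-fifths" rule of thumb, not a universal legal limit
DRIFT_TOLERANCE = 0.05  # assumption: hypothetical alerting tolerance against the baseline


def selection_rate(outcomes):
    """Fraction of favourable decisions (1 = favourable, 0 = unfavourable)."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


def adverse_impact_ratio(group, reference):
    """Selection rate of the group of interest divided by the reference group's rate."""
    reference_rate = selection_rate(reference)
    if reference_rate == 0:
        raise ValueError("reference group has no favourable outcomes")
    return selection_rate(group) / reference_rate


def position_vs_threshold(air, threshold=AIR_THRESHOLD):
    """Report whether a measurement lands over, at or under the threshold."""
    if math.isclose(air, threshold):
        return "at"
    return "over" if air > threshold else "under"


def needs_alert(baseline_air, current_air,
                threshold=AIR_THRESHOLD, tolerance=DRIFT_TOLERANCE):
    """Alert when AIR is under the threshold or has dropped materially from baseline."""
    under = position_vs_threshold(current_air, threshold) == "under"
    return under or (baseline_air - current_air) > tolerance


# Baseline period: 45 of 100 favourable for the group, 50 of 100 for the reference.
baseline = adverse_impact_ratio([1] * 45 + [0] * 55, [1] * 50 + [0] * 50)
# Later monitoring window: 30 of 100 for the group, reference unchanged.
current = adverse_impact_ratio([1] * 30 + [0] * 70, [1] * 50 + [0] * 50)

print(round(baseline, 2), position_vs_threshold(baseline))  # 0.9 over
print(round(current, 2), position_vs_threshold(current),
      needs_alert(baseline, current))                       # 0.6 under True
```

In practice the threshold and tolerance would come from the applicable regulation or internal policy rather than from constants in code.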

System audits

"Audit" can mean assessing computational performance OR a comprehensive assessment of the whole governance framework → policy plus technical controls. Aims → reduce risk, build trust, improve performance, ensure compliance → governments are beginning to require accountability mechanisms aligned to use case and risk level.

Three audit lenses
Lens	Question
Performance	Does it work? → does the system achieve its intended goals
Reliability	Does it work in real-world conditions, over time? → consistency and robustness
Safety	Does it work without causing undue harm? → and how the operational context affects safety

Internal policies should call for audits including → algorithmic impact assessments · bias and fairness testing · explainability and interpretability evaluations · data governance and quality review · regulatory compliance verification · accountability and human oversight confirmation. Auditors may be internal or external → with no widely adopted AI precedents yet, adapt frameworks from security or financial systems.

Manage & monitor in production. Inventory all AI systems and attach a risk score to each → drives resource allocation, review frequency, drift checks and audit allocation → retrain with new data and human feedback → keep a clear deactivation or localisation procedure. Develop a "challenger model" to test against the existing "champion model" → assess for drift and unexpected results. Security-wise, follow NIST RMF basics but add AI-specific threats → model inversion, extraction, poisoning and evasion.
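A champion vs challenger check can be sketched as scoring both models on the same recent labelled data and flagging a material gap. The sketch below is illustrative only: the single-feature threshold "models", the sample data and the `min_gain` cut-off are all hypothetical, and accuracy stands in for whatever metric the governance policy actually specifies.

```python
from typing import Callable, Sequence

Model = Callable[[float], int]  # hypothetical interface: one numeric feature in, 0/1 label out


def accuracy(model: Model, inputs: Sequence[float], labels: Sequence[int]) -> float:
    """Share of recent labelled cases the model gets right."""
    return sum(model(x) == y for x, y in zip(inputs, labels)) / len(labels)


def compare(champion: Model, challenger: Model,
            inputs: Sequence[float], labels: Sequence[int],
            min_gain: float = 0.05) -> dict:
    """Score both models on the same recent data and flag a material gap."""
    champ_acc = accuracy(champion, inputs, labels)
    chall_acc = accuracy(challenger, inputs, labels)
    disagreements = sum(champion(x) != challenger(x) for x in inputs)
    return {
        "champion_accuracy": champ_acc,
        "challenger_accuracy": chall_acc,
        "disagreement_rate": disagreements / len(inputs),
        # A challenger that clearly beats the champion on fresh data hints at drift.
        "flag_for_review": (chall_acc - champ_acc) >= min_gain,
    }


def champion(x: float) -> int:
    """Existing production model (hypothetical): decision boundary at 0.5."""
    return int(x >= 0.5)


def challenger(x: float) -> int:
    """Model retrained on newer data (hypothetical): decision boundary at 0.6."""
    return int(x >= 0.6)


recent_inputs = [0.1, 0.4, 0.55, 0.58, 0.7, 0.9]
recent_labels = [0, 0, 0, 0, 1, 1]

report = compare(champion, challenger, recent_inputs, recent_labels)
print(report)
```

Here the champion misclassifies the two cases between 0.5 and 0.6, so the challenger scores higher and the report is flagged. The flag is a prompt for human review, not an automatic promotion of the challenger.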

Exam flash

Two recall sets → AIR = Adverse Impact Ratio, the bias metric. And the four AI-specific security threats existing protocols miss → inversion, extraction, poisoning, evasion.

Key terms - quick answers

What is “Adverse Impact Ratio (AIR)”?

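A measure used to assess AI outputs for bias: the rate of favourable outcomes for one group divided by the rate for a reference group. A ratio below 0.8 is commonly treated as a warning sign (the "four-fifths" rule of thumb).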

What is “Champion vs challenger”?

Developing a challenger model to test against the existing champion model to assess drift and unexpected results.

What is “Model inversion / extraction / poisoning / evasion”?

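Four AI-specific security threats existing protocols miss → inversion (reconstructing training data from model outputs) · extraction (copying the model by querying it) · poisoning (corrupting the training data) · evasion (crafting inputs that fool the model at inference time).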
