Module 7: Governing AI Deployment · BoK III.C + IV.C

Monitoring, maintenance and drift

Watch for deviations in accuracy and for model drift, which occurs when the relationship between input data and output predictions changes over time. Model cards document features, data, versions and intended use. Two predictable post-deployment risks (a new purpose and new data) are mitigated by documentation and snapshots respectively. Keep a 4-step underperformance response and a human shutdown procedure.

Watch for deviations in accuracy, irregular decisions and drift in the data. New risks appear post-deployment when the AI meets new purposes or new data.

Model drift, defined → the relationship between input data and output predictions changes over time → the conditions the model was trained under no longer apply and performance declines. Example → a spam detector failing on new types of spam.
Model cards, in full → standardised records of key features, data used, number of versions, bias or explainability reports, intended use, performance metrics and benchmarked evaluation across cultures, demographics or race → document the original purpose and any new purposes.
The two predictable risks → a new purpose the AI was not modelled for (mitigate with documentation/model cards) and new data entering the algorithm (mitigate by keeping snapshots of the algorithm and its outputs so you can roll back).
Best practices → define a baseline, retrain with new data plus human input and feedback, prioritise risk levels and responses, run red teaming internally or externally (including pre-deployment), and use bug bashing and bug bounties for engagement and feedback.
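The two predictable risks and their mitigations can be sketched in a few lines of Python. This is an illustration only, not a prescribed implementation: the names ModelCard, Snapshot, SnapshotStore, is_new_purpose and has_drifted are hypothetical, the model card is cut down to three fields, and the 0.05 accuracy tolerance is an assumed value that a real team would set from its own baseline.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical, trimmed-down model card: only the fields this sketch needs."""
    version: str
    intended_uses: set[str]
    baseline_accuracy: float  # the agreed baseline, measured at deployment


@dataclass
class Snapshot:
    """A saved state: the card plus some outputs kept for later comparison."""
    card: ModelCard
    sample_outputs: list[str]


@dataclass
class SnapshotStore:
    """Keeps snapshots in order so an earlier one can be restored."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def save(self, snapshot: Snapshot) -> None:
        self.snapshots.append(snapshot)

    def rollback(self) -> Snapshot:
        # Drop the current snapshot and return the one saved before it.
        if len(self.snapshots) < 2:
            raise RuntimeError("no earlier snapshot to roll back to")
        self.snapshots.pop()
        return self.snapshots[-1]


def is_new_purpose(card: ModelCard, purpose: str) -> bool:
    """Risk 1: the model is used for something its card does not document."""
    return purpose not in card.intended_uses


def has_drifted(card: ModelCard, recent_accuracy: float,
                tolerance: float = 0.05) -> bool:
    """Risk 2 signal: recent accuracy fell more than `tolerance` below baseline."""
    return recent_accuracy < card.baseline_accuracy - tolerance


store = SnapshotStore()
store.save(Snapshot(ModelCard("v1", {"spam filtering"}, 0.95), ["spam", "ham"]))
store.save(Snapshot(ModelCard("v2", {"spam filtering"}, 0.96), ["spam", "spam"]))

current = store.snapshots[-1]
print(is_new_purpose(current.card, "fraud detection"))  # True: not on the card
if has_drifted(current.card, recent_accuracy=0.88):     # 0.88 is below 0.96 - 0.05
    current = store.rollback()
print(current.card.version)  # v1
```

The accuracy check is the simplest possible drift signal and needs labelled recent data. In practice a team might also compare input distributions, which this sketch does not attempt.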

Deactivation & the 4-step response

Keep a procedure to deactivate or localise the system → a human must be able to shut down the algorithm remotely or without direct access. When the AI underperforms → 1) treat it as an incident and use the response plan · 2) identify the issue, who must be told, document the mitigation · 3) notify groups using integrated third-party tools · 4) enable the human shutdown.
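The 4-step response can be pictured as a short runbook function. The sketch below is a hedged illustration under stated assumptions: Incident, DeployedModel and respond_to_underperformance are hypothetical names, the notification channel is a caller-supplied function rather than any real tool, and the shutdown happens only when a named human approves it.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Incident:
    description: str          # the identified issue
    notify: list[str]         # who must be told
    mitigation: str           # the documented mitigation
    log: list[str] = field(default_factory=list)


class DeployedModel:
    """Stand-in for a live system with a remotely operable off switch."""

    def __init__(self) -> None:
        self.active = True

    def shutdown(self) -> None:
        self.active = False


def respond_to_underperformance(
    model: DeployedModel,
    incident: Incident,
    send: Callable[[str, str], None],
    approved_by: Optional[str] = None,
) -> list[str]:
    # 1) Treat it as an incident and follow the response plan.
    incident.log.append(f"incident opened: {incident.description}")
    # 2) Record the issue's mitigation; the people to tell are in incident.notify.
    incident.log.append(f"mitigation: {incident.mitigation}")
    # 3) Notify each group through the supplied channel.
    for group in incident.notify:
        send(group, incident.description)
        incident.log.append(f"notified: {group}")
    # 4) The shutdown stays a human decision: it runs only if someone approved it.
    if approved_by is not None:
        model.shutdown()
        incident.log.append(f"shutdown approved by: {approved_by}")
    else:
        incident.log.append("shutdown available, awaiting human decision")
    return incident.log


sent: list[tuple[str, str]] = []
model = DeployedModel()
incident = Incident(
    description="spam filter accuracy fell below baseline",
    notify=["compliance team", "affected customers"],
    mitigation="roll back to previous snapshot",
)
log = respond_to_underperformance(
    model, incident,
    send=lambda group, message: sent.append((group, message)),
    approved_by="on-call reviewer",
)
print(model.active)  # False
print(len(log))      # 5
```

Keeping the log on the incident itself means the documentation required in step 2 is produced as a side effect of running the procedure.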

Key terms - quick answers

What is “Model drift”?

When the relationship between input data and output predictions changes over time, degrading performance.

What is a “Model card”?

Standardised record of a model's features, data, versions, bias/explainability reports, intended use and performance metrics.

What are “Snapshots”?

Saved states of the algorithm and its outputs enabling rollback and change comparison.

What is a “Bug bounty”?

Programme rewarding external discovery of flaws, used for monitoring engagement and feedback.
