Trust attributes and safeguards
The qualities trustworthy systems must show, and the artifacts and controls that prove or protect them. Reliability is consistency over time; robustness is resilience under stress. A model card documents one model, while a system card documents the assembled whole.
The final cluster pairs the qualities trustworthy systems must show with the artifacts and controls that prove or protect them.
- Fairness → outputs free from unjustified differential treatment or impact across individuals and groups.
- Safety → the system avoids physical, psychological or societal harm under intended and foreseeable use.
- Fail-safe plans → predefined mechanisms moving a failing system to a safe state (shutdown, fallback, human takeover).
- Watermarking → embedding detectable markers in AI-generated content to signal synthetic origin and provenance.
- Open-source software → publicly available source code to use, modify and distribute under licence; open weights raise their own governance questions.
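The fail-safe entry above can be sketched in code. This is a minimal illustration, not a prescribed design: the names `run_with_fail_safe`, `broken_model` and `rule_based_fallback` are hypothetical, and a real plan would also log the failure and alert an operator.

```python
from typing import Callable, Tuple


def run_with_fail_safe(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    request: str,
) -> Tuple[str, str]:
    """Return (answer, path), where path records which mechanism answered."""
    try:
        return primary(request), "primary"
    except Exception:
        pass  # primary failed: move on to the predefined fallback
    try:
        return fallback(request), "fallback"
    except Exception:
        # Last resort: stop automated handling and hand over to a person.
        return "request held for human review", "human_takeover"


# Hypothetical components used only to exercise the sketch.
def broken_model(request: str) -> str:
    raise RuntimeError("model unavailable")


def rule_based_fallback(request: str) -> str:
    return "default safe response"


print(run_with_fail_safe(broken_model, rule_based_fallback, "approve loan?"))
```

The point of the pattern is that the safe state is decided in advance: the caller never has to improvise when the primary component fails.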
Reliability = consistency over time in normal conditions. Robustness = resilience under stress: noisy, unexpected, adversarial or shifting inputs. Giving the same correct answer every day is reliability; staying correct under attack is robustness.
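One way to make the distinction concrete is to measure the two properties separately. The sketch below is an assumption-laden toy, not a standard test procedure: `toy_model`, the 0.5 threshold, the sample data and the noise level are all invented for illustration.

```python
import random


def toy_model(x: float) -> int:
    """Hypothetical classifier: predicts 1 when the input exceeds 0.5."""
    return 1 if x > 0.5 else 0


def accuracy(model, inputs, labels) -> float:
    correct = sum(model(x) == y for x, y in zip(inputs, labels))
    return correct / len(inputs)


def reliability(model, inputs, labels, runs: int = 5) -> float:
    """Worst accuracy across repeated runs on the same, unchanged inputs."""
    return min(accuracy(model, inputs, labels) for _ in range(runs))


def robustness(model, inputs, labels, noise: float = 0.2, seed: int = 0) -> float:
    """Accuracy after each input is perturbed by bounded random noise."""
    rng = random.Random(seed)
    noisy = [x + rng.uniform(-noise, noise) for x in inputs]
    return accuracy(model, noisy, labels)


inputs = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]

print("reliability:", reliability(toy_model, inputs, labels))
print("robustness: ", robustness(toy_model, inputs, labels))
```

Reliability repeats the same normal-condition test; robustness changes the conditions. A model can score perfectly on the first and still lose accuracy on the second when inputs near the decision boundary are nudged across it.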
A model card documents one model (purpose, data, versions, metrics, bias and explainability reports, evaluation across demographics). A system card documents the whole assembled multi-component system and its end-to-end behaviour.
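The scoping difference can be pictured as two record types, where a system card contains model cards rather than replacing them. The field names below are assumptions chosen to mirror the list above; they do not follow any official template.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelCard:
    """Documents a single model."""
    name: str
    purpose: str
    training_data: str
    version: str
    metrics: Dict[str, float] = field(default_factory=dict)
    demographic_evaluation: Dict[str, float] = field(default_factory=dict)


@dataclass
class SystemCard:
    """Documents the assembled system: models plus everything around them."""
    name: str
    models: List[ModelCard]
    other_components: List[str]   # e.g. filters, retrieval, human review steps
    end_to_end_behaviour: str

    def model_names(self) -> List[str]:
        return [m.name for m in self.models]


# Hypothetical example values.
classifier = ModelCard(
    name="risk-classifier",
    purpose="score applications",
    training_data="historical applications (illustrative)",
    version="1.2",
    metrics={"accuracy": 0.91},
    demographic_evaluation={"group_a": 0.92, "group_b": 0.89},
)
system = SystemCard(
    name="application-triage",
    models=[classifier],
    other_components=["input filter", "human review queue"],
    end_to_end_behaviour="low-confidence cases are routed to a reviewer",
)
print(system.model_names())
```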
Automated decision-making (ADM) refers to decisions made by technological means without meaningful human involvement. This is GDPR Article 22 territory and the opposite of human-in-the-loop design.
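The "meaningful human involvement" test can be illustrated with a small routing check. This is a deliberately simplified sketch and not a legal test: the `Decision` fields and both function names are hypothetical, and the assumption that involvement is meaningful only when a reviewer actually looked and had authority to change the outcome is a study aid rather than the wording of the regulation.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    outcome: str
    significant_effect: bool     # legal or similarly significant effect on the person
    human_reviewed: bool         # a person actually looked at this case
    reviewer_can_override: bool  # that person had authority to change the outcome


def is_solely_automated(d: Decision) -> bool:
    """Assumption: a rubber-stamp review does not count as human involvement."""
    return not (d.human_reviewed and d.reviewer_can_override)


def needs_human_in_the_loop(d: Decision) -> bool:
    """Flag decisions that are both significant and solely automated."""
    return d.significant_effect and is_solely_automated(d)


rubber_stamp = Decision("loan refused", True, True, False)
real_review = Decision("loan refused", True, True, True)

print(needs_human_in_the_loop(rubber_stamp))
print(needs_human_in_the_loop(real_review))
```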
The exam plants its distractors on eight contrasts (seven pairs and one three-way split). Learn the one-line discriminator for each:
- Explainability vs interpretability → why vs how.
- Misinformation vs disinformation → intent.
- Overfitting vs underfitting → memorised vs too simple.
- Model card vs system card → one model vs assembled system.
- Data drift vs data poisoning → natural change vs attacker.
- Generative vs discriminative → creates vs draws the boundary.
- Reliability vs robustness → consistency vs stress.
- Training vs validation vs input data → teaches vs tunes vs receives.