Module 6: Governing AI Development · BoK III.A

Features and Feature Engineering

A feature is a specific measurable aspect or characteristic. Feature engineering decides which ones matter, with three purposes - improve performance (the most important), cut cost, and boost explainability. Feature flags toggle functionality without redeploying.

A Feature is a specific measurable aspect or characteristic → height, colour, substance. Feature engineering decides which ones matter.

Pick what matters

For a credit score, a person's age matters more than their height → subject matter experts help pick essential features. Use the same features for training and testing for consistency → and eliminate unnecessary features, which complicate testing and waste resources.

Three purposes of feature engineering:

Improve performance → the most important purpose → structure datasets so the model optimally learns feature-to-target relationships → curate the subset with the greatest predictive power
Cut cost, boost effectiveness → fewer features → less data to process, store and send in API calls → better latency → write versioned, tested feature definitions mirrored for training and production
Boost explainability → the degree someone can consistently predict a model's result → essential for fairness, privacy, reliability, robustness, causality and trust

Feature flags

Feature flags let you disable functionality without redeploying code → introduce features while controlling availability to specific users or groups in production → handy for rollbacks and deployments across jurisdictions with varying requirements.

Key terms - quick answers

What is “Feature”?

A specific measurable aspect or characteristic, e.g. height, colour or substance.

What is “Feature engineering”?

Deciding which features matter - to improve performance, cut cost and boost explainability.

What is “Feature flags”?

Mechanism to disable functionality without redeploying code and control availability to specific users/groups.

What is “Explainability”?

The degree someone can consistently predict a model's result; essential for fairness, privacy, reliability, robustness, causality and trust.

← Wrangling the Data Building, Training and the Three Lines of Defence →