Generative AI
The GenAI stack from foundation model to prompt: architecture names matter here. RAG is defined by retrieving from a knowledge base beyond the training data sources and grounding the answer in what it retrieves.
Generative AI is the cluster of models that create new content - text, images, audio, video, code - from patterns learned in training data. The stack runs from architecture down to the prompt, and the architecture names are exam targets.
- Foundation model → trained on broad data at scale and adaptable to many downstream tasks via fine-tuning or prompting.
- Large language model (LLM) → a foundation model trained on massive text corpora, typically transformer-based; small language models (SLMs) are the compact, cheaper cousins, often small enough to run on-device.
- Multimodal models → process and/or generate more than one data type (text plus images, audio or video).
- Transformer model → uses self-attention to weigh relationships across a whole sequence at once - the backbone of modern LLMs.
- Diffusion model → trained by gradually adding noise to data and learning to reverse it; generates images or audio by denoising step by step from random noise.
- Prompt and prompt engineering → the input instruction, and the craft of refining it to steer output.
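To make the self-attention idea in the transformer bullet concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not how a production transformer is built: queries, keys and values are all set equal to the input vectors (real models apply learned projection matrices first), there is a single head, and the token vectors are made-up numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a whole sequence.

    Assumption for illustration: query = key = value = the input vector,
    so no learned projections are involved.
    """
    d = len(tokens[0])
    output, weights = [], []
    for q in tokens:
        # Score this token against every token in the sequence, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # The new representation is a weighted mix of all token vectors.
        output.append([sum(w_j * v[i] for w_j, v in zip(w, tokens)) for i in range(d)])
    return output, weights


# Three made-up 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(tokens)
```

Each row of `weights` says how much one token attends to every other token, which is the "weigh relationships across a whole sequence at once" part of the definition.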
Retrieval-augmented generation (RAG) optimises LLM output by retrieving relevant content at query time from a knowledge base beyond the training data sources and grounding the answer in it. That phrase - "beyond the training data" - is the discriminator the exam looks for.
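The retrieve-then-ground flow can be sketched in a few lines of Python. Everything here is hypothetical and chosen for illustration: the `KNOWLEDGE_BASE` contents, the `retrieve` and `build_grounded_prompt` helpers, and the word-overlap scoring, which stands in for the embedding-based search that real systems typically use. The sketch stops at building the prompt and does not call any model.

```python
import re

# Hypothetical external knowledge base: facts the model was never trained on.
KNOWLEDGE_BASE = [
    "The returns window for online orders is 30 days.",
    "Support is available on weekdays from 09:00 to 17:00.",
    "Gift cards cannot be exchanged for cash.",
]


def words(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question.

    Word overlap is a stand-in for real similarity search.
    """
    q = words(question)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:top_k]


def build_grounded_prompt(question, documents):
    """Place the retrieved text in the prompt so the answer is grounded in it."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt("How many days is the returns window?", KNOWLEDGE_BASE)
```

Note what does not happen: no weights change and no training data is touched. The model is unchanged; only the prompt gains retrieved context.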
Generative vs discriminative is a classic trap → generative creates new content, discriminative draws the line between classes. And RAG ≠ adding parameters or resampling data - it references an external knowledge base.