AI & Prompt Engineering Glossary
110 essential terms defined with examples. From foundational concepts to advanced techniques, learn the language of AI engineering.
Core Concepts
AI Alignment
The research and engineering challenge of ensuring AI systems behave in ways that are helpful, harmless, and consistent with human values and intentions.
AI Safety
The field focused on preventing AI systems from causing unintended harm, both in current applications and as systems become more capable.
Activation Function
A mathematical function applied to each neuron's output in a neural network that determines whether and how strongly it fires, introducing the non-linearity that lets networks learn complex patterns.
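As a minimal sketch, two of the most common activation functions can be written directly from their definitions: ReLU zeroes out negative inputs, and sigmoid squashes any real number into (0, 1).

```python
import math

def relu(x):
    # ReLU: passes positive values through unchanged, zeroes out negatives
    return max(0.0, x)

def sigmoid(x):
    # Sigmoid: squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))
```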
Adversarial Examples
Inputs deliberately crafted to fool AI models into making incorrect predictions or producing unintended outputs. These inputs exploit blind spots in a model's learned decision boundaries.
Attention Mechanism
The core innovation in transformers that allows models to weigh the importance of different parts of the input when producing each part of the output.
BERT
A pre-trained language model from Google that reads text in both directions simultaneously, giving it a deeper understanding of context than models that read strictly left to right.
Benchmarks
Standardized tests used to compare AI model performance across specific capabilities. Benchmarks provide consistent, repeatable measurements for comparing models and tracking progress.
Bias-Variance Tradeoff
A fundamental tension in machine learning between two types of error. Bias is error from oversimplifying the problem; variance is error from being overly sensitive to the training data. Reducing one tends to increase the other.
Classifier
A model or system that assigns input data to predefined categories. In AI applications, classifiers sort text, images, and other inputs into categories such as spam/not-spam or positive/negative sentiment.
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single request, including both the prompt and the generated response.
Cosine Similarity
A mathematical measure of how similar two vectors are, based on the angle between them rather than their magnitude. In AI applications, it is the standard way to compare embeddings for semantic similarity.
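The definition translates almost directly into code: divide the dot product of two vectors by the product of their magnitudes. A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    # Angle-based similarity: 1.0 = same direction, 0.0 = orthogonal
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Note that scaling a vector does not change the result: [1, 2] and [2, 4] point in the same direction, so their similarity is 1.0.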
Decision Boundary
The line, surface, or region in feature space where a model switches from predicting one class to another. It visualizes what a classifier has actually learned.
Deep Learning
A subset of machine learning that uses neural networks with many layers (hence 'deep') to learn complex patterns directly from raw data.
Diffusion Models
A class of generative AI models that create data (typically images) by learning to reverse a gradual noising process. Generation starts from pure noise and iteratively denoises it into a coherent sample.
Dimensionality Reduction
Techniques that reduce the number of features (dimensions) in a dataset while preserving the most important information, making data easier to visualize and models faster to train.
Embeddings
Dense numerical representations of text, images, or other data in a high-dimensional vector space. Similar items are placed close together, so distance in the space reflects semantic similarity.
Emergent Abilities
Capabilities that appear in large language models only after they reach a certain scale, without being explicitly trained for, such as multi-step arithmetic or instruction following.
Feature Extraction
The process of transforming raw data into meaningful numerical representations (features) that a model can use for learning and prediction.
GPT
A family of large language models developed by OpenAI that generate text by predicting the next token in a sequence. Successive versions have scaled dramatically in size and capability.
Guardrails
Safety mechanisms and constraints built around AI systems to prevent harmful, off-topic, or undesirable outputs. They range from input filters to output validators and topic restrictions.
Hallucination
When a language model generates information that sounds plausible but is factually incorrect, fabricated, or not grounded in its training data or the provided context.
HumanEval
A coding benchmark created by OpenAI that tests AI models on 164 Python programming problems. Each problem provides a function signature and docstring; the model must generate an implementation that passes the unit tests.
Large Language Model
A neural network trained on massive text datasets that can understand and generate human language. LLMs like GPT-4, Claude, and Gemini power most modern AI applications.
MMLU
A benchmark that tests AI models across 57 academic subjects including math, history, law, medicine, and computer science, using multiple-choice questions to measure breadth of knowledge and reasoning.
Model Evaluation
The systematic process of measuring how well an AI model performs on specific tasks. Model evaluation uses test datasets, metrics, and benchmarks to quantify accuracy, robustness, and failure modes.
Multimodal AI
AI systems that can process and generate multiple types of data — text, images, audio, video, or code — within a single model.
Natural Language Processing
The branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP encompasses tasks like translation, summarization, sentiment analysis, and question answering.
Neural Network
A computing system inspired by biological brains, consisting of layers of interconnected nodes (neurons) that process information by passing weighted signals from layer to layer.
Precision and Recall
Two complementary metrics for evaluating classification models. Precision measures how many of the model's positive predictions were actually correct; recall measures how many of the actual positives the model found.
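Both metrics follow mechanically from counting true positives, false positives, and false negatives. A small sketch for binary 0/1 labels:

```python
def precision_recall(predicted, actual):
    # predicted / actual: sequences of 0/1 labels, 1 = positive class
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of flagged, how many right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual, how many found
    return precision, recall
```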
Prompt Engineering
The practice of designing and optimizing inputs to large language models (LLMs) to produce accurate, relevant, and useful outputs.
Prompt Injection
A security vulnerability where malicious user input overrides or manipulates a language model's system prompt or instructions, causing it to ignore its intended behavior.
Reasoning Models
A category of AI models specifically designed to perform multi-step logical reasoning before producing a final answer, often by generating extended chains of thought.
Self-Attention
The mechanism inside transformer models that allows each token in a sequence to look at and weigh the relevance of every other token in the same sequence.
Softmax
A mathematical function that converts a vector of raw numbers (logits) into a probability distribution where all values fall between 0 and 1 and sum to 1. It is how language models turn raw scores into next-token probabilities.
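A minimal implementation of the definition, with the standard trick of subtracting the maximum logit before exponentiating so large values don't overflow:

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The largest logit always maps to the largest probability, and the outputs always sum to 1.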
System Prompt
A special instruction given to a language model that sets its behavior, personality, constraints, and role for an entire conversation, before any user input is processed.
Tokens
The basic units that language models use to process text. A token is typically a word, part of a word, or a punctuation mark; in English, roughly four characters on average.
Transformer
The neural network architecture behind virtually all modern large language models. Introduced in the 2017 paper 'Attention Is All You Need', it replaced recurrence with self-attention, enabling massively parallel training.
Word Embeddings
Dense vector representations that capture the meaning of words as points in a multi-dimensional space. Words with similar meanings end up close together in that space.
Prompting Techniques
Chain-of-Thought Prompting
A prompting technique where the model is instructed to break down complex problems into intermediate reasoning steps before giving a final answer, improving accuracy on math and logic tasks.
Few-Shot Prompting
A technique where you provide a small number of input-output examples in your prompt to demonstrate the desired task format and behavior.
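A sketch of how a few-shot prompt is typically assembled. The sentiment-labeling task and the example reviews here are hypothetical; the pattern is what matters: instruction, labeled examples, then the new input left for the model to complete.

```python
# Hypothetical labeled examples for an illustrative sentiment task
EXAMPLES = [
    ("The food was amazing", "positive"),
    ("Terrible service, never again", "negative"),
]

def build_few_shot_prompt(examples, new_input):
    # Instruction first, then demonstrations, then the unlabeled input
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n\n".join(lines)
```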
In-Context Learning
A model's ability to learn new tasks or patterns from examples provided directly in the prompt, without any weight updates or retraining.
JSON Mode
A model configuration that constrains a language model to output only valid JSON. When enabled, the model's output is guaranteed to parse, making it reliable for programmatic use.
Output Parsing
The process of extracting structured data from an LLM's text response and converting it into a usable format like JSON, CSV, or typed objects your application can consume.
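A common defensive pattern, sketched below: models often wrap JSON in prose or markdown fences, so the parser extracts the first `{...}` span and returns `None` rather than raising when nothing parses.

```python
import json

def parse_json_response(response_text):
    # Locate the outermost {...} span, then attempt to parse it;
    # return None instead of raising on malformed output
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end == -1 or end < start:
        return None
    try:
        return json.loads(response_text[start:end + 1])
    except json.JSONDecodeError:
        return None
```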
Prompt Chaining
A technique where the output of one prompt becomes the input for the next, creating a sequential pipeline of AI steps that each handle one part of a complex task.
Prompt Optimization
The systematic process of improving prompt performance through testing, measurement, and iteration. Prompt optimization treats prompts as artifacts to be evaluated against metrics rather than tweaked by intuition.
Prompt Template
A reusable prompt structure with placeholder variables that get filled in at runtime. Prompt templates separate the fixed instructions from the variable data, making prompts reusable and testable.
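The idea can be sketched with nothing more than the standard library's `string.Template`: the instructions are fixed, and only the placeholder variables change per request.

```python
from string import Template

# Fixed instructions; $doc_type, $n_sentences, and $text are filled at runtime
SUMMARIZE = Template(
    "Summarize the following $doc_type in $n_sentences sentences:\n\n$text"
)

prompt = SUMMARIZE.substitute(
    doc_type="article",
    n_sentences=2,
    text="Transformers replaced recurrence with attention.",
)
```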
Zero-Shot Prompting
Giving a language model a task instruction without any examples, relying entirely on the model's pre-trained knowledge to perform the task.
Architecture Patterns
AI Agent
An AI system that can autonomously plan and execute multi-step tasks by using tools, making decisions, and iterating until a goal is reached.
Agentic AI
An approach to AI system design where models autonomously plan, execute, and iterate on complex tasks with minimal human supervision.
Autoencoder
A neural network architecture that learns to compress data into a smaller representation and then reconstruct the original from it, making it useful for denoising, compression, and anomaly detection.
Convolutional Neural Network
A neural network architecture designed to process grid-structured data like images. CNNs use small learnable filters that slide across the input to detect local patterns like edges and textures.
Ensemble Methods
Techniques that combine multiple models to produce better predictions than any individual model. By aggregating diverse models' outputs through voting or averaging, ensembles reduce overall error.
Function Calling
A capability where language models can generate structured JSON that maps to predefined function signatures, allowing the model to request that your application invoke external tools and APIs.
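A sketch of the two halves of the pattern. The `get_weather` tool, its schema, and the dispatcher below are illustrative, not any provider's actual API; the point is that the model emits JSON naming a function and its arguments, and the application routes that JSON to real code.

```python
import json

# Hypothetical tool schema, in the JSON-Schema style used by several providers
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch(model_output):
    # Parse the model's structured request and route it to application code
    call = json.loads(model_output)
    if call["name"] == "get_weather":
        return f"Weather lookup for {call['arguments']['city']}"
    raise ValueError(f"Unknown tool: {call['name']}")
```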
GAN
An architecture where two neural networks compete against each other: a generator that creates fake data and a discriminator that tries to tell fake data from real. The competition pushes both to improve.
Graph Neural Network
A neural network designed to work with graph-structured data, where information is represented as nodes connected by edges, such as social networks or molecules.
Grounding
The practice of connecting language model outputs to verified, factual sources of information. Grounding techniques like retrieval-augmented generation reduce hallucinations by anchoring responses in real evidence.
Knowledge Graph
A structured representation of information as a network of entities (nodes) and their relationships (edges). Knowledge graphs let systems traverse explicit facts rather than rely on statistical patterns alone.
LSTM
A specialized type of recurrent neural network (RNN) designed to learn long-range dependencies in sequential data. Gating mechanisms control what the network remembers and forgets across a sequence.
Mixture of Experts
A model architecture where multiple specialized sub-networks (experts) exist within a single model, but only a subset is activated for each input, giving large capacity at lower inference cost.
Model Context Protocol
An open standard developed by Anthropic that defines how AI models connect to external data sources and tools. MCP standardizes these integrations so any compliant client can work with any compliant server.
RAG
An architecture pattern that combines information retrieval with text generation. RAG systems first search a knowledge base for relevant documents, then pass them to the model as context for a grounded answer.
Recurrent Neural Network
A neural network architecture designed for sequential data where the output from the previous step feeds back as input to the next, giving the network a form of memory.
Retrieval
The process of finding and fetching relevant information from a data source to provide context for an AI model's response, typically via vector similarity or keyword search.
Semantic Search
A search approach that understands the meaning and intent behind a query rather than just matching keywords. Semantic search uses embeddings to find results that are conceptually related even when the wording differs.
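The retrieval core of semantic search (and of RAG) can be sketched in a few lines: rank documents by the cosine similarity between the query embedding and each document embedding. The 2-D "embeddings" below are toy values for illustration; real systems use embedding models with hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    # Standard cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 2-D embeddings standing in for real high-dimensional ones
CORPUS = {
    "refund policy": [0.9, 0.1],
    "shipping times": [0.1, 0.9],
}

def semantic_search(query_vec, corpus, top_k=1):
    # Rank documents by similarity to the query embedding
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, corpus[doc]),
                    reverse=True)
    return ranked[:top_k]
```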
Structured Output
Model responses that conform to a predefined schema or format, such as JSON matching a specific structure, XML, or typed function-call arguments, making responses machine-readable.
Tool Use
The capability of AI models to interact with external tools, APIs, and systems by generating structured requests during generation and incorporating the results into their responses.
Variational Autoencoder
A generative model that learns to encode data into a smooth, continuous probability distribution (the latent space) and decode samples from it, enabling generation of new data similar to the training set.
Model Parameters
Cross-Entropy
A mathematical measure of the difference between a model's predicted probability distribution and the actual distribution of correct answers. It is the standard loss function for training language models.
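For classification with a single correct answer, cross-entropy reduces to the negative log of the probability the model assigned to that answer, as this sketch shows:

```python
import math

def cross_entropy(predicted_probs, true_index):
    # Negative log-probability assigned to the correct class;
    # confident correct predictions yield low loss
    return -math.log(predicted_probs[true_index])
```

A model that puts 0.98 on the right answer is penalized far less than one that puts 0.3 on it.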
Hyperparameters
Settings that control how a model trains or generates output, set by the user rather than learned by the model itself. Examples include learning rate, batch size, and temperature.
Parameter Count
The total number of trainable weights and biases in a neural network, typically used as a rough indicator of model capacity (e.g. 7B or 70B parameters).
Perplexity
A statistical measure of how well a language model predicts a sequence of text. Lower perplexity means the model is less surprised by the text and assigns it higher probability.
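Perplexity is the exponential of the average negative log-probability per token. Given the probability the model assigned to each token in a sequence, it can be computed directly:

```python
import math

def perplexity(token_probs):
    # Exponential of the mean negative log-probability per token
    n = len(token_probs)
    avg_nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_nll)
```

A model that assigns probability 1.0 to every token has perplexity 1 (it is never surprised); a model guessing uniformly between two options has perplexity 2.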
Temperature
A model parameter (typically 0 to 2) that controls the randomness of outputs. Lower temperature (0-0.3) produces more deterministic, focused outputs; higher values produce more varied, creative ones.
Top-P Sampling
A text generation parameter that limits the model's token selection to the smallest set of tokens whose cumulative probability exceeds the threshold p, adapting the candidate pool to the model's confidence.
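Both parameters act on the model's next-token distribution, and a minimal sketch makes the mechanics concrete: temperature divides the logits before softmax (lower values sharpen the distribution), and top-p keeps only the highest-probability tokens whose cumulative mass reaches p.

```python
import math

def apply_temperature(logits, temperature):
    # Divide logits by temperature, then softmax; lower T sharpens the peak
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p=0.9):
    # Keep the smallest set of token indices whose cumulative probability >= p
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return kept
```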
Model Training
Backpropagation
The algorithm that neural networks use to learn from mistakes. It works backward through the network, calculating how much each weight contributed to the error so the weights can be adjusted.
Catastrophic Forgetting
A phenomenon where a neural network, when trained on new data, loses knowledge it previously learned. The model's weights are overwritten, erasing earlier capabilities.
Constitutional AI
An alignment technique developed by Anthropic where AI systems are trained to follow a set of principles (a 'constitution') and to critique and revise their own outputs against those principles.
Contrastive Learning
A training approach where models learn by comparing similar and dissimilar examples. Instead of labeling data directly, the model is trained to pull similar pairs together and push dissimilar pairs apart in embedding space.
Cross-Validation
A technique for estimating how well a model will perform on unseen data by repeatedly splitting the available data into training and validation folds, training on some and testing on the rest.
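A minimal k-fold splitter shows the mechanics: the data is divided into k folds, and each fold is held out for validation exactly once while the rest are used for training.

```python
def k_fold_splits(data, k):
    # Yield (train, validation) pairs; each item is held out exactly once
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation
```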
DPO
A simpler alternative to RLHF that skips training a separate reward model. DPO directly optimizes a language model on pairs of preferred and rejected responses, making preference tuning simpler and more stable.
Data Augmentation
Techniques for artificially expanding a training dataset by creating modified versions of existing data. In NLP, this includes paraphrasing, back-translation, and synonym replacement.
Dropout
A regularization technique where randomly selected neurons are temporarily deactivated (set to zero) during each training step, preventing the network from over-relying on any single neuron.
Fine-Tuning
The process of taking a pre-trained language model and training it further on a specific dataset to specialize its behavior, style, or domain knowledge.
Gradient Descent
The core optimization algorithm used to train neural networks. It works by calculating how much each model weight contributes to the loss, then nudging every weight in the direction that reduces it.
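The idea is easiest to see in one dimension. The sketch below minimizes f(x) = (x - 3)^2, whose gradient is 2(x - 3), by repeatedly stepping against the gradient; the same loop, applied to millions of weights at once, is what trains a neural network.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    # Repeatedly step opposite the gradient to walk downhill on the loss
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Minimize f(x) = (x - 3)^2; its gradient is 2 * (x - 3)
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```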
Instruction Tuning
A fine-tuning technique where a pre-trained model is trained on a dataset of instruction-response pairs to improve its ability to follow natural-language instructions.
Knowledge Distillation
A training technique where a smaller 'student' model learns to replicate the behavior of a larger 'teacher' model. The student matches the teacher's outputs, retaining much of the capability at a fraction of the size.
LoRA
A parameter-efficient fine-tuning technique that freezes the original model weights and injects small trainable low-rank matrices into the model's layers, cutting the cost of fine-tuning dramatically.
Loss Function
A mathematical function that measures how far a model's predictions are from the correct answers during training. The entire training process exists to minimize this value.
Model Collapse
A degradation phenomenon where AI models trained on AI-generated data progressively lose quality, diversity, and fidelity to the real-world data distribution over successive generations.
Normalization
Techniques that rescale data or neural network activations to a standard range, improving training stability and speed. Batch normalization and layer normalization are common examples.
Overfitting
When a model learns the training data too well, memorizing noise and specific patterns that don't generalize to new, unseen data.
RLHF
A training technique where human evaluators rank model outputs by quality, and these rankings are used to train a reward model that then guides the language model via reinforcement learning.
Reinforcement Learning
A training approach where an agent learns by interacting with an environment, receiving rewards for good actions and penalties for bad ones, gradually learning a policy that maximizes cumulative reward.
Stochastic Gradient Descent
An optimization algorithm that updates model weights using the gradient computed from a random subset (mini-batch) of the training data, making training feasible on very large datasets.
Synthetic Data
Artificially generated data created by AI models or algorithms rather than collected from real-world sources. Synthetic data is used to augment scarce datasets, protect privacy, and train smaller models.
Transfer Learning
The practice of taking a model trained on one task or dataset and adapting it for a different but related task. Instead of starting from scratch, you reuse learned representations that already capture useful structure.
Infrastructure
AI Coding Assistant
Software tools that use AI models to help developers write, edit, debug, and understand code. AI coding assistants integrate directly into editors, IDEs, and terminals.
API Rate Limiting
Controls imposed by API providers that restrict how many requests you can make within a given time period. Rate limits are typically expressed as requests per minute and tokens per minute.
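The standard client-side response to rate limits is exponential backoff with jitter: wait progressively longer between retries, randomized so that many clients don't retry in lockstep. The helper below is a hypothetical sketch that computes the delay schedule (a real client would sleep for each delay before retrying the request).

```python
import random

def backoff_delays(retries, base=1.0, cap=60.0):
    # Exponential backoff with full jitter: for attempt n, wait a random
    # amount between 0 and min(cap, base * 2**n) seconds
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(retries)]
```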
Batch Processing
Running multiple AI model requests as a group rather than one at a time. Batch processing trades latency for throughput and often comes at a lower price per token.
Edge AI
Running AI models directly on local devices (phones, laptops, IoT sensors, vehicles) rather than sending data to cloud servers, reducing latency and keeping data private.
Federated Learning
A training approach where the model goes to the data instead of the data going to the model. Multiple devices or organizations train locally and share only model updates, never the raw data.
Inference
The process of running a trained model to generate predictions or outputs from new inputs. In the context of LLMs, every API call that produces a completion is inference.
LangChain
An open-source framework for building applications with large language models. LangChain provides abstractions for prompts, chains, agents, memory, and retrieval.
Latency
The time delay between sending a request to an AI model and receiving the response. In LLM applications, latency is often measured as time-to-first-token and total generation time.
OpenAI API
The programmatic interface for accessing OpenAI's language models (GPT-4, GPT-4o, o1, and others). The API allows developers to send prompts and receive completions programmatically, billed per token.
Prompt Caching
An optimization where the API provider stores the processed representation of frequently repeated prompt prefixes, cutting cost and latency for requests that share the same beginning.
Quantization
A technique that reduces model size and memory usage by representing weights with fewer bits — for example, converting 16-bit floats to 8-bit integers — usually with only a small loss in quality.
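A minimal sketch of symmetric int8 quantization: choose a scale so the largest weight maps to 127, round every weight to an integer, and multiply by the scale to recover approximate floats. Real libraries add per-channel scales and calibration, but this is the core arithmetic.

```python
def quantize_int8(weights):
    # Symmetric quantization: map floats to [-127, 127] via a shared scale
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate floats; small rounding error is the quality cost
    return [q * scale for q in quantized]
```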
Streaming
A technique where model responses are delivered token by token as they're generated, rather than waiting for the complete response, greatly improving perceived responsiveness.
Throughput
The number of tokens or requests an AI system can process per unit of time. High throughput means handling more users or more data in the same amount of time.
Tokenizer
The component that converts raw text into the sequence of tokens a model can process, and converts tokens back into text after generation.
Vector Database
A specialized database designed to store, index, and query high-dimensional vectors (embeddings). Vector databases power semantic search, recommendation, and RAG applications.
Go Deeper
Our complete prompt engineering guide covers these concepts in practice, with real-world examples and techniques you can use today.
Read the Guide →