AI & Agent Development Glossary

50 terms covering AI agents, LLMs, and developer infrastructure. Each definition is self-contained and quotable.

A

A/B Testing

A/B testing compares two or more variants of a system by randomly assigning users to groups and testing whether differences in predefined outcome metrics are statistically significant.
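
As a rough illustration of the significance check, the sketch below applies a two-proportion z-test to conversion counts from two variants. The function name and the sample counts are hypothetical, and a z-test is only one of several tests a team might choose.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 100/1000 conversions for A, 150/1000 for B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value (for example below a threshold fixed before the experiment) suggests the observed difference is unlikely to be chance alone.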

MLOps

Agent Harness

An agent harness is the runtime environment that manages an AI agent's execution loop, tool access, permission boundaries, memory persistence, and conversation state.

Developer Tools

Agent Loop

An agent loop is the iterative cycle of observe, reason, act, and evaluate that an AI agent repeats until it completes a task or reaches a termination condition.
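
The cycle can be sketched with a toy task, assuming the "agent" only has to count up to a target. The function name, the task, and the step limit are all hypothetical; a real loop would call a model in the reason step and tools in the act step.

```python
def run_agent_loop(target, max_steps=10):
    """Count up to `target`, one loop iteration per step (toy task)."""
    state = 0
    trace = []
    for step in range(max_steps):
        observation = state                                        # observe
        action = "increment" if observation < target else "stop"   # reason
        if action == "increment":                                  # act
            state += 1
        trace.append((step, observation, action))
        if state >= target:                                        # evaluate
            return state, trace   # task complete
    return state, trace           # termination condition: step limit reached
```

The step limit plays the role of the termination condition: the loop ends even if the task is never completed.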

AI Agent Development

Agent Orchestration

Agent orchestration is the coordination layer that manages how multiple AI agents communicate, share context, delegate tasks, and resolve conflicts within a system.

AI Agent Development

Agentic AI

Agentic AI refers to artificial intelligence systems that autonomously plan, execute, and adapt multi-step tasks toward a goal without requiring human intervention at each step.

AI Agent Development

AI Agent Memory

AI agent memory is the system that persists information across interactions, enabling agents to recall past context, learn from experience, and maintain continuity between sessions.

AI Agent Development

AI Alignment

AI alignment is the research field dedicated to ensuring artificial intelligence systems reliably pursue goals that match human intentions, values, and ethical principles.

AI Safety

AI Coding Agent

An AI coding agent is an autonomous software development assistant that can read codebases, write code, run tests, debug errors, and commit changes with minimal human direction.

Developer Tools

AI Guardrails

AI guardrails are programmatic constraints and validation layers that prevent AI systems from generating harmful, off-topic, or policy-violating outputs during production use.

AI Safety

Attention Mechanism

An attention mechanism allows neural networks to dynamically focus on relevant parts of the input when producing each element of the output, weighting information by learned importance.
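
One common form is scaled dot-product attention. The sketch below computes it for a single query in plain Python, assuming the query, key, and value vectors are given; in a real model they come from learned projections, which are omitted here.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Return (output vector, attention weights) for one query."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights

output, weights = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

Here the query matches the first key more closely, so the first value vector receives the larger weight.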

LLM Architecture

B

Blue-Green Deployment

Blue-green deployment maintains two identical production environments and switches traffic between them to enable zero-downtime releases with instant rollback capability.
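
The traffic switch can be pictured as a single pointer that names the live environment. The class below is a hypothetical in-memory sketch, not a real load balancer API.

```python
class BlueGreenRouter:
    """Tracks two environments and which one receives traffic."""

    def __init__(self, blue_version, green_version=None):
        self.environments = {"blue": blue_version, "green": green_version}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        # New releases go to the environment that is not serving traffic.
        self.environments[self.idle] = version

    def switch(self):
        # Flip traffic; calling this again is the rollback.
        self.live = self.idle

    def serving(self):
        return self.environments[self.live]
```

Because the previous version stays deployed in the idle environment, rolling back is the same operation as releasing.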

DevOps/CI-CD

C

Canary Release

A canary release gradually routes a small percentage of production traffic to a new version while monitoring for errors before expanding to all users.
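
A minimal sketch of percentage-based routing, assuming each request carries a stable user ID. The function name and bucket scheme are hypothetical; production systems usually do this in a load balancer or service mesh.

```python
import hashlib

def route_version(user_id, canary_percent):
    """Return "canary" for roughly `canary_percent` of users, else "stable"."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Hashing gives each user a fixed bucket from 0 to 99.
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"
```

Because a user's bucket never changes, raising the percentage only adds users to the canary group; nobody is moved back to the stable version mid-rollout.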

DevOps/CI-CD

Container Orchestration

Container orchestration automates the deployment, scaling, networking, and lifecycle management of containerized applications across clusters of machines.

Cloud Infrastructure

Content Delivery Network

A content delivery network (CDN) distributes cached copies of web content across geographically dispersed servers to reduce latency and improve load times for users worldwide.

Cloud Infrastructure

Context Engineering

Context engineering is the practice of designing and optimizing the information provided to a language model to maximize the relevance, accuracy, and efficiency of its outputs.

LLM Infrastructure

Context Window

A context window is the maximum number of tokens a language model can process in a single input-output interaction, encompassing both the prompt and the generated response.
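
The shared budget can be sketched as simple arithmetic. The helpers below are hypothetical, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
def fits_context(prompt_tokens, max_output_tokens, context_window):
    """True if the prompt plus the reserved response space fits the window."""
    return prompt_tokens + max_output_tokens <= context_window

def trim_history(messages, budget, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined token count fits `budget`."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    # Restore chronological order.
    return list(reversed(kept))
```

Dropping the oldest messages first is one simple policy; summarizing them instead is another.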

LLM Infrastructure

Continuous Deployment

Continuous deployment automatically releases every code change that passes automated testing directly to production without manual approval gates.

DevOps/CI-CD

D

Data Pipeline

A data pipeline is an automated sequence of processing steps that ingests, transforms, validates, and delivers data from source systems to destination systems for analysis or model training.

MLOps

E

Edge Computing

Edge computing processes data at or near the source of data generation rather than in a centralized data center, reducing latency and bandwidth consumption.

Cloud Infrastructure

Embedding

An embedding is a dense numerical vector representation of text, images, or other data that captures semantic meaning in a format suitable for mathematical comparison and retrieval.
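
Mathematical comparison usually means a similarity measure such as cosine similarity. The sketch below uses hand-made three-dimensional vectors as hypothetical embeddings; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, chosen by hand for illustration.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.1, 0.0, 0.9]
```

With these vectors, `cosine_similarity(cat, kitten)` is much higher than `cosine_similarity(cat, car)`, which is the property retrieval systems rely on.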

Machine Learning

Experiment Tracking

Experiment tracking systematically records machine learning training runs including hyperparameters, metrics, artifacts, and code versions to enable comparison and reproducibility.

MLOps

F

Feature Flags

Feature flags are conditional switches in code that enable or disable functionality at runtime without deploying new code, decoupling deployment from feature release.
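
A minimal sketch, assuming flags live in an in-memory store. The class, the flag name `new_discount`, and the pricing rule are hypothetical; real systems typically read flags from a configuration service.

```python
class FeatureFlags:
    """In-memory flag store; unknown flags default to off."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)

    def is_enabled(self, name):
        return self._flags.get(name, False)

def checkout_total(subtotal, flags):
    # The new code path ships in the same deployment but stays dark
    # until the flag is turned on.
    if flags.is_enabled("new_discount"):
        return round(subtotal * 0.9, 2)
    return subtotal
```

Flipping the flag at runtime changes behavior without a new deployment, which is the decoupling the definition describes.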

DevOps/CI-CD

Fine-Tuning

Fine-tuning is the process of further training a pre-trained language model on a domain-specific dataset to improve its performance on targeted tasks without training from scratch.

Machine Learning

Function Calling

Function calling is an LLM capability that allows models to generate structured JSON arguments for predefined functions, enabling AI to interact with external systems and APIs.
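
On the application side, the structured output has to be parsed and dispatched to real code. The sketch below assumes the model returns a JSON object with `name` and `arguments` fields; that shape, the `get_weather` tool, and its return value are hypothetical, and providers differ in the exact format.

```python
import json

def get_weather(city, unit="celsius"):
    """Stand-in tool; a real one would call an external API."""
    return {"city": city, "unit": unit, "temperature": 21}

# Registry of functions the model is allowed to call.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json):
    """Parse a model-generated tool call and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# Hypothetical model output.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(model_output)
```

The model never executes anything itself; it only proposes the call, and the application decides whether and how to run it.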

AI Agent Development

G

Generative Engine Optimization (GEO)

Generative engine optimization is the practice of structuring web content to maximize its likelihood of being cited, quoted, or referenced by AI systems when generating answers.

Search & Discovery

GitOps

GitOps is an operational framework that uses Git repositories as the single source of truth for declarative infrastructure and application configuration with automated reconciliation.
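
Automated reconciliation amounts to comparing the declared state with the running state and acting on the difference. The sketch below assumes both states are plain dictionaries; the function and resource names are hypothetical.

```python
def reconcile(desired, actual):
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions

# `desired` would be read from the Git repository, `actual` from the cluster.
desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
actual = {"web": {"replicas": 1}, "worker": {"replicas": 1}}
```

Running the comparison on a schedule or on every commit is what keeps the live system converging on what Git declares.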

DevOps/CI-CD

H

Hallucination

Hallucination in AI refers to model outputs that are fluent and confident but factually incorrect, unsupported by training data, or inconsistent with provided context.

AI Safety

I

Inference

Inference is the process of running a trained machine learning model on new input data to generate predictions, classifications, or text outputs, either in real time or in batch.

LLM Infrastructure

Infrastructure as Code

Infrastructure as Code (IaC) manages and provisions computing infrastructure through machine-readable configuration files rather than manual processes or interactive tools.

Cloud Infrastructure

K

KV Cache

KV cache is a mechanism that stores the key and value tensors computed by attention layers for earlier tokens during language model inference, so they are not recomputed as each new token is generated.
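
The saving can be seen by counting how many token positions need their keys and values computed. In this toy sketch, `project` is a hypothetical stand-in for the learned key and value projections, and the attention step itself is omitted.

```python
def project(embedding, scale):
    """Stand-in for a learned key or value projection (hypothetical)."""
    return [scale * x for x in embedding]

def generate_without_cache(embeddings):
    """Recompute keys and values for every earlier token at each step."""
    keys, values = [], []
    positions_projected = 0
    for step in range(1, len(embeddings) + 1):
        keys = [project(e, 2.0) for e in embeddings[:step]]
        values = [project(e, 3.0) for e in embeddings[:step]]
        positions_projected += step
    return keys, values, positions_projected

def generate_with_cache(embeddings):
    """Project only the newest token and append it to the cache."""
    keys, values = [], []
    positions_projected = 0
    for e in embeddings:
        keys.append(project(e, 2.0))
        values.append(project(e, 3.0))
        positions_projected += 1
    return keys, values, positions_projected
```

For n tokens the uncached version projects n(n+1)/2 positions while the cached version projects n, at the cost of keeping the cache in memory.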

LLM Infrastructure

L

Large Language Model (LLM)

A large language model is a neural network trained on massive text datasets that generates human-like text by repeatedly predicting the next token in a sequence.

Machine Learning

M

MCP Server

An MCP server is a lightweight program that exposes tools, resources, and prompts to AI applications through the Model Context Protocol's standardized client-server interface.

Developer Tools

Mixture of Experts

Mixture of Experts (MoE) is a neural network architecture that routes each input to a subset of specialized sub-networks, enabling massive model capacity with efficient per-token computation.
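
Routing is often done with top-k gating: only the k highest-scoring experts run, and their outputs are mixed by normalized gate weights. In the sketch below the gate scores are passed in directly and the experts are trivial functions; both are hypothetical stand-ins for learned components.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_scores, experts, k=2):
    """Run only the top-k experts and mix their outputs by gate weight."""
    ranked = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen

# Hypothetical experts; real ones are feed-forward sub-networks.
EXPERTS = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x ** 2]
```

With four experts and k=2, half of the expert parameters are skipped for each input, which is where the per-token efficiency comes from.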

LLM Architecture

Model Context Protocol (MCP)

Model Context Protocol is an open standard that defines how AI applications connect to external data sources and tools through a unified client-server interface.

Developer Tools

Model Distillation

Model distillation transfers knowledge from a large teacher model to a smaller student model by training the student to match the teacher's output distributions rather than only hard labels.
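
Matching output distributions is commonly measured with a KL divergence between temperature-softened probabilities. The sketch below shows that loss for one example; the function names, logits, and temperature are hypothetical, and training code would minimize this over a dataset.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.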

LLM Architecture

Model Registry

A model registry is a centralized repository that stores, versions, and manages machine learning model artifacts along with their metadata, lineage, and deployment status.

MLOps

Model Serving

Model serving deploys trained machine learning models as production services that accept inference requests and return predictions with low latency and high availability.

MLOps

Multi-Agent System

A multi-agent system is an architecture where multiple specialized AI agents collaborate, communicate, and coordinate to solve problems that exceed any single agent's capabilities.

AI Agent Development

Multimodal AI

Multimodal AI refers to systems that can process, understand, and generate content across multiple data types including text, images, audio, and video within a unified model.

Machine Learning

P

Prompt Engineering

Prompt engineering is the practice of crafting and refining instructions given to language models to elicit accurate, relevant, and properly formatted outputs for specific tasks.

Machine Learning

Q

Quantization

Quantization reduces neural network memory usage and accelerates inference by converting model weights, and sometimes activations, from high-precision floating point to lower-precision representations such as 8-bit or 4-bit integers.
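
A minimal sketch of symmetric 8-bit quantization, assuming one scale factor for the whole weight list. The function names are hypothetical, and production schemes usually work per channel or per block.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Each restored weight differs from the original by at most half a quantization step, which is the precision traded for the smaller footprint.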

LLM Architecture

R

Red Teaming

Red teaming in AI involves systematically probing AI systems for vulnerabilities, biases, and failure modes by simulating adversarial attacks and edge-case scenarios.

AI Safety

Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation is an architecture that enhances language model outputs by retrieving relevant documents from external knowledge sources and including them in the model's context.
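
The retrieve-then-prompt flow can be sketched without a model. Word overlap stands in for embedding similarity here, and the function names, documents, and prompt layout are hypothetical.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by how many query words they share (toy scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, documents, top_k=1):
    """Place the retrieved text ahead of the question in the model's context."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"

DOCS = [
    "Blue-green deployment switches traffic between two environments.",
    "A vector database indexes embedding vectors.",
]
prompt = build_prompt("what does a vector database index", DOCS)
```

The resulting prompt would then be sent to the language model, which answers using the retrieved passage rather than its training data alone.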

Search & Discovery

RLHF

RLHF (Reinforcement Learning from Human Feedback) trains AI models to align with human preferences, typically by training a reward model on human preference judgments and using it as the reward signal to fine-tune model behavior.

AI Safety

S

Serverless Computing

Serverless computing is a cloud execution model where the provider dynamically allocates resources and bills only for actual compute time used during function invocations.

Cloud Infrastructure

Structured Content

Structured content is information organized with consistent formatting, semantic markup, and machine-readable metadata that enables automated processing by search engines and AI systems.

Search & Discovery

V

Vector Database

A vector database is a specialized storage system designed to index, store, and perform fast similarity searches over high-dimensional embedding vectors at scale.
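
The core operation is nearest-neighbor search. The class below is a hypothetical brute-force sketch that compares the query against every stored vector; real vector databases use approximate indexes to avoid that full scan at scale.

```python
import math

class TinyVectorIndex:
    """Stores unit-normalized vectors and searches them by cosine similarity."""

    def __init__(self):
        self._items = []  # list of (item_id, unit_vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def add(self, item_id, vector):
        self._items.append((item_id, self._normalize(vector)))

    def search(self, query, top_k=1):
        unit = self._normalize(query)
        # Dot product of unit vectors equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(unit, vec)), item_id)
                  for item_id, vec in self._items]
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:top_k]]
```

Normalizing at insert time means each search only needs dot products, a small example of the index-time work that makes queries fast.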

Search & Discovery
