Essential technical terminology, explained clearly. Designed for researchers, students, and human-centered education.

AGI (Artificial General Intelligence)
A hypothetical AI system that can perform most intellectual tasks at a human level across many domains.
AI agent
A software system that can observe, reason, and take actions to achieve a goal, often using tools and memory.

AI alignment
The work of making AI systems behave according to human intentions, values, and safety constraints.

AI governance
Policies, standards, and oversight mechanisms that guide how AI is developed and used in society.

AI safety
A field focused on reducing harmful behavior, failures, and misuse risks in AI systems.

Algorithm
A defined set of rules or steps that a computer follows to solve a problem or complete a task.

Algorithmic bias
Systematic unfairness in model outputs caused by skewed data, assumptions, or modeling choices.

Algorithmic transparency
How clearly an AI system's logic, data sources, and limitations are documented and understandable.

Annotation
Human-added labels or metadata used to train or evaluate machine learning models.

API (Application Programming Interface)
A structured way for one software system to send requests to and receive responses from another system.

Artificial intelligence (AI)
The broad field of building systems that perform tasks requiring pattern recognition, reasoning, language, or decision-making.

Attention
A model component that dynamically focuses on relevant parts of an input when producing an output.

Autonomous system
A system that can make decisions and act with limited or no direct human control in real time.

Backpropagation
The core training algorithm that updates model weights by propagating prediction errors backward through the network.

Baseline
A simple reference model used to check whether more complex approaches actually improve results.

Benchmark
A standardized test or dataset used to measure and compare model performance.

Bias
A consistent pattern of error or unfairness in data or model behavior.

Big data
Very large and complex datasets that require scalable storage and processing techniques.

Black box
A model whose internal reasoning is difficult for humans to interpret directly.

Calibration
How well a model's confidence scores match actual correctness probabilities.

Chain-of-thought
A reasoning style where an AI model decomposes a problem into intermediate steps.

Classification
A task where a model assigns an input to one or more predefined categories.

Classifier
A model designed specifically for classification tasks.

CLIP
A multimodal model architecture that learns shared representations between text and images.

Compute
The processing resources required to train and run models, often measured in FLOPs or GPU hours.

Computer vision
The branch of AI that extracts meaning from images and video.

Context window
The maximum number of input tokens a language model can process at once.

Continual learning
Training approaches that let a model keep learning from new data without forgetting prior knowledge.

Convolutional neural network (CNN)
A neural architecture optimized for processing grid-like data such as images.

Cross-entropy loss
A common objective function used to train classification models by penalizing low probability assigned to the correct class.
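This objective is cross-entropy loss. A minimal sketch in plain Python (toy probabilities; the function name is invented for illustration):

```python
import math

def cross_entropy(probs, true_class):
    # Negative log of the probability assigned to the correct class:
    # confident correct answers cost little, confident mistakes cost a lot.
    return -math.log(probs[true_class])

good = cross_entropy([0.05, 0.90, 0.05], true_class=1)  # ≈ 0.105
bad = cross_entropy([0.90, 0.05, 0.05], true_class=1)   # ≈ 3.0
```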
Data augmentation
Techniques that create modified training examples to improve model generalization.

Data drift
A shift in real-world input data over time that can degrade model performance.

Data labeling
The process of assigning tags or target outputs to raw data for supervised learning.

Dataset
A collection of structured or unstructured examples used for training, validation, or testing.

Decision boundary
The surface in feature space that separates the classes predicted by a classifier.

Decision tree
A model that makes predictions through a sequence of if-then feature splits.

Deep learning
A subset of machine learning that uses many-layer neural networks for representation learning.

Diffusion model
A generative architecture that learns to reverse noise to synthesize images, audio, or other content.

Distillation
Compressing knowledge from a large teacher model into a smaller student model.

Domain adaptation
Methods that transfer a model trained in one domain to perform better in another domain.

Embedding
A numeric vector representation that captures the semantic meaning of text, images, or other data.

Encoder
The component of a model that transforms input into latent representations.

Ensemble
Combining predictions from multiple models to improve robustness or accuracy.

Evaluation set
A held-out dataset used to measure model quality after training.

Explainability
The degree to which a model's behavior can be interpreted and explained to humans.

False negative
An incorrect prediction where a model misses a true positive case.

False positive
An incorrect prediction where a model flags a negative case as positive.

Feature
An input variable used by a model to make predictions.

Feature engineering
Designing or transforming input variables to make learning easier and more effective.

Feature extraction
Converting raw data into informative features that a model can use.

Few-shot learning
Learning or adapting behavior from only a small number of examples.

Fine-tuning
Continuing training on domain-specific data to adapt a pre-trained model to a specific task.

Foundation model
A large pre-trained model that can be adapted to many downstream tasks.

Function calling
A model capability to generate structured calls that trigger external tools or APIs.

GAN (Generative Adversarial Network)
A generative setup where a generator and a discriminator train against each other.

Generalization
How well a model performs on new, unseen data outside the training set.

Generative AI
AI systems that produce new content such as text, images, audio, video, or code.

Gradient
A vector showing how much each parameter should change to reduce the loss.

Gradient descent
An optimization method that updates parameters in the direction that reduces error.
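This update rule is gradient descent. A bare-bones one-variable sketch (illustrative only; real training differentiates a loss over millions of parameters):

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        # Move opposite the gradient, scaled by the learning rate.
        x -= learning_rate * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)  # → ≈ 3.0
```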
Ground truth
Trusted reference labels used to train or evaluate model outputs.

Guardrails
Rules, checks, and controls that limit unsafe or undesired model behavior.

Hallucination
When a model generates fluent but false or unsupported information.

Human-in-the-loop
A workflow where humans review, guide, or override AI outputs.

Hyperparameter
A configuration value set before training, such as learning rate, batch size, or network depth.

In-context learning
A model's ability to follow patterns from examples provided directly in the prompt.

Inference
The runtime phase where a trained model generates predictions or outputs.

Inference cost
The amount of processing power consumed while producing each response.

Instruction tuning
Fine-tuning a model on instruction-response pairs to improve task following.

Intent classification
Predicting the user's purpose from a text query so it can be routed correctly.

Jailbreak
A prompt technique intended to bypass a model's safety constraints.

Knowledge cutoff
The latest point in time reflected in a model's training data.

Knowledge distillation
Training a smaller model to imitate the outputs of a larger model.

Knowledge graph
A graph structure of entities and relationships used for reasoning or retrieval.

Label smoothing
A regularization method that softens hard labels to improve generalization.
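This softening of hard labels can be written in one line over a one-hot target (the epsilon value here is illustrative):

```python
def smooth_labels(one_hot, epsilon=0.1):
    # Move epsilon of the probability mass toward the uniform distribution,
    # so the model is never trained toward exact 0/1 certainty.
    k = len(one_hot)
    return [(1 - epsilon) * p + epsilon / k for p in one_hot]

smooth_labels([0.0, 1.0, 0.0])  # → [0.0333..., 0.9333..., 0.0333...]
```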
Latency
The time between sending a request and receiving the model's output.

Large language model (LLM)
A language model trained on massive text corpora to generate and analyze text.

Learning rate
A training hyperparameter controlling how much the parameters change at each update step.

LoRA (Low-Rank Adaptation)
A parameter-efficient fine-tuning method that adds low-rank adapter matrices.

Loss function
A mathematical objective that quantifies prediction error during training.

Machine learning
Methods that allow systems to learn patterns from data and improve over time.

Memory
Stored context an AI agent uses across steps or sessions to improve continuity.

Mixture of experts (MoE)
An architecture with specialized subnetworks where only selected experts run for each input.

Model card
Documentation describing a model's intended use, metrics, limitations, and risks.

Model drift
Performance degradation over time as real-world conditions diverge from training assumptions.

Model quantization
Reducing the numeric precision of model weights to decrease memory and inference cost.

Multimodal model
A model that can process or generate multiple data types such as text, images, and audio.

Named entity recognition (NER)
An NLP task that identifies entities such as people, places, dates, or organizations.

Natural language processing (NLP)
The branch of AI focused on understanding and generating human language.

Neural network
A layered computational model inspired by biological neurons and synapses.

Normalization
Transforming values to a consistent scale to improve optimization stability.
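A common instance is z-score normalization, sketched here with the standard library:

```python
import statistics

def z_score(values):
    # Rescale to zero mean and unit standard deviation.
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

z_score([10, 20, 30])  # → [-1.2247..., 0.0, 1.2247...]
```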
OCR (Optical Character Recognition)
Technology that converts text in images or scans into machine-readable text.

Open-source model
A model released with public weights or code for inspection, adaptation, and reuse.

Overfitting
When a model memorizes training data and performs poorly on unseen inputs.

Parameter
A learned weight inside a model that influences its outputs.

Parameter-efficient fine-tuning (PEFT)
Methods that adapt models by training only a small set of added parameters.

Perplexity
A language-model metric measuring how surprised the model is by the true next tokens.
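Concretely, perplexity is the exponential of the average negative log-probability the model assigned to the true tokens. A toy computation:

```python
import math

def perplexity(true_token_probs):
    nll = [-math.log(p) for p in true_token_probs]
    return math.exp(sum(nll) / len(nll))

perplexity([0.25, 0.25, 0.25, 0.25])  # → 4.0, as uncertain as a fair 4-way guess
perplexity([0.9, 0.8, 0.95])          # → ≈ 1.13, the model is rarely surprised
```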
Pipeline
An ordered workflow of preprocessing, model steps, and postprocessing stages.

Precision
The proportion of predicted positives that are actually correct.

Pre-training
Initial large-scale model training on broad data before downstream adaptation.

Prompt
The input instructions and context provided to a generative model.

Prompt engineering
Designing prompts to improve output quality, reliability, and controllability.

Prompt injection
An attack pattern where malicious instructions are inserted into model inputs or retrieved content.

Pruning
Removing less important model weights or neurons to reduce size and compute.

Quantization
Converting model weights to lower-precision formats such as 8-bit or 4-bit.
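A sketch of symmetric int8 quantization (assumes at least one nonzero weight; production libraries add per-channel scales and calibration):

```python
def quantize_int8(weights):
    # Map the largest-magnitude weight to ±127 and round the rest.
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

q, scale = quantize_int8([0.02, -0.50, 0.31])
# dequantize(q, scale) recovers the weights to within one rounding step.
```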
RAG (Retrieval-Augmented Generation)
A method that retrieves external knowledge and feeds it into generation at inference time.

Recall
The proportion of actual positives that a model correctly identifies.
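Precision (defined above) and recall are easy to compute together from sets of predicted and actual positives (the item IDs below are invented):

```python
def precision_recall(predicted, actual):
    true_positives = len(predicted & actual)
    return true_positives / len(predicted), true_positives / len(actual)

# The model flags items 1-4 as positive; items 2-7 are truly positive.
p, r = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6, 7})
# p → 0.75 (3 of 4 flags were right), r → 0.5 (3 of 6 positives were found)
```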
Recommendation system
A model pipeline that predicts user preferences for ranking content or products.

Red teaming
Stress-testing an AI system with adversarial prompts to reveal failures and risks.

Reinforcement learning
Training by reward signals, where an agent learns actions that maximize long-term return.

Reinforcement learning from human feedback (RLHF)
A training method that uses human preference signals to shape model behavior.

Retrieval
Finding relevant documents or records from a knowledge source for a query.

Reward model
A model that scores outputs based on preference signals, often used in RLHF pipelines.

Robustness
A model's ability to maintain performance under noise, distribution shifts, or adversarial inputs.

Safety filter
A moderation layer that blocks or rewrites unsafe model inputs or outputs.

Scaling laws
An empirical relationship showing how performance improves with model size, data, or compute.

Semantic search
Search that matches meaning rather than exact keyword overlap, often using embeddings.
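The usual mechanism is nearest-neighbor search over embeddings by cosine similarity. A toy sketch with 3-dimensional vectors (real embeddings have hundreds of dimensions produced by an embedding model; the documents and numbers here are invented):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.2],
}
query_vector = [0.8, 0.2, 0.1]
best = max(docs, key=lambda d: cosine_similarity(query_vector, docs[d]))
# best → "refund policy"
```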
Self-supervised learning
Learning representations from unlabeled data by predicting masked or transformed parts of the input.

Sentiment analysis
An NLP task that classifies the emotional tone or opinion in text.

Small language model (SLM)
A compact language model optimized for lower latency, cost, or on-device usage.

Sparse model
A model where many parameters are zero or inactive to reduce computation.

Supervised learning
Training a model with labeled examples that map inputs to known outputs.

Synthetic data
Artificially generated data used to augment, simulate, or protect sensitive training data.

System prompt
A high-priority instruction that sets behavior, policy, and response style for a model.

Temperature
A sampling setting controlling randomness in generated outputs.

Token
A chunk of text processed by language models, such as a word piece or symbol.

Tokenization
The process of splitting text into tokens for model input.
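A greedy longest-match sketch of subword tokenization (real tokenizers such as BPE learn their vocabulary from data; this vocabulary is invented):

```python
def tokenize(text, vocab):
    tokens, i = [], 0
    while i < len(text):
        # Take the longest vocabulary piece that matches at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

tokenize("unbreakable", {"un", "break", "able"})  # → ['un', 'break', 'able']
```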
Tool use
A model's ability to call external tools such as search, calculators, or APIs.

Top-k sampling
A decoding strategy that samples only from the k most likely next tokens.

Top-p sampling
A decoding strategy that samples from the smallest token set whose probabilities sum to p.
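Temperature, top-k, and top-p all reshape the next-token distribution before sampling. A sketch combining temperature scaling with top-p filtering (token strings and logit values are invented):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_p=0.9, rng=random):
    # Temperature scaling: lower values sharpen the distribution.
    total = sum(math.exp(l / temperature) for l in logits.values())
    probs = sorted(((tok, math.exp(l / temperature) / total)
                    for tok, l in logits.items()), key=lambda x: -x[1])
    # Nucleus (top-p): keep the smallest prefix whose mass reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in probs:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    tokens, weights = zip(*kept)
    return rng.choices(tokens, weights=weights)[0]

logits = {"the": 4.0, "a": 3.0, "zebra": -2.0}
sample_next_token(logits, temperature=0.7)  # "the" or "a"; "zebra" is cut
```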
Transfer learning
Applying knowledge learned in one task or domain to improve another task.

Transformer
A neural architecture that uses attention to model relationships across sequences in parallel.

Training loss
The model error value computed during training and optimized downward over time.

Unsupervised learning
Learning patterns from unlabeled data without explicit target outputs.

Validation set
A dataset used during development to tune models and prevent overfitting.

Vector database
A database optimized for storing and querying high-dimensional embedding vectors.

Vision-language model
A multimodal model that jointly processes visual and textual information.

Weak supervision
Using noisy, heuristic, or partial labels to train models when clean labels are scarce.

Weight
A learned numeric value that scales signals passing through a neural network.

Word embedding
A dense vector representation of words capturing semantic relationships.

XAI (Explainable AI)
Techniques and practices for making AI predictions more transparent and understandable.

Zero-shot learning
Solving tasks without task-specific examples by relying on prior general knowledge.

Agentic workflow
A multi-step process where an AI system plans, executes, checks results, and iterates toward a goal.

AI Act
The European Union's risk-based regulatory framework for AI systems and providers.

Alignment tax
The extra cost in time, compute, or product velocity required to make systems safer and more controllable.

Benchmark contamination
When benchmark test examples or close variants are present in training data, inflating reported performance.

Causal inference
Methods for estimating cause-and-effect relationships rather than simple correlations.

Confidence interval
A statistical range that likely contains the true value of a measured model metric.
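One common way to get such a range for an accuracy score is the bootstrap: resample the evaluation results many times and take the middle 95% of the resampled means (the dataset and numbers below are invented):

```python
import random
import statistics

def bootstrap_ci(outcomes, n_resamples=2000, seed=0):
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(outcomes, k=len(outcomes)))
        for _ in range(n_resamples)
    )
    return means[int(0.025 * n_resamples)], means[int(0.975 * n_resamples)]

# 1 = correct, 0 = incorrect, over a 20-example evaluation set.
low, high = bootstrap_ci([1] * 16 + [0] * 4)
# The point estimate is 0.8; the width of (low, high) shows how
# unstable a 20-example evaluation really is.
```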
Constitutional AI
A training and behavior-shaping approach where model outputs are guided by a fixed set of written principles.

Data lineage
A record of where data came from, how it was transformed, and where it is used.

Data provenance
The documented origin, ownership, and history of a dataset or model artifact.

Differential privacy
A privacy technique that adds statistical noise so individual records cannot be reliably inferred from outputs.
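The classic instance is the Laplace mechanism for counts: noise scaled to sensitivity/epsilon hides the contribution of any single record (a sketch; production systems also track a cumulative privacy budget):

```python
import random

def private_count(true_count, epsilon=1.0, rng=random):
    # One record changes a count by at most 1, so sensitivity = 1.
    scale = 1.0 / epsilon
    # A Laplace(0, scale) sample is the difference of two exponentials.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise

private_count(1042, epsilon=0.5)  # close to 1042, but deliberately not exact
```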
Distilled model
A smaller model trained to imitate a larger model's behavior while using less compute at inference.

Embedding model
A model specialized for converting data into vectors used for semantic search, clustering, and retrieval.

Evaluation harness
A repeatable evaluation framework that runs prompts, datasets, and scoring logic across model versions.

Feature store
A managed system for storing and serving validated ML features consistently for training and inference.

Groundedness
The degree to which an AI response is supported by source data or retrieved evidence.
Guided decoding
A generation strategy that constrains output tokens to valid structures or policy-compliant choices.

Human preference model
A model trained on human rankings to predict which responses users are likely to prefer.
Inference endpoint
A deployed API interface that receives model requests and returns predictions in production.

Knowledge base
A curated collection of documents or records used for retrieval, support automation, or grounding responses.

Latent space
A compressed representational space where similar concepts are positioned near each other as vectors.

Model registry
A central catalog for versioning, approving, and tracking models across environments.

On-device AI
AI inference performed locally on user hardware rather than in a remote cloud service.

Output parsing
Logic that validates and converts model output into strongly typed, machine-usable structures.

Prompt template
A reusable prompt pattern with variables, formatting rules, and task-specific instructions.

Retrieval precision
The proportion of retrieved items that are relevant to the user's query.

Safety case
A structured argument, supported by evidence, that an AI system is safe for a defined context of use.

Shadow deployment
Running a model in parallel with production traffic without affecting user-facing decisions.

Structured output
Model output constrained to a defined schema such as JSON, tool arguments, or typed fields.

Test-time compute
Additional inference computation used during response generation to improve quality or reasoning.

Trust calibration
Aligning user confidence in AI outputs with the system's actual reliability for each task.

Usage-based pricing
Pricing where costs scale with API calls, tokens, inference time, or consumed compute.

Zero data retention
A policy where request/response payloads are not stored after processing beyond short-lived operational windows.