Your comprehensive guide to AI terminology. From algorithms to neural networks, understand the language of artificial intelligence.
A step-by-step procedure or formula for solving a problem or accomplishing a task. In AI, algorithms are the instructions that tell computers how to learn from data.
Example: "Netflix uses recommendation algorithms to suggest shows you might like."
The simulation of human intelligence in machines that are programmed to think and learn like humans. AI encompasses various techniques including machine learning and deep learning.
Related: Machine Learning, Deep Learning, Neural Network
A technique that allows AI models to focus on specific parts of input data when generating output. It's like highlighting important words when reading.
Used in: Transformers, Large Language Models
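The mechanism can be sketched in a few lines of plain Python. This is a toy, single-query version of scaled dot-product attention with made-up 2-d vectors (real models use matrices with hundreds of dimensions and learned projections):

```python
import math

def softmax(xs):
    """Exponentiate and normalize so the scores sum to 1."""
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.
    Each key is scored against the query; the softmax of the scores
    weights the corresponding value vectors."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query matches the second key most closely, so the output
# leans toward the second value vector.
out = attention(query=[1.0, 0.0],
                keys=[[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]],
                values=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
```

The "highlighting" happens in `weights`: keys similar to the query get a larger share of the softmax, so their values dominate the output.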
A neural network that learns to compress and reconstruct data. It's trained to encode data into a smaller representation and then decode it back.
Use case: Image denoising, anomaly detection
Techniques that increase training data diversity by applying random transformations like rotation, flipping, or color changes to existing data.
Example: Rotating cat photos to create more training examples
The primary algorithm for training neural networks. It calculates gradients by propagating errors backward through the network, adjusting weights to minimize mistakes.
Think of it as: Learning from mistakes by working backwards through your thought process
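The chain rule behind backpropagation fits in a few lines for the smallest possible "network" — a single linear neuron with squared-error loss. The numbers here are arbitrary illustrations:

```python
def backprop_step(w, x, y, lr=0.1):
    """One forward + backward pass for y_hat = w * x with squared error."""
    y_hat = w * x                          # forward pass
    loss = (y_hat - y) ** 2
    d_loss_d_yhat = 2 * (y_hat - y)        # error signal at the output
    d_yhat_d_w = x                         # local derivative
    grad_w = d_loss_d_yhat * d_yhat_d_w    # chain rule: multiply backwards
    return w - lr * grad_w, loss           # updated weight, current loss

# Repeated steps shrink the loss: learn w so that w * 2 = 6 (w -> 3).
w, losses = 0.0, []
for _ in range(20):
    w, loss = backprop_step(w, x=2.0, y=6.0)
    losses.append(loss)
```

Deep networks do exactly this, but chain many local derivatives together, layer by layer, from the output back to the input.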
A constant value added to a neuron's weighted sum (the b in y = mx + b). It shifts the activation function so the model can better fit the data.
Also means: Systematic unfairness in AI systems due to training data
Extremely large datasets that can be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior.
Key characteristics: Volume, Velocity, and Variety (the 3 V's)
A layer in a neural network with significantly fewer neurons than surrounding layers, forcing compression of information.
Used in: Autoencoders, transfer learning
An AI program designed to simulate conversation with humans through text or voice interactions. Modern chatbots use NLP and LLMs for more natural dialogue.
Examples: ChatGPT, Claude, customer service bots
A supervised learning task where the goal is to categorize data into predefined classes or categories.
Example: Spam detection (spam vs. not spam)
An unsupervised learning technique that groups similar data points together without predefined categories.
Example: Customer segmentation for marketing
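A minimal k-means sketch shows the idea: no labels go in, but groups come out. The 2-d points are made up to form two obvious clusters:

```python
import random

def kmeans(points, k, iters=10, seed=0):
    """Minimal k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if nothing was assigned
                centroids[i] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids

# Two obvious groups: around (0, 0) and around (10, 10).
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans(data, k=2)
```

In real customer segmentation the "points" would be feature vectors (spend, visit frequency, etc.), and a library such as scikit-learn would handle the details.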
A deep learning architecture designed specifically for processing structured grid data like images. Uses filters to detect spatial features.
Used for: Image recognition, object detection, video analysis
A field of AI that trains computers to interpret and understand visual information from the world—images and videos.
Applications: Face recognition, self-driving cars, medical imaging
Creating modified versions of training data to increase dataset diversity and improve model generalization.
Techniques: Rotation, flipping, cropping, color adjustment
A subset of machine learning using neural networks with multiple layers (deep networks) to learn hierarchical representations of data.
Key difference: Can automatically discover features without manual feature engineering
A regularization technique that randomly "drops out" (ignores) neurons during training to prevent overfitting.
Analogy: Like randomly skipping study topics to prepare for unexpected exam questions
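A sketch of "inverted" dropout, the common variant: dropped activations become zero, survivors are scaled up so the expected sum is unchanged, and inference leaves everything alone. The activation values are arbitrary:

```python
import random

def dropout(activations, p=0.5, training=True, seed=None):
    """Zero each activation with probability p during training.
    Survivors are scaled by 1/(1-p) ("inverted dropout") so the
    expected total stays the same; at inference nothing is dropped."""
    if not training:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [0.5, 1.2, -0.3, 0.8]
dropped = dropout(acts, p=0.5, seed=42)      # some entries zeroed, rest doubled
kept = dropout(acts, training=False)         # inference: unchanged
```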
A dense vector representation of data (like words or images) where similar items are positioned close together in numerical space.
Example: "King" and "Queen" have similar embeddings because they have similar meanings
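"Close together" is usually measured with cosine similarity. This sketch uses hypothetical 3-d embeddings (real ones have hundreds of dimensions, learned from data):

```python
import math

def cosine_similarity(u, v):
    """Direction-based similarity between two embedding vectors:
    1.0 = same direction, 0.0 = unrelated, -1.0 = opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (math.sqrt(sum(a * a for a in u))
            * math.sqrt(sum(b * b for b in v)))
    return dot / norm

# Made-up embeddings: "king" and "queen" point the same way, "apple" doesn't.
king  = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
apple = [0.1, 0.2, 0.9]

royal_sim = cosine_similarity(king, queen)
fruit_sim = cosine_similarity(king, apple)
```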
One complete pass through the entire training dataset during model training. Multiple epochs mean seeing the data multiple times.
More epochs: Usually improves learning, but too many can cause overfitting
The process of collecting data from various sources, transforming it into a suitable format, and loading it into a data warehouse.
Used in: Data pipelines, machine learning preprocessing
A system of two neural networks competing against each other—one generates content, the other judges it—to create realistic synthetic data.
Use cases: Creating art, generating human faces, data augmentation
An optimization algorithm that iteratively adjusts model parameters to minimize the loss function—finding the best solution step by step.
Analogy: Rolling a ball downhill to find the lowest point
The correct or verified data used as a reference standard for training and evaluating machine learning models.
Example: Labeled images where humans have correctly identified all objects
When an AI model generates incorrect, misleading, or nonsensical information presented as fact. A major challenge in LLMs.
Prevention: Fact-checking, RAG (Retrieval-Augmented Generation)
Settings that control the learning process itself (like learning rate, number of layers), not learned from data but set by humans.
Tuning: Finding the best hyperparameter values through experimentation
The ability of AI to identify objects, people, places, and actions in images. The foundation of computer vision applications.
Applications: Facial recognition, medical diagnosis, autonomous vehicles
The process of using a trained model to make predictions or generate outputs on new, unseen data.
vs Training: Inference is using the model; training is creating it
AI models trained on massive amounts of text data to understand and generate human-like text. Examples include GPT-4, Claude, and Gemini.
Capabilities: Writing, coding, analysis, conversation, translation
A hyperparameter that controls how much to adjust model weights during training. Too high = unstable learning; too low = slow progress.
Analogy: Step size when hiking downhill to find the valley
A measure of how wrong a model's predictions are. The goal of training is to minimize this value.
Examples: Mean Squared Error, Cross-Entropy
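Both named losses are short formulas. This sketch implements them directly, with made-up numbers to show that worse predictions score higher:

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error: average squared gap between prediction and truth."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_prob):
    """Cross-entropy for 0/1 labels: punishes confident wrong predictions
    far more heavily than hesitant ones."""
    eps = 1e-12  # avoid log(0)
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_prob)) / len(y_true)

good = mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # perfect: loss is 0
bad  = mse([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])   # off by 2 twice
```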
A type of recurrent neural network designed to remember information over long sequences, mitigating the vanishing gradient problem.
Used for: Time series prediction, speech recognition, video analysis
A subset of AI where computers learn patterns from data without being explicitly programmed for specific tasks.
vs Traditional Programming: ML learns rules from data; traditional programming writes rules explicitly
The result of training a machine learning algorithm on data. A model makes predictions or decisions based on input data.
Analogy: A trained chef who can cook based on what they've learned
AI systems that can process and understand multiple types of data—text, images, audio, video—at the same time.
Examples: GPT-4V (vision), Gemini, Claude with vision
A branch of AI that helps computers understand, interpret, and generate human language in valuable ways.
Applications: Translation, sentiment analysis, chatbots, summarization
A computing system inspired by biological neural networks in the brain, consisting of interconnected nodes (neurons) that process information.
Structure: Input layer → Hidden layers → Output layer
The basic unit of a neural network that receives inputs, applies weights, sums them, and passes through an activation function.
Analogy: A single brain cell that processes information
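That whole pipeline — inputs, weights, sum, activation — is a one-liner. This sketch uses a sigmoid activation and arbitrary weights:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus a bias term,
    squashed through a sigmoid activation into the range (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation

# Strongly positive evidence pushes the output toward 1,
# strongly negative evidence toward 0.
high = neuron([1.0, 1.0], weights=[2.0, 2.0], bias=0.0)
low  = neuron([1.0, 1.0], weights=[-2.0, -2.0], bias=0.0)
```

A network is just many of these wired together, with the weights and bias learned during training.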
Training a model to recognize a category from just one or very few examples, mimicking human learning efficiency.
Use case: Facial recognition with minimal photos
When a model learns training data too well—including noise and outliers—so it performs poorly on new, unseen data.
Solution: Regularization, dropout, more training data, cross-validation
Internal variables within a model that are learned from training data. Weights and biases are examples of parameters.
vs Hyperparameter: Parameters are learned; hyperparameters are set manually
Using historical data, statistical algorithms, and machine learning to predict future outcomes.
Applications: Stock prediction, risk assessment, customer churn
The practice of crafting effective inputs (prompts) to get desired outputs from language models.
Techniques: Few-shot prompting, chain-of-thought, role prompting
A technique that combines language models with external knowledge bases to improve accuracy and reduce hallucinations.
Benefit: Grounding AI responses in verified information
A neural network designed for sequential data processing where outputs depend on previous computations (memory).
Used for: Time series, speech, text generation
A supervised learning task predicting continuous numerical values rather than categories.
Example: Predicting house prices, stock values, temperature
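The simplest regression — fitting a straight line with ordinary least squares — can be written out directly. The house-price numbers are invented so the line is exact:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m * x + b in one variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

# Hypothetical house sizes (sqm) vs. prices, constructed as price = 3 * size.
sizes  = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
m, b = fit_line(sizes, prices)
```

The model then predicts a continuous number for any new input: `m * 90 + b` estimates the price of a 90 sqm house.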
Training an agent to make decisions by rewarding desired behaviors and penalizing undesired ones through trial and error.
Used in: Game playing (AlphaGo), robotics, autonomous vehicles
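The trial-and-error loop is easiest to see on a multi-armed bandit, the simplest reinforcement-learning setting. This epsilon-greedy sketch (with made-up average rewards) usually exploits the best-looking action but sometimes explores a random one:

```python
import random

def epsilon_greedy_bandit(rewards, steps=5000, epsilon=0.1, seed=0):
    """Learn action values by trial and error: exploit the action with
    the best reward estimate, but explore randomly with probability epsilon."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)    # estimated value of each action
    counts = [0] * len(rewards)
    for _ in range(steps):
        if rng.random() < epsilon:                     # explore
            action = rng.randrange(len(rewards))
        else:                                          # exploit
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action] + rng.gauss(0, 0.1)   # noisy feedback
        counts[action] += 1
        # Running average of observed rewards for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Action 2 pays the most on average; the learned estimates reflect that.
est = epsilon_greedy_bandit(rewards=[0.2, 0.5, 0.9])
```

Full RL (as in AlphaGo) adds states and long-term planning on top of this same reward-driven loop.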
Using NLP to determine the emotional tone behind text—positive, negative, or neutral.
Applications: Brand monitoring, customer feedback, social media analysis
Training a model using labeled data where inputs are paired with correct outputs, teaching the model the relationship.
Example: Learning spam detection with labeled spam/not-spam emails
The basic unit of text that AI models process. A token can be a word, part of a word, or punctuation—typically ~4 characters.
Example: In some tokenizers, "ChatGPT" splits into "Chat" + "GPT" (2 tokens); the exact split varies by tokenizer
The maximum number of tokens a model can process in a single input/output, including both the prompt and response.
Context: GPT-4 Turbo supports a ~128k-token window; the window limits how much conversation history the model can consider at once
Using knowledge gained from one task to improve performance on a related but different task.
Example: Using an ImageNet model as a starting point for medical imaging
A neural network architecture using self-attention mechanisms, the foundation of modern LLMs and many NLP breakthroughs.
Examples: GPT, BERT, Claude, Gemini—all built on Transformers
A test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Origin: Proposed by Alan Turing in 1950
When a model is too simple to learn the underlying patterns in data, performing poorly on both training and new data.
Solution: More complex model, more features, more training time
Training on unlabeled data to find hidden patterns or structures without predefined categories.
Tasks: Clustering, dimensionality reduction, anomaly detection
Evaluating a model during training on a separate dataset to tune hyperparameters and prevent overfitting.
Data split: Training (70%) → Validation (15%) → Test (15%)
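The 70/15/15 split above is a few lines of shuffling and slicing. A sketch, using a fixed seed so the split is reproducible:

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle, then carve the dataset into train/validation/test parts."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# 100 examples -> 70 train, 15 validation, 15 test.
train, val, test = train_val_test_split(range(100))
```

The key property is that the three sets never overlap: the test set stays untouched until the very end.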
A database that stores data as mathematical vectors, enabling fast similarity search and retrieval for AI applications.
Use cases: Semantic search, recommendation systems, RAG
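At its core, the lookup is "find the stored vectors most similar to a query vector". This brute-force sketch (with made-up 2-d embeddings) shows the operation a vector database accelerates with specialized indexes such as HNSW:

```python
import math

def nearest(query, index, top_k=2):
    """Brute-force cosine-similarity search over (document, vector) pairs."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))
    return sorted(index, key=lambda item: cos(query, item[1]),
                  reverse=True)[:top_k]

# Tiny index of (document, embedding) pairs with invented vectors:
index = [
    ("cats are pets",  [0.9, 0.1]),
    ("dogs are pets",  [0.8, 0.2]),
    ("stocks went up", [0.1, 0.9]),
]
hits = nearest(query=[0.95, 0.05], index=index)
```

In a RAG pipeline, the query vector is the embedded user question and the top hits become the context handed to the language model.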
A learnable parameter in a neural network that determines the strength of connection between neurons. Weights are adjusted during training.
Analogy: Synapse strength in biological brains
Now that you understand the terminology, dive deeper into specific AI topics and start building your knowledge.
Or browse all tutorials to find what interests you most.