Hyperparameter
ELI5 — The Vibe Check
Hyperparameters are the settings you configure BEFORE training starts — as opposed to parameters (weights) which the model learns ON ITS OWN. Learning rate, batch size, number of layers: these are hyperparameters. Getting them right is an art form, and getting them wrong is why your model trains for a week and produces garbage.
Real Talk
Hyperparameters are configuration values set before training that govern the learning process itself, rather than being learned from data. Examples include learning rate, batch size, number of epochs, layer count, hidden dimension, dropout rate, and weight decay. Hyperparameter tuning (e.g., via grid search or Bayesian optimization) is a critical step in ML pipelines.
When You'll Hear This
"Tune the hyperparameters on the validation set." / "Learning rate is the most important hyperparameter."
Related Terms
Batch
A batch is a small group of training examples that the model processes at once before updating its weights.
Epoch
An epoch is one complete pass through your entire training dataset. If you have 100,000 examples, one epoch means the model has seen all 100,000 once.
Overfitting
Overfitting is when your model gets TOO good at the training data and becomes useless on new data.
Parameters
Parameters is the technical word for weights — the individual numbers inside an AI model. When someone says 'GPT-4 has 1.
Training
Training is the long, expensive process where an AI learns from data.