Pre-training
ELI5 — The Vibe Check
Pre-training is the first massive phase where an AI reads basically the entire internet and learns to predict the next word, billions of times over. It typically costs millions of dollars and takes months. The result is a capable base model that understands a lot about the world but hasn't been specialized for anything yet. Think of it as getting your degree before getting a job.
Real Talk
Pre-training is the initial large-scale training phase where a model learns general representations from massive datasets using self-supervised objectives (e.g., next-token prediction, masked language modeling). It produces a foundation model that is subsequently fine-tuned for specific tasks or aligned via RLHF.
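The next-token objective mentioned above can be sketched in a few lines. This is a toy illustration, not how real models are built: the hypothetical `next_token_probs` function stands in for a neural network, and the loss it feeds is the average cross-entropy that pre-training actually minimizes. The key idea is that the text itself supplies the labels, which is what "self-supervised" means.

```python
import math

def next_token_probs(prev_token, vocab):
    """Hypothetical stand-in for a neural net: near-uniform over the
    vocabulary, slightly favoring a repeat of the previous token."""
    weights = {t: (2.0 if t == prev_token else 1.0) for t in vocab}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def pretraining_loss(tokens, vocab):
    """Average negative log-likelihood of each actual next token.
    No human labels needed: each token is the target for the one before it."""
    losses = []
    for prev, target in zip(tokens, tokens[1:]):
        probs = next_token_probs(prev, vocab)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

vocab = ["the", "cat", "sat"]
loss = pretraining_loss(["the", "cat", "sat", "the"], vocab)
print(round(loss, 3))
```

Training a real model means adjusting the network's weights to push this average loss down across trillions of tokens; masked language modeling is the same idea with blanked-out tokens as the targets instead of the next one.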
When You'll Hear This
"Pre-training GPT-4 cost tens of millions." / "The pre-trained model is the starting point for fine-tuning."
Related Terms
Fine-tuning
Fine-tuning is like taking a smart graduate student who knows everything and then sending them to a specialist bootcamp.
LLM (Large Language Model)
An LLM is a humongous AI that read basically the entire internet and learned to predict what words come next, really really well.
Training
Training is the long, expensive process where an AI learns from data.
Transfer Learning
Transfer Learning is using knowledge a model already has from one task to help it with a different task.
Weights
Weights are the numbers inside a neural network that determine what it knows and how it behaves — they're the AI's 'brain cells'.