Temperature
ELI5 — The Vibe Check
Temperature controls how creative (or chaotic) an AI's responses are. Low temperature (like 0.1) makes it boring, safe, and predictable — great for code. High temperature (like 1.5) makes it creative, surprising, and occasionally unhinged — great for poetry. It's the AI's personality dial.
Real Talk
Temperature is a hyperparameter that divides the logits before the softmax in token sampling. A temperature of 0 (in practice, greedy decoding) makes the model deterministic: it always picks the highest-probability token. Higher temperatures flatten the probability distribution, making lower-probability tokens more likely to be sampled, which increases diversity and creativity.
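A minimal sketch of that scaling, using made-up toy logits (the numbers are just illustrative, not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature, then apply a numerically
    # stable softmax (subtract the max before exponentiating).
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens

low = softmax_with_temperature(logits, 0.2)   # sharpened: top token dominates
high = softmax_with_temperature(logits, 2.0)  # flattened: closer to uniform
```

With temperature 0.2 the top token gets nearly all the probability mass; at 2.0 the three options end up much closer together, so sampling picks the underdogs more often.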
Show Me The Code
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a haiku about bugs"}],
    temperature=0.9,  # creative mode
)
print(response.choices[0].message.content)
When You'll Hear This
"Set temperature to 0 for deterministic outputs." / "Crank up the temperature if the responses feel too robotic."
Related Terms
Inference
Inference is when the AI actually runs and generates output — as opposed to training, which is when it's learning.
LLM (Large Language Model)
An LLM is a humongous AI that read basically the entire internet and learned to predict what words come next, really really well.
Token
In AI-land, a token is a chunk of text — roughly 3/4 of a word.
Top-k
Top-k limits the AI's word choices to the K most likely options. If K is 50, the AI only picks from the top 50 most probable words for each step.
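That cutoff can be sketched in a few lines; the toy logits here are invented for illustration:

```python
def top_k_filter(logits, k):
    # Keep only the k highest logits; mask the rest to -inf so that
    # a subsequent softmax assigns them zero probability.
    # (Ties at the threshold can keep slightly more than k — fine for a sketch.)
    threshold = sorted(logits, reverse=True)[k - 1]
    return [l if l >= threshold else float("-inf") for l in logits]

logits = [3.2, 1.1, 0.4, -0.5, 2.8]  # hypothetical token scores
filtered = top_k_filter(logits, 2)   # only the two largest survive
```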
Top-p
Top-p (also called nucleus sampling) is another dial that controls how an AI picks its next word: instead of a fixed count like top-k, it samples from the smallest set of tokens whose cumulative probability reaches p.
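A rough sketch of that cumulative cutoff, with a made-up probability list for illustration:

```python
def top_p_filter(probs, p):
    # Walk the tokens from most to least probable, keeping them until
    # their cumulative probability reaches p; zero out everything else.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return [pr if i in kept else 0.0 for i, pr in enumerate(probs)]

probs = [0.5, 0.3, 0.1, 0.07, 0.03]  # hypothetical token probabilities
filtered = top_p_filter(probs, 0.9)  # keeps the first three (0.5 + 0.3 + 0.1)
```

With p = 0.9, the nucleus is the first three tokens; the long tail is dropped regardless of how many tokens it contains, which is the key difference from top-k's fixed count.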