Prompt Pruning
ELI5 — The Vibe Check
Prompt pruning is cutting unnecessary instructions out of a long prompt without hurting quality. Every word costs tokens and attention. Shorter prompts are often better prompts.
Real Talk
Prompt pruning is the iterative removal of redundant or low-value instructions from a prompt while maintaining or improving output quality. It reduces cost and latency, and it limits attention dilution, where important instructions compete with filler for the model's focus. Systematic pruning means removing one instruction at a time, re-running your evals, and keeping the removal only if the score holds. Modern frontier models often perform better with concise prompts than elaborate ones.
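The remove-one-and-measure loop can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: `prune_prompt`, `toy_evaluate`, and the `tolerance` parameter are hypothetical names, and the scoring function is a stand-in for whatever eval suite you actually run against a model.

```python
from typing import Callable, List


def prune_prompt(
    instructions: List[str],
    evaluate: Callable[[List[str]], float],
    tolerance: float = 0.0,
) -> List[str]:
    """Greedy leave-one-out pruning (hypothetical helper).

    Tries dropping each instruction in turn and keeps the removal
    whenever the eval score stays within `tolerance` of the best
    score seen so far.
    """
    kept = list(instructions)
    best = evaluate(kept)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        score = evaluate(candidate)
        if score >= best - tolerance:
            # Removal did not hurt: keep the shorter prompt.
            kept = candidate
            best = max(best, score)
            # Do not advance i; the next instruction slid into slot i.
        else:
            # Removal hurt quality: this instruction stays.
            i += 1
    return kept


def toy_evaluate(instrs: List[str]) -> float:
    """Stand-in eval (assumption): rewards two essential instructions
    and applies a small penalty per instruction for prompt length."""
    essential = {"Answer in JSON.", "Cite sources."}
    hits = sum(1 for s in essential if s in instrs)
    return hits - 0.01 * len(instrs)


instructions = [
    "Answer in JSON.",
    "Be helpful.",
    "Cite sources.",
    "Be helpful and nice.",
]
pruned = prune_prompt(instructions, toy_evaluate)
print(pruned)
```

In practice `evaluate` would join the instructions into a prompt, run it over a fixed test set, and return an aggregate quality score. Real evals are noisy, so a small nonzero `tolerance` (or averaging over several runs) is usually needed before trusting any single removal.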
When You'll Hear This
"Pruned the system prompt from 3000 to 800 tokens — quality went up." / "Before you add more, try prompt pruning."
Related Terms
Prompt Compression
Prompt compression is shrinking a prompt so it fits more context or costs less, without losing meaning.
Prompt Engineering
Prompt engineering is the art of talking to AI so it actually does what you want.
Token Budget
A token budget is the cap on how many tokens a request, session, or user can consume. Like a food budget but for AI.