Model Routing
ELI5 — The Vibe Check
Model routing is dynamically choosing which AI model to call based on task complexity, cost, or latency — the smart switchboard for LLMs. Simple question? Route to Haiku. Complex reasoning? Escalate to Opus. Time-sensitive? Pick the fastest. You're not locked into one model — you're running a strategy that matches task requirements to model capabilities. Like having three employees with different skill levels and knowing which one to call.
Real Talk
Model routing sits in front of an LLM layer and makes dispatch decisions based on classifiers, heuristics, or a lightweight "router model" that evaluates the incoming prompt. OpenRouter, LiteLLM, and RouteLLM are purpose-built routing layers. Organizations use routing to control cost (cheap models for easy tasks), latency (faster models for UX-critical paths), and capability (specialized models for code, math, or multimodal tasks).
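The heuristic flavor of this dispatch layer can be sketched in a few lines. This is a minimal illustration, not any library's API: the keywords, the word-count threshold, and the tier names (borrowed from the Haiku/Sonnet/Opus examples above) are all assumptions you'd tune for your own traffic.

```python
# Minimal heuristic model router: pick a model tier from cheap
# features of the incoming prompt. Keywords, thresholds, and model
# names are illustrative assumptions, not a real routing policy.

REASONING_HINTS = ("prove", "derive", "step by step", "debug", "why")

def route(prompt: str) -> str:
    """Return the model tier to dispatch this prompt to."""
    lowered = prompt.lower()
    # Escalate prompts that ask for multi-step reasoning.
    if any(hint in lowered for hint in REASONING_HINTS):
        return "opus"
    # Long or code-bearing prompts go to the mid tier.
    if len(lowered.split()) > 50 or "```" in prompt:
        return "sonnet"
    # Everything else takes the cheap, fast path.
    return "haiku"

print(route("What is the capital of France?"))                   # haiku
print(route("Prove that the sum of two odd numbers is even."))   # opus
```

Purpose-built layers like RouteLLM replace these hand-written rules with a trained classifier (the "router model" mentioned above), but the shape of the decision is the same: prompt in, model choice out.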
When You'll Hear This
"We implemented model routing — simple queries hit Haiku, complex ones escalate to Sonnet." / "Model routing cut our LLM costs by 60% without touching response quality."
Related Terms
Agent
An AI agent is an LLM that doesn't just answer questions — it takes actions.
LLM (Large Language Model)
An LLM is a humongous AI that read basically the entire internet and learned to predict what words come next, really really well.
Orchestration
Orchestration is the process of automatically managing, coordinating, and scheduling the moving parts of a system — classically where your containers run, and in AI systems, how models, tools, and agents are wired together.