PagerDuty
ELI5 — The Vibe Check
PagerDuty is the tool that wakes you up at 3 AM when production is on fire. It routes alerts to the right on-call person and escalates if nobody responds, so critical issues don't sit unnoticed. It's the alarm system for your production infrastructure.
Real Talk
PagerDuty is an incident management platform that aggregates alerts from monitoring tools, routes them based on schedules and escalation policies, and coordinates incident response. It supports on-call management, incident workflows, status pages, and post-incident analysis.
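To make "routes them based on schedules and escalation policies" concrete, here is a minimal sketch of how an escalation chain might decide who gets paged. It is a toy model, not PagerDuty's implementation: `EscalationLevel`, `responders_paged`, the responder names, and the timeout values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationLevel:
    responder: str          # hypothetical responder name
    timeout_minutes: int    # wait this long for an acknowledgement before escalating


def responders_paged(levels: List[EscalationLevel],
                     ack_after_minutes: Optional[float]) -> List[str]:
    """Return everyone paged before the incident was acknowledged.

    ack_after_minutes is None when nobody ever acknowledges.
    Assumption: an acknowledgement that arrives exactly at the timeout
    counts as too late, so the next level is still paged.
    """
    paged = []
    elapsed = 0  # minutes since the incident was triggered
    for level in levels:
        # Stop escalating if the incident was acknowledged before this level's turn.
        if ack_after_minutes is not None and ack_after_minutes < elapsed:
            break
        paged.append(level.responder)
        elapsed += level.timeout_minutes
    return paged


# Hypothetical policy: primary gets 10 minutes, then the team lead is paged.
policy = [
    EscalationLevel("primary on-call", 10),
    EscalationLevel("team lead", 15),
]

if __name__ == "__main__":
    print(responders_paged(policy, ack_after_minutes=4))
    print(responders_paged(policy, ack_after_minutes=12))
    print(responders_paged(policy, ack_after_minutes=None))
```

With this policy, an acknowledgement at 4 minutes stops at the primary, while one at 12 minutes (or none at all) also pages the team lead.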
When You'll Hear This
"PagerDuty woke the on-call engineer 2 minutes after the database went down." / "Our escalation policy pages the team lead if the primary on-call doesn't acknowledge within 10 minutes."
Related Terms
Alert Routing
Alert Routing decides which alerts go to which people through which channels. Database down? Page the DBA. API slow? Slack the backend team. Disk full?
Escalation Policy
An Escalation Policy defines what happens when the on-call person doesn't respond. Wait 5 minutes, then page the backup.
On-Call Rotation
On-Call Rotation is a schedule where team members take turns being the person who gets woken up when production breaks.
Opsgenie
Opsgenie is PagerDuty's competitor from Atlassian. It covers the same core job: routing alerts, managing on-call schedules, and waking people up.
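The Alert Routing and On-Call Rotation entries above can be sketched in a few lines. This is an illustrative model only: `ROUTING_RULES`, `route_alert`, `on_call_for`, the weekly shift length, the fallback target, and the engineer names are all assumptions, not how PagerDuty or Opsgenie store their configuration.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical routing table: service -> (channel, target).
# The two rules mirror the examples in the Alert Routing entry.
ROUTING_RULES: Dict[str, Tuple[str, str]] = {
    "database": ("page", "dba"),
    "api": ("slack", "backend-team"),
}

# Assumed fallback for services with no explicit rule.
DEFAULT_ROUTE = ("page", "default-on-call")


def route_alert(alert: dict) -> Tuple[str, str]:
    """Pick a channel and target for an alert based on its 'service' field."""
    return ROUTING_RULES.get(alert["service"], DEFAULT_ROUTE)


def on_call_for(day: date, rotation: List[str], start: date,
                shift_days: int = 7) -> str:
    """Return who is on call on `day` in a simple round-robin rotation.

    `start` is the first day of the first shift; each shift lasts `shift_days`.
    """
    if not rotation:
        raise ValueError("rotation must contain at least one person")
    if day < start:
        raise ValueError("day is before the rotation starts")
    shift_index = (day - start).days // shift_days
    return rotation[shift_index % len(rotation)]


if __name__ == "__main__":
    team = ["ana", "ben", "chloe"]  # hypothetical engineers
    print(route_alert({"service": "database", "summary": "primary is down"}))
    print(on_call_for(date(2024, 1, 8), team, start=date(2024, 1, 1)))
```

Keeping routing (which team and channel) separate from rotation (which person on that team today) mirrors how the two terms are defined above: one decides where an alert goes, the other decides who is holding the pager when it arrives.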