Set up Prometheus Alertmanager for production alerting. Write alerting rules based on symptoms, configure routing and receivers, understand grouping, inhibition, and silencing, and learn best practices to avoid alert fatigue.
Understand how alerting works in the Prometheus ecosystem. Prometheus evaluates alerting rules; Alertmanager handles notification routing, deduplication, grouping, and silencing.
cat <<'EOF'
=== ALERTING ARCHITECTURE ===
Prometheus             Alertmanager           Receivers
+-----------+          +---------------+      +--------+
| Alert     |  fires   | Route         |      | Slack  |
| Rules     | -------> | Group         | ---> | Email  |
| (PromQL)  |          | Deduplicate   |      | PagerD |
+-----------+          | Silence       |      | Webhook|
                       | Inhibit       |      +--------+
                       +---------------+
=== ALERT STATES ===
inactive -> Alert condition is false
pending  -> Alert condition is true, waiting for "for" duration
firing   -> Alert has been true for the full "for" duration -> sent to Alertmanager
EOF
Prometheus and Alertmanager have separate responsibilities. Prometheus evaluates PromQL conditions and decides when to fire alerts. Alertmanager decides who to notify, how to group related alerts, when to suppress duplicates, and when to silence known issues. This separation means you can restart Alertmanager without losing alert state in Prometheus, and vice versa.
The command prints the architecture diagram, showing the flow from Prometheus alerting rules through Alertmanager to receivers, followed by the three alert states.
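To make the pending and firing states concrete, here is a minimal sketch of a rule file that uses a "for" clause, written with the same heredoc style as the command above. The file path, group name, alert name, the http_requests_total metric with its status label, and the 5% and 10m thresholds are all illustrative assumptions, not values from this guide; adapt them to your own metrics.

```shell
# Hypothetical output path; override with RULES_FILE=/path/to/rules.yml if needed.
RULES_FILE="${RULES_FILE:-./example_alert_rules.yml}"

# Write a sample rule file. The quoted 'EOF' prevents the shell from
# expanding anything inside the heredoc.
cat > "$RULES_FILE" <<'EOF'
groups:
  - name: example-symptoms          # assumed group name
    rules:
      - alert: HighErrorRate        # assumed alert name
        # Condition evaluated by Prometheus: share of 5xx responses above 5%.
        # http_requests_total and its status label are assumed to exist.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        # While the condition is true but has not yet held for 10m, the alert
        # is "pending". Once it has held for the full 10m it becomes "firing"
        # and is sent to Alertmanager.
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "More than 5% of requests are failing"
EOF

echo "Wrote $RULES_FILE"
```

If promtool is installed, running promtool check rules ./example_alert_rules.yml should validate the file before you load it into Prometheus.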