Learn to define Service Level Indicators (SLIs), set Service Level Objectives (SLOs), and manage error budgets to make data-driven reliability decisions.
An SLI is a quantitative measure of some aspect of the level of service provided. It is the metric that tells you how your service is performing from the user's perspective.
cat <<'EOF'
=== SERVICE LEVEL INDICATORS (SLIs) ===
An SLI is a ratio of good events to total events:
SLI = (good events / total events) * 100%
Common SLI types:
Availability: successful requests / total requests
Latency: requests served in < 300ms / total requests
Correctness: correct responses / total responses
Throughput: served requests / expected capacity
EOF
SLIs measure what users actually experience. Unlike internal metrics such as CPU usage, SLIs directly reflect service quality. The key insight is to express each SLI as a ratio (0-100%), which makes SLIs comparable across services and easy to set thresholds on. Always measure from the user's perspective, not the server's.
You see the SLI formula and common SLI types printed to the terminal.
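To make the ratio concrete, here is a minimal sketch that applies the SLI formula to an availability count and a latency sample. The helper name sli_percent, the request counts, and the latency values are all made up for illustration; in practice the numbers would come from your monitoring system or load balancer logs.

```shell
# sli_percent GOOD TOTAL: prints (GOOD / TOTAL) * 100 with two decimals.
# Hypothetical helper, not part of any standard tool.
sli_percent() {
  awk -v good="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "n/a"; exit }   # no traffic: the ratio is undefined
    printf "%.2f\n", (good / total) * 100
  }'
}

# Availability: successful requests / total requests (made-up counts)
echo "Availability SLI: $(sli_percent 9985 10000)%"

# Latency: requests served in < 300ms / total requests (made-up latencies in ms)
latencies="120 95 310 280 450 150 299 300"
fast=$(printf '%s\n' $latencies | awk '$1 < 300 { n++ } END { print n + 0 }')
total=$(printf '%s\n' $latencies | wc -l | tr -d ' ')
echo "Latency SLI: $(sli_percent "$fast" "$total")%"
```

With these sample values the script prints an availability SLI of 99.85% and a latency SLI of 62.50% (5 of 8 requests finished in under 300ms). Both results are on the same 0-100% scale, which is what lets you compare them and set a threshold on each.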