Status pages seem simple: show whether your service is up or down. But a status page that users trust requires more than a green checkmark. It requires honesty, timing, and consistency.
Why Status Pages Matter
A good status page:
- Reduces support tickets — Users check status before emailing you
- Builds trust — Transparency shows you take reliability seriously
- Documents history — Public incident log shows patterns
- Sets expectations — Users know what's normal
What to Include on Your Status Page
Essential Elements
- Overall status — Operational / Degraded / Outage
- Component status — API, Website, Database, etc.
- Active incidents — What's broken right now
- Recent incidents — History of the past 30-90 days
- Uptime percentage — 30-day or 90-day average
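The uptime percentage is typically the share of the reporting window that was not covered by downtime. Below is a minimal sketch of that arithmetic, assuming downtime is recorded as non-overlapping (start, end) intervals. The function name `uptime_percent` and the sample dates are illustrative, not taken from any particular tool.

```python
from datetime import datetime, timedelta, timezone


def uptime_percent(window_start, window_end, outages):
    """Return uptime as a percentage of the window.

    `outages` is a list of (start, end) datetime pairs. They are assumed
    not to overlap each other; each one is clipped to the window.
    """
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window_end must be after window_start")
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (window - down) / window


# Hypothetical example: one 30-minute outage in a 30-day window.
end = datetime(2024, 5, 31, tzinfo=timezone.utc)
start = end - timedelta(days=30)
outage = (
    datetime(2024, 5, 10, 14, 5, tzinfo=timezone.utc),
    datetime(2024, 5, 10, 14, 35, tzinfo=timezone.utc),
)
print(round(uptime_percent(start, end, [outage]), 3))  # 99.931
```

A single 30-minute incident already takes a 30-day figure below 99.95%, which is worth knowing before you publish a target.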
Nice-to-Have Elements
- Scheduled maintenance — Upcoming maintenance windows
- Subscribe to updates — Email, SMS, RSS, webhook
- Response time graph — Latency over time
- Regional status — If you have multiple regions
Status States That Mean Something
Use clear, unambiguous status states:
| Status | What It Means |
|---|---|
| Operational | Everything works normally. No known issues. |
| Degraded | Working but slower or with intermittent errors. |
| Outage | Service unavailable or major functionality broken. |
| Maintenance | Scheduled maintenance in progress. |
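One common way to derive the overall status from component statuses is to show the worst state any component is in. The sketch below assumes that rule and assumes a severity order in which Maintenance ranks below Degraded; both are illustrative choices, not something every status page tool does.

```python
from enum import IntEnum


class Status(IntEnum):
    # Ordered by severity so that max() picks the worst state.
    # Ranking MAINTENANCE below DEGRADED is an assumption.
    OPERATIONAL = 0
    MAINTENANCE = 1
    DEGRADED = 2
    OUTAGE = 3


def overall_status(components):
    """Return the worst status among components (dict of name -> Status)."""
    if not components:
        return Status.OPERATIONAL
    return max(components.values())


components = {
    "API": Status.DEGRADED,
    "Website": Status.OPERATIONAL,
    "Database": Status.OPERATIONAL,
}
print(overall_status(components).name)  # DEGRADED
```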
When to Update Your Status Page
Immediate Triggers
- Complete outage (users can't access service)
- Major feature broken (can't log in, can't pay)
- Significant degradation (>2x normal latency)
Within 15 Minutes
- Partial issues affecting a subset of users
- Third-party dependency problems
- Errors above normal threshold
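The two trigger lists above can be encoded as a small rule that tells the on-call engineer how long they have before the first post. This is a sketch only: the `Signal` fields are hypothetical stand-ins for whatever your monitoring reports, and the thresholds simply mirror the lists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    """Hypothetical snapshot of what monitoring currently reports."""
    service_reachable: bool = True
    major_feature_broken: bool = False
    latency_ms: float = 0.0
    baseline_latency_ms: float = 1.0
    partial_user_impact: bool = False
    dependency_problem: bool = False
    error_rate: float = 0.0
    normal_error_rate: float = 0.01


def posting_deadline_minutes(s: Signal) -> Optional[int]:
    """Minutes allowed before the first post, or None if no post is needed."""
    if (not s.service_reachable
            or s.major_feature_broken
            or s.latency_ms > 2 * s.baseline_latency_ms):
        return 0  # immediate trigger
    if (s.partial_user_impact
            or s.dependency_problem
            or s.error_rate > s.normal_error_rate):
        return 15
    return None


print(posting_deadline_minutes(Signal(latency_ms=450, baseline_latency_ms=200)))  # 0
print(posting_deadline_minutes(Signal(dependency_problem=True)))  # 15
```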
During Incidents
- Update every 30-60 minutes even if nothing has changed
- Update immediately when status changes
- Update when you find the root cause or have an ETA
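The update cadence is easy to forget mid-incident, so it can help to have a bot or dashboard flag a stale incident. A minimal sketch, assuming you store the timestamp of the last public update; the 60-minute limit is the upper end of the cadence above, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

MAX_SILENCE = timedelta(minutes=60)  # upper end of the 30-60 minute cadence


def next_update_due(last_update, max_silence=MAX_SILENCE):
    """Latest time by which the next public update should be posted."""
    return last_update + max_silence


def is_overdue(last_update, now, max_silence=MAX_SILENCE):
    """True if the incident has gone longer than max_silence without an update."""
    return now > next_update_due(last_update, max_silence)


last = datetime(2024, 5, 10, 14, 5, tzinfo=timezone.utc)
print(is_overdue(last, datetime(2024, 5, 10, 14, 50, tzinfo=timezone.utc)))  # False
print(is_overdue(last, datetime(2024, 5, 10, 15, 10, tzinfo=timezone.utc)))  # True
```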
How to Write Incident Updates
The Structure
## [Time] - [Status]
**Impact:** [What users are experiencing]
**Cause:** [What's causing it, if known]
**Action:** [What we're doing about it]
**Next Update:** [When to expect more information]
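If updates are posted through a script or chat bot, the structure above can be enforced in code so that no field gets skipped. The helper below is a hypothetical sketch; `format_update` is not part of any status page API, and it only produces the text.

```python
def format_update(time_utc, status, impact, cause, action, next_update):
    """Render one incident update using the four-field structure.

    `cause` may be None or empty while the investigation is ongoing.
    """
    for name, value in (("impact", impact), ("action", action),
                        ("next_update", next_update)):
        if not value:
            raise ValueError(f"{name} is required in every update")
    return "\n".join([
        f"## {time_utc} - {status}",
        f"**Impact:** {impact}",
        f"**Cause:** {cause or 'Under investigation.'}",
        f"**Action:** {action}",
        f"**Next Update:** {next_update}",
    ])


update = format_update(
    time_utc="14:05 UTC",
    status="Investigating",
    impact="Some users are experiencing slow API responses.",
    cause=None,
    action="We're examining database performance metrics.",
    next_update="Within 30 minutes or as we learn more.",
)
print(update)
```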
Example Updates
Initial detection:
## 14:05 UTC - Investigating
**Impact:** Some users are experiencing slow API responses.
**Cause:** Under investigation.
**Action:** We're examining database performance metrics.
**Next Update:** Within 30 minutes or as we learn more.
Root cause found:
## 14:22 UTC - Identified
**Impact:** API latency is 5-10x normal for approximately 30% of requests.
**Cause:** A database query introduced in the 13:45 deployment is causing lock contention.
**Action:** We're rolling back the deployment now.
**Next Update:** Within 15 minutes.
Resolved:
## 14:35 UTC - Resolved
**Impact:** None. Service is operating normally.
**Cause:** Database lock contention from recent deployment.
**Action:** Rollback completed. We're reviewing our deployment process to prevent recurrence.
**Duration:** 30 minutes (14:05 - 14:35 UTC)
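The duration in the resolution note is simple arithmetic, but it is easy to get wrong by hand under pressure. A small sketch that derives it from the timestamps in the example updates, assuming all times are UTC on the same day; `minutes_between` is an illustrative helper.

```python
from datetime import datetime


def minutes_between(start_hhmm, end_hhmm):
    """Minutes from start to end; both are 'HH:MM' UTC on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end_hhmm, fmt) - datetime.strptime(start_hhmm, fmt)
    if delta.total_seconds() < 0:
        raise ValueError("end is before start; use full dates across midnight")
    return int(delta.total_seconds() // 60)


timeline = {"detected": "14:05", "identified": "14:22", "resolved": "14:35"}
time_to_identify = minutes_between(timeline["detected"], timeline["identified"])
duration = minutes_between(timeline["detected"], timeline["resolved"])
print(f"Identified after {time_to_identify} minutes, resolved after {duration} minutes")
```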
Common Status Page Mistakes
Mistake 1: "All Systems Operational" During Outages
Why it happens: The status page isn't connected to monitoring, and manual updates get forgotten.
Fix: Automate status updates from monitoring, or set a rule: post an update within 5 minutes of any SEV-1 or SEV-2 incident.
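Automating status from monitoring can start as a simple mapping from recent probe results to a status label. The sketch below assumes an external check that records success or failure per probe. The three-probe window and the rule "all failed means Outage, some failed means Degraded" are illustrative assumptions to tune against your own false-alarm tolerance.

```python
def status_from_checks(results, window=3):
    """Map recent probe results to a status label.

    `results` is a list of booleans (True = probe succeeded), newest last.
    Only the last `window` results are considered.
    """
    recent = results[-window:]
    if not recent:
        raise ValueError("no probe results available")
    failures = recent.count(False)
    if failures == len(recent):
        return "Outage"
    if failures > 0:
        return "Degraded"
    return "Operational"


print(status_from_checks([True, True, False, False, False]))  # Outage
print(status_from_checks([True, True, True, False, True]))    # Degraded
print(status_from_checks([False, True, True, True, True]))    # Operational
```

Requiring several consecutive failures before declaring an outage trades a little detection speed for fewer false alarms from a single dropped probe.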
Mistake 2: Vague Incident Descriptions
Why it happens: The team is trying to hide details, or doesn't know the cause yet.
Fix: Say what you know. "Investigating reports of login issues" is better than "Investigating issues."
Mistake 3: Going Dark During Incidents
Why it happens: Everyone is busy fixing the problem and forgets to post updates.
Fix: Assign someone to communications during incidents. Update every 30-60 minutes minimum.
Mistake 4: Hiding Incident History
Why it happens: The team doesn't want to show how often things break.
Fix: Show 30-90 days of history. Everyone has incidents. Hiding them makes you look dishonest.
Mistake 5: Jargon and Internal Terms
Why it happens: Engineers writing for engineers.
Fix: Write for your users. Say "Login issues", not "Auth service 502 errors."
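A lightweight check can catch some jargon before an update goes public. This is a rough sketch, not a substitute for a human read-through: the `INTERNAL_TERMS` list is hypothetical and would need to reflect your own team's vocabulary, and the HTTP status code pattern will also flag harmless numbers such as "500 users".

```python
import re

# Hypothetical list of internal terms; replace with your team's own jargon.
INTERNAL_TERMS = {"auth service", "sev-1", "sev-2", "k8s"}
# Three-digit numbers that look like HTTP 4xx/5xx status codes.
HTTP_CODE = re.compile(r"\b[45]\d{2}\b")


def jargon_found(text):
    """Return internal terms and HTTP-style codes found in an update draft."""
    lowered = text.lower()
    hits = sorted(term for term in INTERNAL_TERMS if term in lowered)
    hits += HTTP_CODE.findall(text)
    return hits


print(jargon_found("Auth service 502 errors"))  # ['auth service', '502']
print(jargon_found("Login issues"))             # []
```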
Status Page Checklist
Setup
- ☐ Status page on separate domain/infrastructure
- ☐ Components match what users care about
- ☐ Clear status states with definitions
- ☐ Incident history visible
- ☐ Subscribe option (email/RSS)
During Incidents
- ☐ Post the first update within 5 minutes of detection (within 15 minutes for partial issues)
- ☐ Be specific about impact
- ☐ Update every 30-60 minutes
- ☐ Post resolution with summary
- ☐ Include duration in resolution
Post-Incident
- ☐ Update final status with cause
- ☐ Link to post-mortem if public
- ☐ Verify uptime calculations are correct
Monitor Your Services, Update Your Status
OpsPulse provides external uptime monitoring. Know when your services are down so you can update your status page before users complain.
Status Page Services
Don't build your own from scratch. Use a dedicated service or an existing open-source tool:
- Atlassian Statuspage — Most popular, feature-rich
- Instatus — Simple, fast, affordable
- cState — Open source, self-hosted
- Status.io — Enterprise features
- GitHub Pages + static site generator — Free, DIY
Summary
A trustworthy status page:
- Is accurate — Don't show green when users see red
- Updates quickly — Within 5 minutes of issues
- Communicates regularly — Every 30-60 minutes during incidents
- Uses plain language — Users understand what's broken
- Shows history — Past incidents build credibility
- Is always available — Hosted separately from your service
Build trust through transparency, even when things break. Especially when things break.