The most dangerous time for your application is during a deployment. New code is running. Old assumptions might not hold. Users are hitting features that haven't been tested at scale.
Here's how to monitor deployments so you catch issues before users do.
Why Deployment Monitoring Is Different
Normal monitoring tells you if your app is up or down. Deployment monitoring needs to tell you:
- Is the new version behaving? — Not just "is it running"
- Are error rates changing? — Relative to the old version
- Is performance degrading? — Compared to baseline
- Should I roll back? — Based on real signals
Pre-Deployment Checklist
Before You Deploy
- ☐ Baseline metrics recorded (error rate, latency, throughput)
- ☐ Monitoring dashboards open and visible
- ☐ Rollback procedure documented and tested
- ☐ Feature flags configured (if applicable)
- ☐ Team notified of deployment window
- ☐ On-call person aware and available
Metrics to Baseline
| Metric | Why It Matters |
|---|---|
| Error rate (5xx) | Immediate signal of problems |
| p99 latency | Performance degradation |
| Request throughput | Sudden drops indicate issues |
| Database connections | Connection leaks show up fast |
| Memory usage | Leaks take longer but start here |
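As a rough illustration of recording the first three metrics in the table, the sketch below summarizes a window of request samples taken before the deploy. The `Baseline` class, the `record_baseline` function, and the `(status_code, latency_ms)` sample format are assumptions made for this example; in practice these numbers would usually come from your metrics system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    error_rate: float       # fraction of requests that returned 5xx
    p99_latency_ms: float   # 99th percentile latency in milliseconds
    throughput_rps: float   # requests per second over the window


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def record_baseline(requests, window_seconds):
    """Summarize (status_code, latency_ms) samples from the pre-deploy window."""
    if not requests:
        raise ValueError("need at least one request sample to build a baseline")
    errors = sum(1 for status, _ in requests if 500 <= status <= 599)
    latencies = [latency for _, latency in requests]
    return Baseline(
        error_rate=errors / len(requests),
        p99_latency_ms=percentile(latencies, 99),
        throughput_rps=len(requests) / window_seconds,
    )


# Hypothetical sample: 100 requests observed over 50 seconds.
samples = [(200, 120.0)] * 98 + [(500, 900.0), (200, 450.0)]
baseline = record_baseline(samples, window_seconds=50)
```

Storing the result alongside the release makes the later "compare to baseline" checks concrete rather than a matter of memory.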
Deployment Strategies
1. Blue-Green Deployment
Run two identical environments. Switch traffic from old (blue) to new (green).
- Monitoring: Watch green environment before switching traffic
- Rollback: Switch traffic back to blue (seconds)
- Risk: Low, but requires 2x infrastructure
2. Canary Deployment
Route a small percentage of traffic to the new version, then increase it gradually.
- Monitoring: Compare canary metrics to baseline and to the old version still serving the rest of the traffic
- Rollback: Route traffic back to old version
- Risk: Medium, some users see new version first
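One way to turn the canary comparison into a decision is sketched below. The function name `canary_verdict`, the 2x ratio, the minimum request count, and the 0.1% floor are all illustrative assumptions, not recommended values; tune them to your own traffic.

```python
def canary_verdict(stable_errors, stable_total, canary_errors, canary_total,
                   max_ratio=2.0, min_requests=500):
    """Return "wait", "promote", or "rollback" for one evaluation window."""
    if canary_total < min_requests or stable_total < min_requests:
        return "wait"  # too little traffic to judge either side
    stable_rate = stable_errors / stable_total
    canary_rate = canary_errors / canary_total
    # Floor the stable rate so that a zero-error stable version does not
    # make a single canary error look like an unbounded regression.
    allowed = max(stable_rate, 0.001) * max_ratio
    return "rollback" if canary_rate > allowed else "promote"
```

Comparing against the old version over the same window, rather than only against a stored baseline, helps filter out problems that affect both versions, such as an upstream outage.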
3. Rolling Deployment
Replace instances gradually, keeping some on the old version until the rollout completes.
- Monitoring: Watch each instance as it's replaced
- Rollback: Redeploy old version (minutes)
- Risk: Medium, mixed versions running simultaneously
What to Monitor During Deployment
Immediate Signals (0-5 minutes)
- HTTP 5xx rate: any sustained increase over baseline is a red flag
- Startup errors: check application logs during boot
- Health check failures: new instances not passing checks
- Database errors: failed connections, or queries failing on a schema mismatch
Short-Term Signals (5-15 minutes)
- Latency changes — p50, p95, p99 compared to baseline
- Error rate trends — Is it getting worse?
- User-facing errors — Check error tracking (Sentry, etc.)
- Business metrics — Signups, purchases, conversions
Medium-Term Signals (15-60 minutes)
- Memory growth — Slow leaks take time to surface
- Background job failures — Queues, workers, cron jobs
- External API changes — Rate limits, timeouts
- Customer support tickets — Users noticing problems
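Slow memory leaks are easier to spot as a trend than as a single reading. The sketch below fits a least-squares line through memory samples taken after the deploy and flags steady growth. The function names and the 1 MB per minute limit are assumptions for illustration only.

```python
def growth_per_minute(samples):
    """Least-squares slope of (minutes_since_deploy, memory_mb) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("samples must span more than one point in time")
    return covariance / variance


def looks_like_leak(samples, max_mb_per_minute=1.0):
    """True when memory is climbing faster than the allowed rate."""
    return growth_per_minute(samples) > max_mb_per_minute
```

For example, `looks_like_leak([(0, 500), (10, 520), (20, 540), (30, 560)])` reports a leak, because memory grows by about 2 MB per minute across the window.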
Automated Rollback Triggers
Don't rely on humans watching dashboards. Set up automated rollback triggers:
# Example rollback conditions (baseline_* values are recorded before the deploy)
if error_rate > baseline_error_rate * 2: rollback()
if p99_latency > baseline_p99_latency * 3: rollback()
if health_check_failures > 3: rollback()
if database_connections > max_database_connections * 0.9: rollback()
Rollback Trigger Best Practices
- Use sustained thresholds — Single spikes shouldn't trigger rollback
- Compare to baseline — Not absolute thresholds
- Log all rollbacks — For post-mortem analysis
- Test rollback procedure — Before you need it
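The first three practices can be combined in a small trigger object. This is a minimal sketch, assuming metrics are polled at a fixed interval. The class name `SustainedTrigger`, the logger name, and the default of three consecutive breaches are hypothetical choices.

```python
import logging
from collections import deque

logger = logging.getLogger("deploy.rollback")


class SustainedTrigger:
    """Fires only when a metric exceeds baseline * multiplier N checks in a row."""

    def __init__(self, name, baseline, multiplier, required_breaches=3):
        self.name = name
        self.limit = baseline * multiplier  # relative to baseline, not absolute
        self.recent = deque(maxlen=required_breaches)

    def observe(self, value):
        """Record one reading; return True when a rollback should start."""
        self.recent.append(value > self.limit)
        fired = len(self.recent) == self.recent.maxlen and all(self.recent)
        if fired:
            # Log the evidence so the post-mortem has it.
            logger.warning("rollback trigger %s: value=%s limit=%s",
                           self.name, value, self.limit)
        return fired


# A single spike (first reading) does not fire; three breaches in a row do.
trigger = SustainedTrigger("error_rate", baseline=0.01, multiplier=2)
decisions = [trigger.observe(v) for v in (0.05, 0.01, 0.03, 0.04, 0.05)]
# decisions == [False, False, False, False, True]
```

Requiring consecutive breaches trades a little detection speed for fewer false rollbacks. The polling interval multiplied by `required_breaches` gives the worst-case delay before a rollback starts.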
Feature Flags as Safety Valves
Feature flags let you disable features without rolling back the entire deployment.
When to Use Feature Flags
- New features — Can be turned off if problematic
- Performance changes — A/B test with rollback option
- Third-party integrations — Disable if provider has issues
- High-risk changes — Database schema, algorithm changes
Feature Flag Monitoring
- Track which flags are enabled
- Monitor error rates by flag state
- Alert on flag-related errors
- Have one-click disable for each flag
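The monitoring points above could look something like the following in-process sketch. It does not use any real feature flag product: `FlagMonitor`, the flag names, and the thresholds are all hypothetical, and a real setup would read flag state from your flag service.

```python
from collections import defaultdict


class FlagMonitor:
    """Tracks request outcomes per (flag, state) and disables misbehaving flags."""

    def __init__(self, flags, max_error_rate=0.05, min_requests=100):
        self.flags = dict(flags)  # flag name -> currently enabled?
        self.max_error_rate = max_error_rate
        self.min_requests = min_requests
        # (flag, state) -> [errors, total]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, flag, is_error):
        """Count one request against the flag's current state."""
        bucket = self.counts[(flag, self.flags[flag])]
        bucket[0] += int(is_error)
        bucket[1] += 1

    def error_rate(self, flag, state):
        errors, total = self.counts[(flag, state)]
        return errors / total if total else 0.0

    def disable(self, flag):
        """The single-call kill switch for one flag."""
        self.flags[flag] = False

    def enforce(self):
        """Disable enabled flags with too many errors; return their names."""
        tripped = []
        for flag, enabled in self.flags.items():
            total = self.counts[(flag, True)][1]
            too_many = self.error_rate(flag, True) > self.max_error_rate
            if enabled and total >= self.min_requests and too_many:
                self.disable(flag)
                tripped.append(flag)
        return tripped
```

Keeping separate counts for the enabled and disabled states makes it possible to tell whether errors follow the flag or affect the whole deployment.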
Post-Deployment Verification
Smoke Tests
Run automated tests against production after deployment:
- Key user journeys (signup, purchase, login)
- Critical API endpoints
- Third-party integrations
- Background job processing
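A smoke test runner can be as small as the sketch below, which uses only the Python standard library. The check names and paths in the usage note are placeholders, not endpoints from any real application. Full user journeys such as signup or purchase would need a browser or API test tool on top of this.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status for a GET, or None if the request failed outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with a 4xx or 5xx
    except (urllib.error.URLError, OSError):
        return None      # DNS failure, refused connection, or timeout


def run_smoke_tests(base_url, checks, fetch=http_status):
    """Run each (name, path, expected_status) check; return the failures."""
    failures = []
    for name, path, expected in checks:
        actual = fetch(base_url + path)
        if actual != expected:
            failures.append((name, expected, actual))
    return failures
```

A deploy script might call `run_smoke_tests("https://example.com", [("health", "/healthz", 200), ("login page", "/login", 200)])` and stop the rollout if the returned list is not empty. The `fetch` parameter exists so the runner can be tested without network access.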
Manual Checks
For high-risk deployments, manually verify:
- Load the homepage
- Complete a test transaction
- Check admin dashboard
- Verify email delivery
- Test mobile app (if applicable)
Deployment Monitoring Checklist
Before
- ☐ Record baseline metrics
- ☐ Open monitoring dashboards
- ☐ Verify rollback procedure works
- ☐ Configure feature flags
- ☐ Notify team
During
- ☐ Watch error rates (real-time)
- ☐ Monitor latency percentiles
- ☐ Check application logs
- ☐ Verify health checks passing
- ☐ Run smoke tests
After (First Hour)
- ☐ Compare metrics to baseline
- ☐ Check error tracking for new issues
- ☐ Verify background jobs processing
- ☐ Monitor business metrics
- ☐ Check customer support channels
After (24 Hours)
- ☐ Review for slow-burn issues (memory leaks)
- ☐ Check user feedback
- ☐ Update runbooks if needed
- ☐ Archive deployment logs
Monitor Your Deployments with Confidence
OpsPulse gives you external uptime monitoring that works alongside your deployment tools. Know immediately if your new version breaks something.
Summary
Effective deployment monitoring:
- Record baselines — Know your normal before you change it
- Watch immediately — First 15 minutes are critical
- Automate rollback — Don't rely on humans watching dashboards
- Use feature flags — Granular control over new features
- Verify post-deploy — Automated smoke tests + manual checks
Deployments are when things break. Monitor accordingly.