Webhooks are the invisible glue of modern applications. They connect payment processors to your database, sync CRM data, trigger deployments, and power countless integrations. But when webhooks fail silently, the systems that depend on them drift out of sync, and you often don't find out until users complain.
Here's how to monitor webhooks properly and catch failures before they cascade.
Why Webhook Failures Are Dangerous
Unlike a direct API call, where you see the response right away, a webhook is delivered asynchronously: you send it and hope for the best. This creates several problems:
- No immediate feedback — A failed webhook doesn't block your code
- Silent data loss — Payment confirmations, user signups, or critical events can disappear
- Hard to debug — The failure happened somewhere else, possibly hours ago
- User-facing impact — Orders show as unpaid, accounts aren't provisioned, syncs break
How Webhooks Fail
Understanding failure modes helps you monitor for them:
1. Network Failures
- DNS resolution issues
- Connection timeouts
- SSL/TLS handshake failures
- Firewall blocks
2. Server-Side Issues
- 500 errors from the receiving application
- Endpoint returning invalid responses
- Rate limiting (429 errors)
- Service temporarily down
3. Configuration Problems
- Wrong endpoint URL
- Changed URL without updating webhook config
- Authentication failures
- Signature verification failures
4. Payload Issues
- Malformed JSON
- Missing required fields
- Payload too large
- Encoding problems
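The four categories above can be attached to each logged failure so that dashboards and alerts can group by cause. The following is a rough sketch only: `classifyFailure`, the category strings, and the mapping from error codes and status codes to categories are assumptions made for illustration, and a given receiver may signal the same problem differently.

```javascript
// Hypothetical helper: maps one delivery outcome to a failure category.
// The error codes are standard Node.js system error codes; the mapping
// itself is an illustrative assumption.
const NETWORK_ERROR_CODES = new Set([
  'ENOTFOUND',    // DNS resolution failed
  'ETIMEDOUT',    // connection timed out
  'ECONNREFUSED', // nothing listening, or blocked
  'ECONNRESET',   // connection dropped mid-request
]);

const classifyFailure = ({ errorCode, statusCode }) => {
  // No HTTP response at all: the request never completed.
  if (statusCode === undefined) {
    return NETWORK_ERROR_CODES.has(errorCode) ? 'network' : 'unknown';
  }
  if (statusCode >= 500 || statusCode === 429) return 'server';
  if ([401, 403, 404, 410].includes(statusCode)) return 'configuration';
  if ([400, 413, 415, 422].includes(statusCode)) return 'payload';
  if (statusCode >= 200 && statusCode < 300) return 'none';
  return 'unknown';
};

console.log(classifyFailure({ errorCode: 'ENOTFOUND' })); // network
console.log(classifyFailure({ statusCode: 429 }));        // server
console.log(classifyFailure({ statusCode: 413 }));        // payload
```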
Webhook Monitoring Strategy
Effective webhook monitoring covers three areas:
1. Delivery Monitoring
Track whether webhooks reach their destination:
- HTTP response codes (2xx, 4xx, 5xx)
- Response time
- Retry attempts and outcomes
- Final success/failure status
2. Content Monitoring
Verify webhook payloads are correct:
- Required fields present
- Valid JSON structure
- Signature verification
- Timestamp freshness (replay attack prevention)
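The content checks above could be combined into one function on the receiving side. This sketch assumes a signing scheme of HMAC-SHA256 over `timestamp.rawBody` with a hex-encoded signature and a 5-minute tolerance; real providers each define their own scheme, so `sign`, `checkWebhook`, the reason strings, and `MAX_AGE_SECONDS` are all hypothetical.

```javascript
const crypto = require('node:crypto');

const MAX_AGE_SECONDS = 300; // assumed 5-minute replay window

// Assumed signing scheme: hex(HMAC-SHA256(secret, `${timestamp}.${rawBody}`)),
// with the timestamp in Unix seconds.
const sign = (secret, timestamp, rawBody) =>
  crypto.createHmac('sha256', secret).update(`${timestamp}.${rawBody}`).digest('hex');

const checkWebhook = ({ secret, timestamp, rawBody, signature, requiredFields, now = Date.now() }) => {
  // Timestamp freshness: reject anything outside the replay window.
  const ageSeconds = Math.abs(now / 1000 - Number(timestamp));
  if (!Number.isFinite(ageSeconds) || ageSeconds > MAX_AGE_SECONDS) {
    return { ok: false, reason: 'stale_timestamp' };
  }
  // Signature: compare in constant time. timingSafeEqual throws on a
  // length mismatch, so lengths are compared first.
  const expected = Buffer.from(sign(secret, timestamp, rawBody), 'hex');
  const received = Buffer.from(String(signature), 'hex');
  if (expected.length !== received.length || !crypto.timingSafeEqual(expected, received)) {
    return { ok: false, reason: 'bad_signature' };
  }
  // Structure: the body must parse as a JSON object.
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { ok: false, reason: 'invalid_json' };
  }
  if (payload === null || typeof payload !== 'object') {
    return { ok: false, reason: 'invalid_json' };
  }
  // Required fields must all be present.
  const missing = requiredFields.filter((field) => !(field in payload));
  if (missing.length > 0) return { ok: false, reason: 'missing_fields', missing };
  return { ok: true, payload };
};
```

Counting each `reason` value over time gives the pass rates to monitor. The signature is checked against the raw request body, before any JSON parsing, because re-serialized JSON may not match the bytes that were signed.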
3. Business Logic Monitoring
Confirm webhooks achieve their purpose:
- Payment webhooks → order status updated
- Signup webhooks → account created
- Sync webhooks → data reflected in database
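One way to check the first of these outcomes is a periodic reconciliation job that compares delivered webhooks against the state they were supposed to produce. The sketch below works on in-memory data and assumes each log record carries an `order_id` field and that a paid order has `status: 'paid'`; both are hypothetical, as is the 5-minute grace period.

```javascript
// Hypothetical reconciliation check: returns payment webhooks that were
// delivered more than `graceMs` ago but whose order is still not paid.
// In practice the inputs would come from the webhook log and the orders table.
const findUnreconciled = (webhookLogs, ordersById, now, graceMs = 5 * 60 * 1000) =>
  webhookLogs.filter((log) => {
    if (log.event_type !== 'payment.completed' || log.status !== 'delivered') return false;
    // Give the handler time to finish before flagging anything.
    if (now - Date.parse(log.timestamp) < graceMs) return false;
    const order = ordersById.get(log.order_id); // order_id is an assumed field
    return !order || order.status !== 'paid';
  });

const logs = [
  { event_type: 'payment.completed', status: 'delivered', timestamp: '2026-03-19T23:00:00Z', order_id: 'o1' },
  { event_type: 'payment.completed', status: 'delivered', timestamp: '2026-03-19T23:00:00Z', order_id: 'o2' },
];
const orders = new Map([
  ['o1', { status: 'paid' }],
  ['o2', { status: 'pending' }],
]);
const stuck = findUnreconciled(logs, orders, Date.parse('2026-03-19T23:10:00Z'));
console.log(stuck.map((log) => log.order_id)); // [ 'o2' ]
```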
Implementing Webhook Monitoring
Step 1: Log All Webhook Activity
Record every webhook event with full details:
{
  "webhook_id": "wh_abc123",
  "timestamp": "2026-03-19T23:00:00Z",
  "event_type": "payment.completed",
  "endpoint": "https://api.example.com/webhooks/stripe",
  "payload_size": 1247,
  "response_code": 200,
  "response_time_ms": 145,
  "attempt": 1,
  "max_attempts": 5,
  "status": "delivered"
}
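A record like this could be assembled once per delivery attempt. The helper below is a sketch under assumptions: `buildLogRecord` and its parameter names are hypothetical, and the `retrying` and `failed` status values are invented here, since the example record only shows `delivered`.

```javascript
// Hypothetical helper: builds one log record per delivery attempt.
// startedAt and finishedAt are epoch milliseconds; responseCode is
// undefined when the request failed before any HTTP response arrived.
const buildLogRecord = ({
  webhookId, eventType, endpoint, payload,
  responseCode, startedAt, finishedAt, attempt, maxAttempts, willRetry,
}) => {
  const delivered = responseCode >= 200 && responseCode < 300;
  let status = 'failed'; // assumed value for a permanent failure
  if (delivered) status = 'delivered';
  else if (willRetry && attempt < maxAttempts) status = 'retrying'; // assumed value
  return {
    webhook_id: webhookId,
    timestamp: new Date(startedAt).toISOString(),
    event_type: eventType,
    endpoint,
    payload_size: Buffer.byteLength(JSON.stringify(payload)), // bytes, UTF-8
    response_code: responseCode ?? null,
    response_time_ms: finishedAt - startedAt,
    attempt,
    max_attempts: maxAttempts,
    status,
  };
};
```

Writing the record for successful attempts as well as failed ones is what makes success rates computable later.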
Step 2: Track Success Rates
Monitor delivery success over time to spot trends:
| Metric | Target | Alert Threshold |
|---|---|---|
| Delivery success rate | >99% | <95% |
| Response time (p95) | <500ms | >2000ms |
| Retry rate | <5% | >10% |
| Failed permanently | <0.1% | >0.5% |
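As an illustration of how the alert thresholds in the table might be evaluated, the sketch below computes the four metrics from per-webhook summaries and lists the ones in breach. The input shape (one summary per webhook with `status`, `attempts`, and `response_time_ms`), the function names, and the choice of the nearest-rank method for p95 are assumptions.

```javascript
// Alert thresholds taken from the table above.
const THRESHOLDS = {
  successRate: { alertBelow: 0.95 },
  p95ResponseMs: { alertAbove: 2000 },
  retryRate: { alertAbove: 0.10 },
  permanentFailureRate: { alertAbove: 0.005 },
};

// Nearest-rank percentile over an array of numbers.
const percentile = (values, p) => {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
};

// Assumed input: one summary per webhook with its final status
// ('delivered' or 'failed'), attempt count, and last response time.
const computeMetrics = (summaries) => {
  const total = summaries.length;
  const count = (predicate) => summaries.filter(predicate).length;
  return {
    successRate: count((s) => s.status === 'delivered') / total,
    p95ResponseMs: percentile(summaries.map((s) => s.response_time_ms), 95),
    retryRate: count((s) => s.attempts > 1) / total,
    permanentFailureRate: count((s) => s.status === 'failed') / total,
  };
};

// Returns the names of the metrics that crossed their alert threshold.
const findBreaches = (metrics) =>
  Object.entries(THRESHOLDS)
    .filter(([name, t]) =>
      (t.alertBelow !== undefined && metrics[name] < t.alertBelow) ||
      (t.alertAbove !== undefined && metrics[name] > t.alertAbove))
    .map(([name]) => name);
```

With an empty input the rates are `NaN` and nothing is reported as a breach, so a real job would probably treat "no data" as its own alert condition.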
Step 3: Implement Smart Retry Logic
Not all failures should trigger immediate retries:
const shouldRetry = (statusCode) => {
  // Retry on network failures (no status code)
  if (statusCode == null) return true;
  // Retry on server errors and rate limits
  if (statusCode >= 500 || statusCode === 429) return true;
  // Don't retry on success or on other client errors (bad request, not found, etc.)
  return false;
};

const getRetryDelay = (attempt) => {
  // Escalating backoff schedule: 1m, 5m, 15m, 1h, 6h (attempt is 1-based)
  const delays = [60000, 300000, 900000, 3600000, 21600000];
  return delays[Math.min(attempt - 1, delays.length - 1)];
};
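A retry policy like this needs a loop that applies it. The sketch below shows one possible shape: `deliverWithRetries` is a hypothetical name, `send` stands in for a single HTTP attempt, and the policy functions and `sleep` are passed in so the loop does not depend on any particular implementation of them.

```javascript
// Hypothetical delivery loop. `send(attempt)` performs one HTTP attempt and
// resolves to { statusCode }, or rejects on a network failure.
const deliverWithRetries = async (send, { maxAttempts, shouldRetry, getRetryDelay, sleep }) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    let statusCode; // stays undefined on a network failure
    try {
      ({ statusCode } = await send(attempt));
    } catch (err) {
      statusCode = undefined;
    }
    if (statusCode >= 200 && statusCode < 300) {
      return { status: 'delivered', attempts: attempt };
    }
    // Give up when attempts are exhausted or the failure is not retryable.
    if (attempt === maxAttempts || !shouldRetry(statusCode)) {
      return { status: 'failed', attempts: attempt, lastStatusCode: statusCode };
    }
    await sleep(getRetryDelay(attempt));
  }
  return { status: 'failed', attempts: 0, lastStatusCode: undefined };
};
```

In production the waits are minutes to hours long, so the delay would more likely be handled by re-enqueueing the job with a scheduled time than by sleeping in-process. Injecting `sleep` keeps the loop testable either way.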
Step 4: Set Up Dead Letter Queues
When webhooks fail permanently, store them for manual review:
- Preserve full payload — You may need to reprocess later
- Include failure reason — Helps with debugging
- Add retry button — Allow manual reprocessing
- Set up alerts — Know when dead letters accumulate
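The four points above can be sketched as a small class. This is an in-memory illustration only: a real dead letter queue would sit on durable storage, and `DeadLetterQueue`, its method names, and the `alertThreshold` and `onAlert` options are all hypothetical.

```javascript
// Hypothetical in-memory dead letter queue.
class DeadLetterQueue {
  constructor({ alertThreshold, onAlert }) {
    this.entries = new Map(); // webhookId -> entry
    this.alertThreshold = alertThreshold;
    this.onAlert = onAlert;
  }

  // Store the full payload and the failure reason for later review.
  add({ webhookId, endpoint, payload, failureReason, attempts }) {
    this.entries.set(webhookId, {
      webhookId, endpoint, payload, failureReason, attempts,
      deadAt: new Date().toISOString(),
    });
    // Alert once dead letters start to accumulate.
    if (this.entries.size >= this.alertThreshold) this.onAlert(this.entries.size);
  }

  // Backs a manual "retry" button: redeliver, and remove the entry on success.
  // `deliver(entry)` is supplied by the caller and resolves to true or false.
  async retry(webhookId, deliver) {
    const entry = this.entries.get(webhookId);
    if (!entry) return false;
    const delivered = await deliver(entry);
    if (delivered) this.entries.delete(webhookId);
    return delivered;
  }
}
```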
Monitoring Outbound Webhooks
If you send webhooks to other services, you're responsible for their delivery:
What to Monitor
- Queue depth — Are webhooks piling up?
- Delivery lag — How long between event creation and delivery?
- Error rate by endpoint — Which integrations are failing?
- Retry exhaustion — How many webhooks give up after max retries?
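These four numbers can all be derived from the same set of delivery records. The sketch below assumes each record has the shape `{ endpoint, status, created_at, delivered_at }` with `status` being `pending`, `delivered`, or `failed` (retries exhausted); the record shape and the `summarizeOutbound` name are hypothetical.

```javascript
// Hypothetical summary over outbound delivery records.
const summarizeOutbound = (deliveries) => {
  const summary = { queueDepth: 0, retryExhausted: 0, maxLagMs: 0, errorRateByEndpoint: {} };
  const counts = new Map(); // endpoint -> { total, failed }
  for (const d of deliveries) {
    if (d.status === 'pending') {
      summary.queueDepth += 1;
      continue; // not finished yet, so it does not count toward error rate
    }
    const c = counts.get(d.endpoint) ?? { total: 0, failed: 0 };
    c.total += 1;
    if (d.status === 'failed') {
      c.failed += 1;
      summary.retryExhausted += 1;
    } else {
      // Delivery lag: time between event creation and successful delivery.
      const lagMs = Date.parse(d.delivered_at) - Date.parse(d.created_at);
      summary.maxLagMs = Math.max(summary.maxLagMs, lagMs);
    }
    counts.set(d.endpoint, c);
  }
  for (const [endpoint, c] of counts) {
    summary.errorRateByEndpoint[endpoint] = c.failed / c.total;
  }
  return summary;
};
```

Keeping the error rate per endpoint rather than as one global figure is what exposes a single failing integration that a healthy overall average would hide.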
Dashboard Example
Webhook Dashboard
├── Delivery Health
│ ├── Success rate (24h): 99.2%
│ ├── Avg response time: 127ms
│ └── Current queue depth: 3
├── Top Failing Endpoints
│ ├── api.clientA.com/webhooks → 12 failures
│ └── app.clientB.com/events → 8 failures
└── Recent Dead Letters
└── 3 webhooks awaiting manual review
Monitoring Inbound Webhooks
When you receive webhooks, you need different monitoring:
1. Endpoint Health
- Response time for webhook endpoints
- Error rate (500s from your handler)
- Queue backup (if async processing)
2. Processing Success
- Signature verification pass rate
- Parse success rate (valid JSON?)
- Business logic completion rate
3. Alerting
- Alert on spike in failed signatures (potential attack)
- Alert on processing errors (your code is broken)
- Alert on unusual volume (something's wrong upstream)
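A spike alert such as the failed-signature one can be approximated with a sliding-window counter. This is a minimal sketch: `SpikeDetector`, the window length, and the limit are assumptions, and the caller is expected to pass non-decreasing timestamps.

```javascript
// Hypothetical sliding-window counter: record() returns true when more than
// `limit` events have landed within the last `windowMs` milliseconds.
class SpikeDetector {
  constructor({ windowMs, limit }) {
    this.windowMs = windowMs;
    this.limit = limit;
    this.timestamps = [];
  }

  record(now = Date.now()) {
    this.timestamps.push(now);
    // Drop events that have aged out of the window.
    const cutoff = now - this.windowMs;
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    return this.timestamps.length > this.limit;
  }
}

// Usage: call record() each time signature verification fails.
const failedSignatures = new SpikeDetector({ windowMs: 60000, limit: 3 });
console.log(failedSignatures.record(0));     // false
console.log(failedSignatures.record(1000));  // false
console.log(failedSignatures.record(2000));  // false
console.log(failedSignatures.record(3000));  // true
console.log(failedSignatures.record(70000)); // false
```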
Common Webhook Monitoring Mistakes
Mistake 1: Only Logging Failures
If you only log when things go wrong, you can't calculate success rates or spot trends. Log everything.
Mistake 2: No Retry Logic
A single failed webhook shouldn't mean lost data. Implement exponential backoff with a maximum retry count.
Mistake 3: Ignoring Response Times
Slow responses can indicate problems even if they eventually succeed. Monitor p95 and p99 response times.
Mistake 4: Not Tracking by Endpoint
Overall success rate might look fine, but one critical integration could be failing 50% of the time. Break down metrics by endpoint.
Mistake 5: No Dead Letter Review Process
Dead letters pile up silently. Review them weekly, or set up alerts when they accumulate.
Webhook Monitoring Checklist
- ☐ Log all webhook attempts with full details
- ☐ Track success rate by endpoint
- ☐ Monitor response times (p50, p95, p99)
- ☐ Implement retry logic with exponential backoff
- ☐ Set up dead letter queue for permanent failures
- ☐ Alert when success rate drops below threshold
- ☐ Alert on unusual volume or patterns
- ☐ Review dead letters weekly (or auto-alert)
- ☐ Test webhook endpoints regularly (synthetic monitoring)
- ☐ Document expected webhook schemas and signatures
Monitor Your Webhooks with OpsPulse
Track webhook endpoint health alongside your uptime monitoring. Get alerted when integrations break, not when users complain.
Summary
Webhook monitoring is essential for modern applications:
- Log everything — Success and failure, with full context
- Implement retries — With exponential backoff and limits
- Monitor by endpoint — One failing integration can hide in overall stats
- Set up dead letters — Catch permanent failures for manual review
- Alert proactively — Know about problems before users do
With proper webhook monitoring, integration issues become minor inconveniences instead of customer-impacting disasters.