Database Monitoring for Small Teams: What to Track & Why

Essential database metrics, alert thresholds, and monitoring strategies that prevent outages without overwhelming you with noise

Published: March 19, 2026 • Reading time: 10 minutes

Your database is the heart of your application. When it slows down, everything slows down. When it fails, everything fails. Yet most small teams don't monitor their databases until something breaks.

Here's what you actually need to track — and what thresholds to set — to catch database issues before they become outages.

Why Database Monitoring Matters

Database problems have a nasty habit of starting small and escalating quickly:

Common scenario: A slow query that worked fine at 1,000 rows suddenly takes 30 seconds at 100,000 rows. Users start seeing timeouts. By the time you notice, you're already in incident mode. Proper monitoring catches this before users do.

Essential Database Metrics to Monitor

You don't need to track everything. Focus on these core metrics:

1. Connection Metrics

Metric | What It Means | Alert Threshold
Active connections | Current open connections | >80% of max_connections
Idle connections | Connections not in use | >50% of pool (possible leak)
Connection wait time | Time waiting for available connection | >100ms
Connection errors | Failed connection attempts | Any sustained errors
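
As a rough illustration, the connection thresholds above could be checked with a few comparisons. This is a minimal sketch: the function name, its parameters, and the warning strings are hypothetical, and the inputs are assumed to come from whatever source you already use to read connection counts.

```python
def check_connections(open_conns, idle_conns, max_connections, pool_size, wait_ms):
    """Return a warning string for each connection threshold that is exceeded."""
    warnings = []
    # Open connections compared against the server-side limit.
    if open_conns > 0.80 * max_connections:
        warnings.append("open connections above 80% of max_connections")
    # A large share of idle connections can point to a leak in the application.
    if idle_conns > 0.50 * pool_size:
        warnings.append("idle connections above 50% of pool (possible leak)")
    # Time a client spent waiting for a free connection, in milliseconds.
    if wait_ms > 100:
        warnings.append("connection wait time above 100 ms")
    return warnings


print(check_connections(open_conns=85, idle_conns=10,
                        max_connections=100, pool_size=50, wait_ms=20))
```

An empty list means every connection metric is within its threshold.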

2. Query Performance

Metric | What It Means | Alert Threshold
Query latency (p99) | 99th percentile query time | >1 second
Slow query count | Queries exceeding threshold | Sustained increase
Query throughput | Queries per second | Unusual drop or spike
Query errors | Failed queries | Any sustained errors
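
If your tooling only gives you raw query durations, p99 can be computed directly. The sketch below uses the nearest-rank method, which is one of several common percentile definitions; the function name and the sample data are made up for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 98 fast queries and two slow ones, in milliseconds.
latencies_ms = [10] * 98 + [1500, 2000]
p99 = percentile(latencies_ms, 99)
print(p99, "ms", "-> alert" if p99 > 1000 else "-> ok")
```

The median of this sample is still 10 ms, which is why the table tracks the tail rather than the typical query.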

3. Storage Metrics

Metric | What It Means | Alert Threshold
Disk usage % | Storage consumed | >80% (critical: >90%)
Table size growth | Largest tables over time | Unusual growth rate
Index bloat | Unused space in indexes | >30% bloat
Table bloat | Dead tuples / unused space | >20% dead tuples
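
The disk and dead-tuple thresholds above reduce to simple arithmetic. The following sketch assumes you already have the raw numbers; in PostgreSQL the tuple counts could come from the n_live_tup and n_dead_tup columns of pg_stat_user_tables. The function names are hypothetical.

```python
def disk_severity(used_bytes, total_bytes):
    """Map disk usage to the severity levels from the storage table."""
    pct = 100 * used_bytes / total_bytes
    if pct > 90:
        return "critical"
    if pct > 80:
        return "warning"
    return "ok"


def dead_tuple_pct(live_tuples, dead_tuples):
    """Share of dead tuples in a table, as a percentage of all tuples."""
    total = live_tuples + dead_tuples
    return 0.0 if total == 0 else 100 * dead_tuples / total


print(disk_severity(used_bytes=85, total_bytes=100))
print(dead_tuple_pct(live_tuples=750, dead_tuples=250) > 20)
```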

4. Replication & Availability

Metric | What It Means | Alert Threshold
Replication lag | Seconds behind primary | >5 seconds
Replication status | Is replica connected? | Disconnected
Primary/replica roles | Unexpected role changes | Any change
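
One way to combine the three replication checks is a single function that returns every problem it finds. This is only a sketch: the function, its parameters, and the role names are assumptions, and how you obtain the lag and role depends on your database.

```python
def check_replica(lag_seconds, connected, role,
                  expected_role="replica", max_lag_seconds=5):
    """Return a list of replication problems; an empty list means healthy."""
    problems = []
    if not connected:
        # Lag is meaningless while the replica is disconnected.
        problems.append("replica disconnected")
    elif lag_seconds > max_lag_seconds:
        problems.append(
            f"replication lag {lag_seconds:.1f}s exceeds {max_lag_seconds}s"
        )
    if role != expected_role:
        problems.append(f"role changed: expected {expected_role}, got {role}")
    return problems


print(check_replica(lag_seconds=12, connected=True, role="replica"))
```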

5. Resource Utilization

Metric | What It Means | Alert Threshold
CPU usage | Database process CPU | Sustained >80%
Memory usage | Buffer pool / cache hit rate | <95% cache hit rate
Disk I/O | Read/write latency | >20ms latency
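
Cache hit rate is usually derived from two counters: blocks served from memory and blocks read from disk. In PostgreSQL these are the cumulative blks_hit and blks_read columns of pg_stat_database, so a recent rate needs the difference between two samples. The sketch below assumes each sample is a plain dictionary; the function name is hypothetical.

```python
def cache_hit_rate(prev, curr):
    """Cache hit rate (%) between two samples of cumulative block counters."""
    hits = curr["blks_hit"] - prev["blks_hit"]
    reads = curr["blks_read"] - prev["blks_read"]
    total = hits + reads
    if total == 0:
        return None  # no block activity in the interval
    return 100 * hits / total


earlier = {"blks_hit": 1000, "blks_read": 100}
later = {"blks_hit": 10500, "blks_read": 600}
rate = cache_hit_rate(earlier, later)
print(rate, "-> alert" if rate is not None and rate < 95 else "-> ok")
```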

Setting Up Database Monitoring

Option 1: Built-in Database Tools

Most databases have built-in monitoring capabilities:

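-- PostgreSQL: Count connections currently running a query
SELECT count(*) FROM pg_stat_activity
WHERE state = 'active';

-- PostgreSQL: Find the slowest queries by mean execution time (ms)
-- Requires the pg_stat_statements extension; the column is named
-- mean_exec_time in PostgreSQL 13+ (mean_time in earlier versions)
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- MySQL: List current connections and what each is running
SHOW PROCESSLIST;

-- MySQL: InnoDB status (transactions, buffer pool, latest deadlock)
SHOW ENGINE INNODB STATUS;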

Run these periodically and log the results for trend analysis.
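
A small script is enough to log results over time. The sketch below appends one timestamped CSV row per run. It takes a fetch_metrics callable so the database access stays separate; that callable, the field names, and the file name are all hypothetical stand-ins for running the queries above through your own database driver.

```python
import csv
from datetime import datetime, timezone

FIELDS = ["timestamp", "active_connections", "slowest_mean_ms"]


def log_sample(out_file, fetch_metrics, now=None):
    """Call fetch_metrics() once and append a timestamped CSV row to out_file."""
    row = dict(fetch_metrics())
    row["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    csv.DictWriter(out_file, fieldnames=FIELDS).writerow(row)
    return row


def fake_fetch():
    # Placeholder values; replace with real query results.
    return {"active_connections": 12, "slowest_mean_ms": 340.5}


if __name__ == "__main__":
    with open("db_metrics.csv", "a", newline="") as f:
        log_sample(f, fake_fetch)
```

Running this from cron or a scheduler every minute or so builds up a history you can chart or compare week over week.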

Option 2: Database-Specific Exporters

For comprehensive monitoring, use exporters that expose database metrics.

Option 3: APM/Database Monitoring Services

Managed services provide database monitoring out of the box.

Start simple: For small teams, begin with built-in tools and alert on connection count, slow queries, and disk usage. Add more sophisticated monitoring as you grow.

Common Database Monitoring Mistakes

Mistake 1: Only Monitoring Overall Health

"Database is up" isn't enough. A database can be "up" but struggling with connections, slow queries, or disk space. Monitor specific metrics, not just availability.

Mistake 2: Alerting Too Aggressively

A single slow query doesn't need a 2 AM wake-up call. Alert on sustained issues, not momentary spikes. Use "for" durations in your alerts (e.g., "connection pool >80% for 5 minutes").

Mistake 3: Not Tracking Trends

Yesterday's "normal" might be tomorrow's crisis. Track metrics over time to spot gradual changes (table growth, query slowdown, connection creep).
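
For table or disk growth, even a straight-line projection turns a trend into an actionable number. This sketch assumes growth is roughly linear and uses only the first and last samples; the function name and sample values are illustrative.

```python
def days_until_full(samples, capacity_gb):
    """Project days until capacity is reached from (day, used_gb) samples, oldest first."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    if last_day == first_day:
        raise ValueError("samples must span more than one day")
    growth_per_day = (last_used - first_used) / (last_day - first_day)
    if growth_per_day <= 0:
        return None  # flat or shrinking: no projected fill date
    return (capacity_gb - last_used) / growth_per_day


# Hypothetical history: 100 GB used on day 0, 120 GB on day 10, 200 GB disk.
print(days_until_full([(0, 100), (5, 111), (10, 120)], capacity_gb=200))
```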

Mistake 4: Ignoring Connection Pooling

Connection pool exhaustion is one of the most common database issues. Monitor your pool (PgBouncer, ProxySQL, etc.) alongside the database itself.

Mistake 5: Not Correlating with App Metrics

Database issues often show up first in application metrics (response time, error rate). Correlate database metrics with app performance for faster debugging.

Database Monitoring Checklist

Immediate (Set Up Today)

Short-Term (This Week)

Ongoing

Alert Fatigue Prevention

Database monitoring can generate a lot of noise. Here's how to keep it manageable:

1. Use Smart Thresholds

Don't alert the moment a value crosses a threshold. Alert when it stays above the threshold:

# Bad: Alert immediately
connection_pool > 80%

# Good: Alert if sustained
connection_pool > 80% for 5 minutes
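
If your alerting tool has no built-in "for" clause, the same behavior can be approximated in a few lines. This is a sketch rather than any particular tool's implementation: the class name is hypothetical and timestamps are assumed to be seconds.

```python
class SustainedAlert:
    """Fire only after the value has stayed above the threshold for for_seconds."""

    def __init__(self, threshold, for_seconds):
        self.threshold = threshold
        self.for_seconds = for_seconds
        self.breach_started = None  # timestamp of the first sample in the current breach

    def observe(self, value, timestamp):
        """Record one sample; return True if the alert should be firing."""
        if value <= self.threshold:
            self.breach_started = None  # back to normal, reset the timer
            return False
        if self.breach_started is None:
            self.breach_started = timestamp
        return timestamp - self.breach_started >= self.for_seconds


alert = SustainedAlert(threshold=80, for_seconds=300)
for t, pool_pct in [(0, 85), (120, 90), (180, 70), (240, 85), (540, 88)]:
    print(t, pool_pct, alert.observe(pool_pct, t))
```

The dip to 70% at 180 seconds resets the timer, so only the last sample, 300 seconds into an unbroken breach, fires.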

2. Aggregate Related Alerts

If connection count is high, slow queries will likely follow. Group related alerts to reduce noise.
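
One simple grouping strategy is to collapse all alerts for the same database instance into a single notification. The alert shape used here (dictionaries with "instance" and "name" keys) is an assumption for the sketch, not a standard format.

```python
from collections import defaultdict


def group_alerts(alerts):
    """Collapse alerts that share an instance into one notification line each."""
    by_instance = defaultdict(list)
    for alert in alerts:
        by_instance[alert["instance"]].append(alert["name"])
    return [f"{instance}: {', '.join(names)}"
            for instance, names in by_instance.items()]


firing = [
    {"instance": "db-1", "name": "high connection count"},
    {"instance": "db-1", "name": "slow queries"},
    {"instance": "db-2", "name": "disk usage"},
]
for line in group_alerts(firing):
    print(line)
```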

3. Use Severity Levels

4. Deduplicate Alerts

If the same alert fires every minute for an hour, you don't need 60 notifications. Deduplicate and send status updates instead.
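
Deduplication can be as simple as remembering when each alert was last sent. In this sketch, the class name and the 15-minute repeat interval are assumptions; pick whatever interval suits your on-call routine.

```python
class Deduplicator:
    """Suppress repeat notifications for the same alert within a time window."""

    def __init__(self, repeat_after_seconds):
        self.repeat_after = repeat_after_seconds
        self.last_sent = {}  # alert key -> timestamp of last notification

    def should_notify(self, alert_key, timestamp):
        last = self.last_sent.get(alert_key)
        if last is not None and timestamp - last < self.repeat_after:
            return False  # notified recently, stay quiet
        self.last_sent[alert_key] = timestamp
        return True


dedup = Deduplicator(repeat_after_seconds=900)
# The same alert evaluated every minute for an hour.
sent = sum(dedup.should_notify("disk_usage", t) for t in range(0, 3600, 60))
print(sent, "notifications instead of 60")
```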

Monitor Your Database Endpoints with OpsPulse

Track database health alongside your application uptime. Get alerted on connection issues, slow queries, and storage limits before they become outages.


Summary

Effective database monitoring for small teams focuses on:

  1. Connections — Track pool utilization and wait times
  2. Query performance — Monitor latency and slow queries
  3. Storage — Alert before you run out of space
  4. Replication — Catch lag before users see stale data
  5. Resources — CPU, memory, and I/O matter

Start with the basics, add sophistication as you grow, and always optimize for signal over noise.
