Database Connection Pool Sizing Runbook Builder

Create a practical runbook for sizing, testing, and monitoring database connection pools across apps, workers, serverless functions, and managed databases.

Prompt Template

You are a senior backend engineer and database reliability reviewer. Build a connection pool sizing runbook for:

Database engine and version: [Postgres, MySQL, SQL Server, Aurora, Cloud SQL, RDS, self-hosted]
Application stack: [Node, Rails, Django, Java, Go, serverless, workers, Kubernetes]
Traffic pattern: [steady OLTP, bursty API, cron jobs, background workers, multi-tenant SaaS]
Current pool settings: [pool size, timeout, idle timeout, max lifetime, per-process settings]
Deployment topology: [pods, instances, serverless concurrency, read replicas, PgBouncer, proxy layer]
Database limits: [max_connections, reserved connections, memory, CPU, connection overhead]
Symptoms: [timeouts, exhausted pool, idle connections, slow queries, lock waits, failover issues]
Observed metrics: [active connections, wait time, queue depth, query latency, CPU, memory, throughput]
Constraints: [no downtime, managed DB limits, peak events, compliance, low risk tolerance]

Create:
1. Current-state connection math across every app, worker, and replica
2. Safe sizing formula with assumptions and margin of safety
3. Decision tree for app pool tuning, DB max connection changes, PgBouncer/proxy use, and query optimization
4. Load test plan that validates wait time, saturation, and database health
5. Monitoring dashboard and alert thresholds
6. Rollout plan with staged config changes and rollback steps
7. Incident triage checklist for connection exhaustion
8. Follow-up actions for serverless concurrency, long transactions, and idle connection cleanup
9. Executive summary for reliability and capacity planning

Favor reducing uncontrolled concurrency before raising database limits.

Example Output

# Connection Pool Runbook - SaaS API on Postgres

Connection Math

- API: 8 pods x pool 12 = 96 possible connections.

- Workers: 4 pods x pool 8 = 32 possible connections.

- Admin jobs: 2 pods x pool 5 = 10 possible connections.

- Total application demand: 96 + 32 + 10 = 138 possible connections.

- Reserved database capacity: keep 25 connections for migrations, admin, monitoring, and failover.
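
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the runbook itself: the `Tier` class and the variable names are hypothetical, and it assumes exactly one pool per pod. The example does not state the database's `max_connections`, so the sketch only computes the minimum value it would need to cover every pool plus the reserve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One deployable unit type; assumes a single pool per pod."""
    name: str
    pods: int
    pool_size: int

    @property
    def possible_connections(self) -> int:
        # Worst case: every pod fills its pool at the same time.
        return self.pods * self.pool_size


# Figures taken from the example runbook above.
tiers = [
    Tier("api", pods=8, pool_size=12),
    Tier("workers", pods=4, pool_size=8),
    Tier("admin_jobs", pods=2, pool_size=5),
]
RESERVED = 25  # migrations, admin, monitoring, failover

total_possible = sum(t.possible_connections for t in tiers)
minimum_max_connections = total_possible + RESERVED

for t in tiers:
    print(f"{t.name}: {t.pods} x {t.pool_size} = {t.possible_connections}")
print(f"total application demand: {total_possible}")
print(f"max_connections needed to cover demand plus reserve: {minimum_max_connections}")
```

If the database's actual `max_connections` is below the last figure, the pools can collectively request more connections than the database will grant once the reserve is honored.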

Recommendation

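Set the API pool to 8 and the worker pool to 6, which lowers worst-case application demand from 138 to 98 connections with the admin pool left at 5, and add PgBouncer in transaction mode before increasing max_connections. Validate with a load test at 2x normal traffic and watch pool wait time, p95 query latency, DB CPU, and lock waits.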
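
To see what the recommended pool sizes buy, the before and after totals can be compared directly. The sketch below is illustrative only: the helper `total_connections` is a hypothetical name, and it assumes the admin job pool stays at 5 because the recommendation does not mention changing it.

```python
def total_connections(pools: dict[str, tuple[int, int]]) -> int:
    """Sum pods * pool_size over every tier (one pool per pod assumed)."""
    return sum(pods * size for pods, size in pools.values())


# (pods, pool_size) per tier, from the example connection math.
current = {"api": (8, 12), "workers": (4, 8), "admin_jobs": (2, 5)}

# Recommended settings; admin_jobs is assumed unchanged.
proposed = {**current, "api": (8, 8), "workers": (4, 6)}

before = total_connections(current)
after = total_connections(proposed)
saved = before - after

print(f"before: {before}, after: {after}, freed: {saved}")
```

The freed connections are headroom gained without touching max_connections, which is the ordering the runbook favors.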

Alerting

Page if pool wait p95 exceeds 250 ms for 10 minutes or active DB connections exceed 80% of the safe budget (max_connections minus the 25 reserved connections).
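
One way to express this paging rule in code is shown below. It is a sketch under stated assumptions rather than a real alerting integration: `Sample`, `should_page`, and the one-sample-per-minute cadence are hypothetical, the 10-minute window is applied only to the pool wait condition, and `safe_budget` is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One per-minute observation (cadence is an assumption)."""
    pool_wait_p95_ms: float
    active_connections: int


WAIT_THRESHOLD_MS = 250
SUSTAINED_MINUTES = 10
BUDGET_PERCENT = 80


def should_page(samples: Sequence[Sample], safe_budget: int) -> bool:
    """Return True if either paging condition from the runbook holds."""
    if not samples:
        return False
    recent = samples[-SUSTAINED_MINUTES:]
    # Wait condition: every one of the last 10 samples is over threshold.
    wait_breach = len(recent) == SUSTAINED_MINUTES and all(
        s.pool_wait_p95_ms > WAIT_THRESHOLD_MS for s in recent
    )
    # Connection condition: latest sample above 80% of the safe budget.
    # Integer arithmetic avoids float rounding at the boundary.
    conn_breach = samples[-1].active_connections * 100 > BUDGET_PERCENT * safe_budget
    return wait_breach or conn_breach
```

For example, with a safe budget of 100, ten consecutive samples at 300 ms trigger a page, while nine do not; a single sample with 81 active connections pages immediately.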

Tips for Best Results

  • Count pools per process, pod, worker, and serverless concurrency unit; one setting rarely tells the full story.
  • Ask for database limits and reserved connections before changing max_connections.
  • Treat long queries and idle transactions as pool problems too, not just database capacity problems.
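
The first tip is easy to underestimate, so here is a small sketch of why a single pool setting can mislead. All numbers below are hypothetical and chosen only for illustration: 4 processes per pod and 200 concurrent serverless invocations do not come from the example runbook, and `worst_case_connections` is an illustrative helper name.

```python
def worst_case_connections(units: int, processes_per_unit: int, pool_size: int) -> int:
    """Each process in each unit holds its own pool."""
    return units * processes_per_unit * pool_size


# Hypothetical: 8 pods with pool size 12, counted two ways.
per_pod_view = worst_case_connections(units=8, processes_per_unit=1, pool_size=12)
per_process_view = worst_case_connections(units=8, processes_per_unit=4, pool_size=12)

# Hypothetical: 200 concurrent serverless invocations, one connection each.
serverless_view = worst_case_connections(units=200, processes_per_unit=1, pool_size=1)

print(per_pod_view, per_process_view, serverless_view)
```

The same pool size of 12 yields four times as many connections once each pod runs four processes, and a serverless tier with a pool of 1 can still outnumber both when concurrency is uncapped.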