Table of contents
Open Table of contents
Context
Imagine a web application that serves 10,000 requests per second. Each request needs to query the database. Without connection pooling, each request would:
- Open a TCP connection to the database server.
- Complete a TLS handshake (if encrypted).
- Authenticate (username, password, protocol negotiation).
- Execute the query.
- Close the connection.
Steps 1–3 alone take 5–30 milliseconds depending on network latency. That is an enormous overhead when the actual query might take only 1–2 ms. Worse, each open connection consumes 5–10 MB of memory on the database server (thread stack, buffers, sort areas). If 10,000 clients each hold a connection, the database needs 50–100 GB of RAM just for connection state — before it has stored a single row.
Without Connection Pooling
~~~~~~~~~~~~~~~~~~~~~~~~~~
Client 1 ─────── TCP+TLS+Auth ──────── DB (5-30ms overhead)
Client 2 ─────── TCP+TLS+Auth ──────── DB (5-30ms overhead)
Client 3 ─────── TCP+TLS+Auth ──────── DB (5-30ms overhead)
... ... ...
Client N ─────── TCP+TLS+Auth ──────── DB (5-30ms overhead)
Total connections at DB: N (one per client)
Memory at DB: N × 5-10 MB
With Connection Pooling
~~~~~~~~~~~~~~~~~~~~~~~
Client 1 ──┐
Client 2 ──┤ ┌──── DB connection 1
Client 3 ──┼──── [ Pool (P conns) ] ──┼──── DB connection 2
... │ ├──── DB connection 3
Client N ──┘ └──── ...
Total connections at DB: P (where P << N)
Memory at DB: P × 5-10 MB
Connection pooling solves both problems. A pool maintains a small set of pre-established connections. Clients “borrow” a connection from the pool, use it, and return it. The expensive setup cost is paid once, then amortized across thousands of requests.
Anatomy of a Connection Pool
Every connection pool, regardless of implementation language, manages the same core state:
+----------------------------------------------------------+
| Connection Pool |
| |
| Configuration: |
| minIdle = 5 (always keep 5 ready) |
| maxPoolSize = 20 (never exceed 20) |
| maxLifetime = 30min (replace after 30 min) |
| idleTimeout = 10min (close if unused 10 min) |
| connTimeout = 5s (wait max 5s for a conn) |
| |
| State: |
| totalConnections = 12 |
| activeConnections = 7 (checked out, in use) |
| idleConnections = 5 (in pool, waiting) |
| waitingThreads = 0 (blocked callers) |
| |
| Idle Queue (LIFO): |
| [ conn_8, conn_5, conn_3, conn_11, conn_2 ] |
| |
+----------------------------------------------------------+
The lifecycle of a pooled connection looks like this:
┌──────────┐ pool has idle? ┌──────────┐
│ Client │ ───── yes ──────────> │ Borrow │
│ requests │ │ (fast) │
│ conn │ ───── no ─────┐ └────┬─────┘
└──────────┘ │ │
v │
┌────────────────┐ │
│ totalConns < │ │
│ maxPoolSize? │ │
└───┬──────┬────┘ │
yes │ │ no │
v v │
┌──────────┐ ┌──────┐ │
│ Create │ │ Wait │ │
│ new conn │ │ (up │ │
│ (slow) │ │ to │ │
└─────┬─────┘ │ 5s) │ │
│ └──┬───┘ │
v v v
┌───────────────────────────┐
│ Client uses conn │
│ (executes SQL, etc.) │
└─────────────┬─────────────┘
│
v
┌───────────────────────────┐
│ Return to pool │
│ - validate (alive?) │
│ - reset state │
│ - push to idle queue │
└───────────────────────────┘
Why LIFO, Not FIFO?
Most high-performance pools use a LIFO (last-in, first-out) idle queue. The most recently returned connection is handed out first. Why?
- Warm TCP connections. A recently-used connection has buffers in the kernel’s TCP stack still warm. The OS hasn’t reclaimed the socket buffer memory.
- Allows idle connections to timeout. If the pool keeps reusing the “top” connections, the ones at the bottom naturally age out and get closed when they hit
idleTimeout. This self-sizes the pool to actual demand. - Better cache locality on the database. The DB server’s thread that served this connection recently may still have relevant data in its CPU cache.
HikariCP: The Fastest JVM Connection Pool
HikariCP is the default connection pool in Spring Boot and one of the most performance-optimized pool implementations. Let’s walk through how it achieves sub-microsecond borrow times.
The ConcurrentBag
HikariCP’s secret weapon is ConcurrentBag — a custom lock-free data structure that avoids the contention of typical BlockingQueue implementations. Here is the simplified architecture:
ConcurrentBag<PoolEntry>
~~~~~~~~~~~~~~~~~~~~~~~~
+------------------------------------+
| ThreadLocal<List<PoolEntry>> | ← each thread caches
| threadList | its own entries
+------------------------------------+
| CopyOnWriteArrayList<PoolEntry> | ← shared list of all
| sharedList | pool entries
+------------------------------------+
| SynchronousQueue<PoolEntry> | ← handoff queue for
| handoffQueue | waiting threads
+------------------------------------+
Borrow path (fast to slow):
1. Check threadLocal list → O(1), no lock
2. Scan sharedList → CAS on entry state
3. Wait on handoffQueue → park thread (timeout)
From com.zaxxer.hikari.util.ConcurrentBag:
public T borrow(long timeout, final TimeUnit timeUnit) throws InterruptedException {
// 1. Try thread-local steal
final List<Object> list = threadList.get();
for (int i = list.size() - 1; i >= 0; i--) { // LIFO scan
final Object entry = list.remove(i);
final T bagEntry = weakThreadLocals ? ((WeakReference<T>) entry).get() : (T) entry;
if (bagEntry != null && bagEntry.compareAndSet(STATE_NOT_IN_USE, STATE_IN_USE)) {
return bagEntry;
}
}
// 2. Scan shared list
final int waiting = waiters.incrementAndGet();
try {
for (T bagEntry : sharedList) {
if (bagEntry.compareAndSet(STATE_NOT_IN_USE, STATE_IN_USE)) {
if (waiting > 1) {
listener.addBagItem(waiting - 1); // signal: need more conns
}
return bagEntry;
}
}
listener.addBagItem(waiting); // all in use → create more
// 3. Wait for handoff
timeout = timeUnit.toNanos(timeout);
do {
final T bagEntry = handoffQueue.poll(timeout, NANOSECONDS);
if (bagEntry == null || bagEntry.compareAndSet(STATE_NOT_IN_USE, STATE_IN_USE)) {
return bagEntry; // null means timeout
}
} while (timeout > 0);
return null;
} finally {
waiters.decrementAndGet();
}
}
The key insight: most borrows never leave step 1. In a typical web server, each request-handling thread borrows and returns connections repeatedly. The thread-local cache means that thread gets “its own” connection back without touching any shared data structure or lock.
Connection Health Checking
When a connection is borrowed, HikariCP must verify it’s still alive. A dead connection (server closed it, network blip, idle timeout on the DB side) would cause the application query to fail. But health checks are expensive — executing SELECT 1 or JDBC4 isValid() takes a network round-trip.
HikariCP’s solution: skip the check if the connection was used recently.
if (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs) {
// Connection sat idle too long — validate it
if (!isConnectionAlive(poolEntry.connection)) {
closeConnection(poolEntry, EVICTED_CONNECTION_MESSAGE);
timeout -= elapsedNanos(startTime);
continue; // try next connection
}
}
The default aliveBypassWindowMs is 500 ms. If a connection was returned less than 500 ms ago, HikariCP trusts it’s still good. This eliminates validation round-trips for hot connections.
PgBouncer: External Connection Multiplexer
While HikariCP lives inside the application, PgBouncer sits between applications and PostgreSQL as a lightweight proxy. It is written in C, uses event-driven I/O (libevent), and handles thousands of client connections with a single thread and minimal memory.
Application Architecture with PgBouncer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
App Server 1 ──┐
(100 conns) │
│
App Server 2 ──┼───── PgBouncer ────── PostgreSQL
(100 conns) │ (20 server (20 backends)
│ connections)
App Server 3 ──┘
(100 conns)
300 client connections → 20 server connections
Multiplexing ratio: 15:1
Three Multiplexing Modes
PgBouncer offers three modes that trade off between compatibility and efficiency:
Session Mode (most compatible, least efficient)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Client connects ──── gets server conn ──── keeps it until disconnect
Timeline:
Client: |===== session (minutes/hours) =====|
Server: |===== locked to this client ========|
Use when: application uses session-level features (LISTEN/NOTIFY,
prepared statements, temp tables, SET variables)
Transaction Mode (best balance)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Client connects ──── gets server conn per TXN ──── released after COMMIT
Timeline:
Client: |──── idle ────|== TXN ==|──── idle ────|== TXN ==|
Server: |== TXN ==| |== TXN ==|
↑ assigned ↑ returned
Use when: application uses simple queries without session state.
Most common mode for web applications.
Statement Mode (most efficient, most restrictive)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Client gets server conn for a single statement, released immediately.
Timeline:
Client: |─ idle ─|=stmt=|─ idle ─|=stmt=|─ idle ─|=stmt=|
Server: |=stmt=| |=stmt=| |=stmt=|
Use when: autocommit workloads only. Multi-statement transactions
are NOT supported.
PgBouncer Internals: Event Loop
PgBouncer uses a single-threaded event loop. Every client and server socket is registered with libevent. When data arrives, PgBouncer reads the PostgreSQL protocol packets, decides where to route them, and writes them to the target socket.
From the source code, the main data structures in include/bouncer.h:
struct PgSocket {
struct List head; // linked list node
struct PgSocket *link; // paired socket (client↔server)
struct PgPool *pool; // which pool this belongs to
SocketState state; // login, active, idle, etc.
struct event ev; // libevent handle
PktBuf *sbuf; // streaming packet buffer
SBuf sbuf_real; // the actual buffer
usec_t connect_time;
usec_t request_time;
usec_t query_start;
char username[MAX_USERNAME];
char db_name[MAX_DBNAME];
};
struct PgPool {
struct List active_client_list; // clients with queries running
struct List waiting_client_list; // clients waiting for a server
struct List active_server_list; // servers running queries
struct List idle_server_list; // servers available for reuse
struct List used_server_list; // servers recently returned
PgDatabase *db;
PgUser *user;
PoolStats stats;
};
The assignment logic (simplified from janitor.c):
/* Called when a server connection becomes free */
static void release_server(PgSocket *server) {
PgPool *pool = server->pool;
PgSocket *client;
/* Is anyone waiting? */
client = first_socket(&pool->waiting_client_list);
if (client) {
/* Pair them immediately */
list_del(&client->head);
activate_client(client, server);
} else {
/* Nobody waiting — park server in idle list */
list_append(&pool->idle_server_list, &server->head);
server->state = SV_IDLE;
}
}
This is the handoff pattern: when a server connection finishes a transaction (in transaction mode), PgBouncer checks if any client is waiting. If yes, the server is immediately paired with the next waiting client — zero idle time. If nobody is waiting, the server goes to the idle list.
Connection Lifetime Management
Connections don’t live forever. There are several reasons to rotate them:
Connection Lifetime Events
~~~~~~~~~~~~~~~~~~~~~~~~~~
Time ─────────────────────────────────────────────>
│ created maxLifetime reached
│ │ │
v v v
─────[====used====][idle][===used===][idle]──────X──── close & replace
│ │
│ └─── idleTimeout: if this idle gap
│ exceeds threshold, close it
│
└─── validation: on borrow, check if still alive
Why maxLifetime? Even healthy connections should be rotated because:
- DNS changes. If the database endpoint is behind a load balancer or DNS record that changes (e.g., failover), old connections still point to the old server.
- Memory leaks. Some database servers accumulate per-connection memory over time (prepared statement caches, temp table metadata).
- Server-side limits. MySQL’s
wait_timeoutkills connections after inactivity. If the pool doesn’t rotate proactively, borrows hit dead connections. - Rolling restarts. During database upgrades, old connections to shutting-down nodes should drain naturally.
HikariCP adds jitter to maxLifetime to prevent all connections from expiring simultaneously (which would cause a “thundering herd” of new connection attempts):
// From HikariCP's HouseKeeper task
final long variance = maxLifetime > 10_000 ? ThreadLocalRandom.current().nextLong(maxLifetime / 40)
: 0;
final long lifetime = maxLifetime - variance; // ±2.5% jitter
Connection Pool Sizing: The Formula
How many connections should you have? More is not better. PostgreSQL’s official recommendation is:
connections = ((core_count * 2) + effective_spindle_count)
For a modern server with 16 cores and SSDs (no spindles):
connections = (16 * 2) + 1 = 33
This is counterintuitive. Why would 33 connections outperform 500? Because of context switching and lock contention inside the database:
With 33 connections (optimal):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CPU core 0: [query A][query B][query C]... ← minimal switching
CPU core 1: [query D][query E][query F]...
...
Lock wait graph: mostly empty
With 500 connections (too many):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CPU core 0: [A][ctx][B][ctx][C][ctx][D]... ← constant switching
CPU core 1: [E][ctx][F][ctx][G][ctx][H]...
...
Lock wait graph: deep chains, deadlock risk ↑
Throughput actually DECREASES with more connections!
The database has a fixed number of CPU cores. Each active connection is a thread (in PostgreSQL) or a thread from a thread pool (in MySQL). When there are more active threads than cores, the OS must context-switch between them. Each context switch costs 1–10 microseconds and trashes CPU caches.
Putting It Together: A Request’s Journey
Let’s trace a single HTTP request through a pooled system:
1. HTTP request arrives at app server
│
2. App calls pool.getConnection()
│
├── Pool checks thread-local cache (HikariCP)
│ └── Found! CAS state NOT_IN_USE → IN_USE (< 100ns)
│
3. App executes: SELECT * FROM users WHERE id = 42
│
├── Connection sends query over existing TCP socket
│ (no handshake needed — already established)
│
├── DB processes query (1-2ms)
│
├── Result returned over same socket
│
4. App calls connection.close() ← does NOT close the TCP socket!
│
├── Pool intercepts close()
├── Resets connection state (clear warnings, autocommit=true)
├── Records lastAccessed = now
├── CAS state IN_USE → NOT_IN_USE
└── Places in thread-local list (or handoff to waiting thread)
Total overhead from pooling: ~100-500 nanoseconds
Overhead without pooling: ~5-30 milliseconds
Speedup: 10,000x – 300,000x for connection acquisition
Common Pitfalls
1. Connection leaks. If application code borrows a connection but never returns it (exception thrown before close()), the pool slowly drains. HikariCP detects this with leakDetectionThreshold — if a connection is held longer than N ms, it logs a stack trace of where it was borrowed.
2. Pool exhaustion. If all connections are in use and a thread waits longer than connectionTimeout, it gets a SQLException. This cascades: the HTTP request fails, the user retries, creating more load. Solution: set maxPoolSize based on the database formula above, and use circuit breakers to fail fast.
3. Long transactions holding connections. A single BEGIN ... (long computation) ... COMMIT holds a connection hostage for the entire duration. In transaction-mode PgBouncer, this blocks other clients from using that server connection. Keep transactions short.
4. Misconfigured maxLifetime. If your pool’s maxLifetime is longer than the database’s wait_timeout (MySQL default: 8 hours), the pool will hand out connections that the database has already closed. Set maxLifetime to 2–3 minutes less than the DB timeout.
References
- HikariCP wiki — About Pool Sizing
- PgBouncer documentation — pgbouncer.org
- PostgreSQL wiki — Number Of Database Connections
- HikariCP source — ConcurrentBag.java
- PgBouncer source — bouncer.h
- Oracle — JDBC Connection Pooling Best Practices