Node.js Performance & Scaling: Comprehensive Theory Guide
Master architecture, event loop behavior, memory, scaling patterns, and production optimization.
1. Theory: Node.js Architecture Deep Dive
The Node.js architecture is built on three foundational components that determine its performance characteristics:
┌─────────────────────────────────────────────────────────────────┐
│ V8 JavaScript Engine │
│ • Compiles JS to machine code (JIT compilation) │
│ • Manages memory heap and garbage collection │
│ • Provides hidden classes for object property access │
├─────────────────────────────────────────────────────────────────┤
│ libuv Library │
│ • Event loop implementation │
│ • Thread pool for async I/O (default 4 threads) │
│ • Cross-platform asynchronous operations │
├─────────────────────────────────────────────────────────────────┤
│ Node.js Bindings │
│ • Bridges JS land to C++ land │
│ • Exposes system APIs (fs, http, crypto) │
│ • Manages asynchronous request lifecycle │
└─────────────────────────────────────────────────────────────────┘
| Aspect | Traditional Server | Node.js |
|---|---|---|
| Connection Model | One thread per connection | Single thread handles all connections |
| Memory per Connection | ~1MB per thread | ~8KB per connection |
| 10k Concurrent Connections | 10GB RAM | ~80MB RAM |
| Context Switching | Expensive | None (single thread) |
| CPU-bound Tasks | Fine | Blocks everything |
| I/O-bound Tasks | Threads wait | Highly efficient |
// The event loop has six phases, each with a FIFO queue of callbacks
const eventLoopPhases = {
'1. timers': {
purpose: 'Executes setTimeout and setInterval callbacks',
'callback source': 'Timer queue',
blocking: 'Potentially (if callbacks are long)'
},
'2. pending callbacks': {
purpose: 'Executes I/O callbacks deferred to next loop iteration',
'callback source': 'TCP errors, some system operations',
blocking: 'Rarely'
},
'3. idle, prepare': {
purpose: 'Internal use only (not for user code)',
'callback source': 'libuv internals',
blocking: 'No'
},
'4. poll': {
purpose: 'Retrieve new I/O events, execute I/O callbacks',
'callback source': 'Files, networks, databases',
blocking: 'Yes - if no timers pending, will block waiting for I/O'
},
'5. check': {
purpose: 'Executes setImmediate callbacks',
'callback source': 'setImmediate API',
blocking: 'Potentially'
},
'6. close callbacks': {
purpose: 'Executes close event handlers',
'callback source': 'socket.on("close")',
blocking: 'No'
}
};
const microtaskPriority = {
'process.nextTick': 'Highest priority - runs immediately after current operation',
'Promise.then/catch': 'Runs after all nextTick callbacks',
'queueMicrotask': 'Same as Promise callbacks'
};
// The impact of blocking operations
const impactExample = {
scenario: 'Each of 1000 queued requests blocks for 100ms',
impact: { requestCount: 1000, blockingTime: 100, result: 'Delays accumulate - the last request in the queue waits behind all the others' },
calculation: { formula: 'Max delay = Blocking time × queue_length', example: '100ms × 1000 = 100 seconds worst-case delay' }
};
const blockingOperations = {
'fs.readFileSync': 'Blocks entire server while reading',
'JSON.parse(largeString)': 'CPU-intensive, blocks loop',
'crypto.pbkdf2Sync': 'CPU-bound, blocks for seconds',
'while(true)': 'Permanent block - server stops responding',
'for(i in 1Million) { for(j in 1Million) }': 'Catastrophic blocking'
};
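The cost of a blocked loop is easy to demonstrate. A minimal sketch (not from any particular app) that busy-waits on the main thread and shows a 0ms timer firing late:

```javascript
// A synchronous busy-wait delays every queued callback by at least
// the blocked duration.
function blockFor(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // spin: stands in for CPU-bound work such as JSON.parse on a huge payload
  }
  return Date.now() - start;
}

const scheduledAt = Date.now();
setTimeout(() => {
  // The 0ms timer cannot fire until the 50ms block below finishes.
  console.log(`timer fired ${Date.now() - scheduledAt}ms late`);
}, 0);
blockFor(50);
```

Run under `node`, the timer reports roughly 50ms of lateness rather than ~0ms, because the timers phase cannot run until the synchronous loop yields.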
const threadPool = {
defaultSize: 4,
operations: [
'fs.* (file system operations)',
'crypto.pbkdf2, crypto.scrypt',
'zlib compression',
'DNS lookups (getaddrinfo)',
'Some child process operations'
],
calculateOptimalSize: (cpuCores, iowait, expectedConcurrency) => {
// Heuristic: 2x cores as a base, extra headroom when I/O wait is high,
// capped by the concurrency the app actually expects.
const base = cpuCores * 2;
const ioFactor = iowait > 0.5 ? base * 1.5 : base;
return Math.min(Math.ceil(ioFactor), expectedConcurrency);
}
};
2. Theory: Event Loop & Async Patterns
// Priority hierarchy
const priorityHierarchy = {
'1. process.nextTick': {
description: 'Highest priority - executes immediately after current operation',
useCase: 'Emergency error handling, immediate cleanup',
warning: 'Recursive nextTick can starve I/O'
},
'2. Promise callbacks': {
description: 'Next in line, executed in microtask queue',
useCase: 'Standard async flow control',
warning: 'Promise chains can create deep recursion'
},
'3. setTimeout/setInterval': {
description: 'Timer phase',
useCase: 'Delayed execution, polling',
warning: 'Nested timers cause drift'
},
'4. setImmediate': {
description: 'Check phase - executes after I/O callbacks',
useCase: 'Defer I/O-heavy operations',
warning: 'Not actually immediate'
},
'5. I/O callbacks': {
description: 'Poll phase - most network/file operations',
useCase: 'File reads, network requests',
warning: 'Can be starved by microtasks'
}
};
function demonstratePriority() {
console.log('1: Sync');
setTimeout(() => console.log('2: Timer (macrotask)'), 0);
setImmediate(() => console.log('3: setImmediate (check)'));
Promise.resolve().then(() => console.log('4: Promise (microtask)'));
process.nextTick(() => console.log('5: nextTick (microtask)'));
console.log('6: Sync');
// Output: 1, 6, 5, 4, 2, 3
}
// Async patterns performance comparison
const asyncPatterns = {
callbacks: {
pros: ['Fastest for simple cases', 'Low memory overhead'],
cons: ['Hard to read', 'Error handling complexity', 'Inversion of control'],
example: `fs.readFile('a.txt', (err, data) => {
if(err) return cb(err);
fs.writeFile('b.txt', data, cb);
})`,
performance: 'Excellent for simple flows'
},
promises: {
pros: ['Chainable', 'Built-in error handling', 'Composable'],
cons: ['Slightly slower than callbacks', 'Memory overhead for promise objects'],
example: `fs.promises.readFile('a.txt')
.then(data => fs.promises.writeFile('b.txt', data))
.catch(console.error)`,
performance: 'Good - ~15% slower than callbacks'
},
asyncAwait: {
pros: ['Most readable', 'Try/catch works', 'Debuggable'],
cons: ['Same promise overhead', 'Can hide parallelism opportunities'],
example: `try {
const data = await fs.promises.readFile('a.txt');
await fs.promises.writeFile('b.txt', data);
} catch(err) {
console.error(err);
}`,
performance: 'Same as promises'
},
parallel: {
pros: ['Maximum throughput for independent tasks', 'Lower latency'],
cons: ['Resource intensive', 'Error handling complex'],
example: `await Promise.all([fetchUser(), fetchOrders(), fetchProducts()])`,
performance: 'Best for unrelated I/O operations'
}
};
const starvationPrevention = {
badPattern: () => {
function recursivePromise() {
return Promise.resolve().then(recursivePromise);
}
recursivePromise(); // I/O never executes
},
goodPattern: async (items, processor) => {
let index = 0;
const BATCH_SIZE = 100;
while (index < items.length) {
const batch = items.slice(index, index + BATCH_SIZE);
await Promise.all(batch.map(processor));
index += BATCH_SIZE;
await new Promise(resolve => setImmediate(resolve));
}
}
};
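The batching idea in goodPattern generalizes to a concurrency limiter. A sketch (the `mapLimit` name is ours, not a built-in): at most `limit` processor calls are in flight at once, avoiding a Promise flood over a huge array:

```javascript
// Runs `processor` over `items` with bounded concurrency, preserving order.
async function mapLimit(items, limit, processor) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // claiming the index is synchronous, so no race
      results[i] = await processor(items[i]);
    }
  }
  // Start `limit` workers that pull from the shared index counter.
  const workers = Array.from(
    { length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}
```

Usage: `await mapLimit(userIds, 10, fetchUser)` keeps at most 10 fetches outstanding instead of firing all of them at once.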
3. Theory: Memory Management
const v8Memory = {
limits: {
'32-bit': '~512MB heap',
'64-bit': '~1.4GB heap (historical default; newer Node versions size the heap from available system memory)',
'increase': 'Use --max-old-space-size=4096 flag (value in MB)'
},
generations: {
'New Space (Young Generation)': {
size: '~16-32MB',
purpose: 'Short-lived objects',
collection: 'Scavenge (fast, frequent)'
},
'Old Space (Old Generation)': {
size: 'Rest of heap',
purpose: 'Long-lived objects',
subSpaces: ['Object space', 'Large object space', 'Code space']
}
},
gcTypes: {
'Scavenge (Minor GC)': {
when: 'Young generation fills up',
duration: 'Very fast (typically 1-10ms)',
impact: 'Pauses execution briefly'
},
'Mark-Sweep (Major GC)': {
when: 'Old generation fills up',
duration: 'Slow (100-1000ms)',
impact: 'Significant pause, affects throughput'
},
'Mark-Compact': {
when: 'After mark-sweep, if fragmented',
duration: 'Even slower',
impact: 'Defragments heap'
}
},
gcTriggers: [
'Allocation pressure (a semi-space or old space fills up)',
'Process idle time (opportunistic collection)',
'Manual request (global.gc() with the --expose-gc flag)'
]
};
const memoryLeaks = {
badGlobal: () => {
function leak() { leakedVar = 'This becomes a global'; } // missing declaration; throws in strict mode
function good() { const localVar = 'This is local'; }
},
timerLeak: {
problem: `
setInterval(() => {
const data = fetchLargeData();
cache.push(data);
}, 1000);
`,
solution: `
const interval = setInterval(handler, 1000);
clearInterval(interval);
`
},
closureLeak: {
problem: 'Closures can retain heavy references',
solution: 'Store only primitive/id values in closures where possible'
},
eventLeak: {
problem: 'Listeners added but never removed',
solution: 'Use emitter.off(...) when done'
},
cacheLeak: {
problem: 'Unlimited cache growth',
solution: 'Use LRU cache with max size'
}
};
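The fix for cacheLeak can be sketched with Map's insertion-order guarantee (production code would more likely reach for a library such as `lru-cache`):

```javascript
// Minimal LRU cache: bounded size, least-recently-used entry evicted first.
class LRUCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is the LRU one.
      this.map.delete(this.map.keys().next().value);
    }
  }
}
```

Replacing an unbounded `cache.push(data)` with a structure like this turns the leak into a fixed memory ceiling.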
const memoryTools = {
getMemoryUsage: () => {
const usage = process.memoryUsage();
return {
rss: `${Math.round(usage.rss / 1024 / 1024)}MB`,
heapTotal: `${Math.round(usage.heapTotal / 1024 / 1024)}MB`,
heapUsed: `${Math.round(usage.heapUsed / 1024 / 1024)}MB`,
external: `${Math.round(usage.external / 1024 / 1024)}MB`
};
},
detectGrowth: (interval = 10000) => {
let baseline = process.memoryUsage().heapUsed;
setInterval(() => {
const current = process.memoryUsage().heapUsed;
const growth = (current - baseline) / 1024 / 1024;
if (growth > 50) console.warn(`Memory grew by ${growth.toFixed(2)}MB`);
baseline = current;
}, interval);
},
heapSnapshot: {
commands: [
'node --inspect app.js',
'Open Chrome DevTools → Memory',
'Take heap snapshot before operation',
'Perform operation',
'Take second snapshot and compare'
]
},
setMemoryLimit: (limitMB) => {
// node --max-old-space-size=4096 app.js
}
};
4. Theory: Scaling Strategies
const scalingDimensions = {
'Vertical Scaling (Scale Up)': {
definition: 'Adding more resources to a single machine',
pros: ['Simple', 'No code changes'],
cons: ['Upper hardware limit', 'Single point of failure']
},
'Horizontal Scaling (Scale Out)': {
definition: 'Adding more instances across multiple machines',
pros: ['Near-infinite scaling', 'Fault tolerance'],
cons: ['Architecture complexity', 'State management challenges']
},
'Functional Decomposition (Microservices)': {
definition: 'Splitting app into smaller services',
pros: ['Independent scaling', 'Team autonomy'],
cons: ['Distributed system complexity']
}
};
const loadBalancing = {
'Round Robin': {
description: 'Requests distributed sequentially to each server',
bestFor: 'Servers with similar capacity, stateless apps',
pros: 'Simple, fair distribution',
cons: 'Does not account for server load'
},
'Least Connections': {
description: 'Send request to server with fewest active connections',
bestFor: 'Long-lived connections (WebSockets, streaming)',
pros: 'Adapts to varying request durations',
cons: 'Requires connection tracking'
},
'IP Hash (Sticky Sessions)': {
description: 'Same client routed to same server',
bestFor: 'Session-based apps',
pros: 'Session persistence',
cons: 'Uneven distribution possible'
},
'Weighted Round Robin': {
description: 'Servers get traffic proportion by assigned weight',
bestFor: 'Heterogeneous server capacity',
pros: 'Utilizes varying hardware',
cons: 'Manual weight tuning'
},
'Least Response Time': {
description: 'Route to fastest server with least active load',
bestFor: 'Latency-sensitive systems'
}
};
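Round Robin, the simplest strategy above, fits in a few lines (the server addresses are made up for illustration):

```javascript
// Returns a picker that cycles through servers in order, wrapping around.
function createRoundRobin(servers) {
  let index = 0;
  return () => servers[index++ % servers.length];
}

const pick = createRoundRobin(['app-1:3000', 'app-2:3000', 'app-3:3000']);
pick(); // 'app-1:3000'
pick(); // 'app-2:3000'
pick(); // 'app-3:3000', then back to 'app-1:3000'
```

The other strategies differ only in the selection function: Least Connections tracks per-server counters, Weighted Round Robin repeats entries proportionally to weight.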
const stateManagement = {
problem: {
description: 'User session on instance A but next request hits B',
impact: 'Session appears lost'
},
stickySessions: { pros: 'Simple', cons: 'Uneven load and poor failover' },
sharedCache: {
how: 'Sessions in Redis',
code: `
await redis.setex(\`session:\${sessionId}\`, 3600, JSON.stringify(userData));
const session = JSON.parse(await redis.get(\`session:\${sessionId}\`));
`
},
jwtSessions: {
how: 'Session data encoded in token',
code: `
const token = jwt.sign({ userId: 123 }, SECRET, { expiresIn: '1h' });
const decoded = jwt.verify(token, SECRET);
`
},
databaseSessions: {
how: 'Store sessions in PostgreSQL/MySQL',
pros: 'Persistent, ACID compliant',
cons: 'Slower than Redis, more DB pressure'
}
};
const poolingTheory = {
problem: {
description: 'Total DB connections = instanceCount × poolSize',
example: '10 instances × 20 pool = 200 DB connections'
},
calculatePoolSize: {
formula: 'poolSize = (coreCount × 2) + (spindleCount)'
},
pgbouncer: {
description: 'Middleware pools DB connections across all app instances',
architecture: [
'App instances → PgBouncer → PostgreSQL',
'App connects to PgBouncer, not directly to database'
]
}
};
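The sizing formula above, worked through for a hypothetical 8-core machine with one disk:

```javascript
// poolSize = (coreCount * 2) + spindleCount (an SSD counts as one
// "spindle" here - a simplifying assumption).
function poolSize(coreCount, spindleCount) {
  return coreCount * 2 + spindleCount;
}

poolSize(8, 1); // 17 connections per instance
// With 10 instances that is still 170 total DB connections, which is
// exactly why a middleware pooler like PgBouncer sits in front of Postgres.
```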
5. Theory: Database Performance
const queryOptimization = {
nPlusOne: {
definition: '1 query for list + N queries for relations',
example: {
bad: ['SELECT * FROM users', 'Then N order queries (one per user)'],
good: ['SELECT * FROM users', 'SELECT * FROM orders WHERE user_id IN (...)']
},
performance: { totalBad: '1000ms', totalGood: '20ms (50x faster)' }
},
indexes: {
bTreeIndex: { complexity: 'O(log n)', bestFor: 'Range and equality' },
hashIndex: { complexity: 'O(1)', bestFor: 'Exact matches only' },
compositeIndex: { rule: 'Column order matters' }
},
explainAnalyze: {
terms: {
'Seq Scan': 'Full table scan',
'Index Scan': 'Uses index',
'Hash Join': 'O(n + m) join strategy'
}
}
};
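The batched fix for N+1 needs an in-memory grouping step after the single `IN (...)` query. A sketch (the commented `db.query` calls and table shapes are assumptions for illustration):

```javascript
// Groups a flat list of order rows by their user_id for O(n) reassembly.
function groupByUserId(orders) {
  const byUser = new Map();
  for (const order of orders) {
    if (!byUser.has(order.user_id)) byUser.set(order.user_id, []);
    byUser.get(order.user_id).push(order);
  }
  return byUser;
}

// const users = await db.query('SELECT * FROM users');
// const orders = await db.query(
//   'SELECT * FROM orders WHERE user_id = ANY($1)',
//   [users.map(u => u.id)]);
// const ordersByUser = groupByUserId(orders); // 2 queries total, not N+1
```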
const connectionPoolTheory = {
lifecycle: {
'Initialization': 'Create min connections',
'Borrow': 'Get connection from pool',
'Return': 'Release back to pool',
'Eviction': 'Close idle connections after timeout'
},
parameters: {
min: 'Minimum connections',
max: 'Maximum connections',
idleTimeoutMillis: 'Idle timeout',
acquireTimeoutMillis: 'Max wait for connection'
},
tuning: {
findOptimal: `
Start with max = (CPU cores × 2) + spindleCount
Increase if waiting queries exist and DB CPU < 70%
Decrease if DB CPU > 70%
`
}
};
const readReplicaTheory = {
patterns: {
'Simple Replication': { description: 'One master, multiple replicas' },
'Chain Replication': { description: 'Master → Replica 1 → Replica 2' }
},
routing: {
'Round Robin': 'Distribute reads evenly',
'Random': 'Simple distribution',
'Latency-based': 'Route to fastest replica',
'Consistent Hash': 'Same user/query route for locality'
},
lagSolutions: {
'Session consistency': 'Read from master after write for N seconds',
'Monotonic reads': 'Use same replica per user',
'Write-read own writes': 'Read your writes from primary immediately',
'Quorum reads': 'Read multiple replicas, use latest'
}
};
6. Theory: Caching Architecture
const cacheLevels = {
'L1 Memory': {
latency: '< 0.1 microseconds',
capacity: 'Limited by heap',
scope: 'Single process',
useCase: 'Frequently accessed small objects'
},
'L2 Distributed': {
latency: '~0.5 milliseconds',
capacity: 'GB to TB',
scope: 'All instances',
useCase: 'Session data, shared API responses'
},
'L3 CDN': {
latency: '~10 milliseconds',
scope: 'Global edge',
useCase: 'Static assets and cacheable GET APIs'
},
'L4 Database': {
latency: '~1 millisecond',
scope: 'Database buffers'
}
};
const invalidationStrategies = {
ttl: {
how: 'Set expiry per key',
example: `
cache.set('products', data, 3600);
cache.set('stock:123', stock, 60);
`
},
writeThrough: {
how: 'Write DB + cache together',
code: `
await db.update('users', { id }, data);
await cache.set(\`user:\${id}\`, data);
`
},
writeBehind: {
how: 'Write to cache first, persist async',
code: `
await cache.set(\`user:\${id}\`, data);
await queue.add('db-update', { id, data });
`
},
cacheAside: {
how: 'Read cache first, fallback to DB',
code: `
let user = await cache.get(\`user:\${id}\`);
if (!user) {
user = await db.findUser(id);
await cache.set(\`user:\${id}\`, user);
}
`
},
tagBased: {
how: 'Invalidate groups via tags',
code: `
await cache.set('product:123', data, { tags: ['products'] });
await cache.invalidateTag('products');
`
}
};
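The cache-aside pattern above can be wrapped in one helper. A minimal in-process sketch with TTL (`loader` stands for any async source such as a DB lookup):

```javascript
// In-memory cache-aside: hit returns the cached value, miss runs the
// loader and stores the result with an expiry timestamp.
function createCache() {
  const store = new Map(); // key -> { value, expiresAt }
  return {
    async getOrLoad(key, ttlMs, loader) {
      const entry = store.get(key);
      if (entry && entry.expiresAt > Date.now()) return entry.value; // hit
      const value = await loader(key);                               // miss
      store.set(key, { value, expiresAt: Date.now() + ttlMs });
      return value;
    }
  };
}
```

Usage: `await cache.getOrLoad('user:123', 60000, id => db.findUser(id))` — the same shape works against Redis for the distributed case.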
const stampedePrevention = {
problem: {
description: 'Cache expires and many requests hit DB simultaneously'
},
mutex: {
how: 'Only one request computes miss',
code: `
const lockKey = \`lock:\${key}\`;
// SET with NX + PX acquires the lock atomically with a TTL
// (plain SETNX cannot set an expiry in the same command)
const acquired = await redis.set(lockKey, 'locked', 'PX', 5000, 'NX');
`
},
probabilistic: {
how: 'Recompute before hard expiry',
formula: 'probability = min(0.5, (elapsed/ttl) * 0.1)'
},
staleWhileRevalidate: {
how: 'Serve stale value while async refreshing',
code: `
if (Date.now() - metadata.created > ttl) {
setImmediate(() => fetcher().then(data => cache.set(key, data)));
}
return value;
`
}
};
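A single-process variant of the mutex idea: concurrent misses for the same key share one in-flight promise, so the expensive loader runs exactly once per miss:

```javascript
// Coalesces concurrent requests for the same key onto one loader call.
function createCoalescer(loader) {
  const inFlight = new Map(); // key -> pending promise
  return function get(key) {
    if (inFlight.has(key)) return inFlight.get(key);
    const promise = loader(key).finally(() => inFlight.delete(key));
    inFlight.set(key, promise);
    return promise;
  };
}
```

For multiple instances this needs the Redis lock shown above, but in-process coalescing alone already removes most duplicate work after a cache expiry.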
7. Theory: Network Optimization
const tcpOptimization = {
connectionCost: {
'DNS lookup': '20-100ms',
'TCP handshake': '~50ms',
'TLS handshake': '~150ms',
'Total cost': 'Up to 300ms before data transfer'
},
keepAlive: {
what: 'Reuse TCP connections',
impact: 'Eliminates repeated handshake overhead',
code: `
const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });
const request = http.request({ agent, host: 'api.example.com' });
`
},
http2: {
features: ['Multiplexing', 'Server push', 'Header compression (HPACK)', 'Binary protocol'],
impact: 'Up to 50% improvement for complex pages'
},
socketOptions: {
noDelay: 'Disable Nagle algorithm',
keepAlive: 'Detect dead connections',
timeout: 'Set idle timeout',
code: `
socket.setNoDelay(true);
socket.setKeepAlive(true, 60000);
socket.setTimeout(30000);
`
}
};
const compressionTheory = {
tradeoff: {
'Without compression': 'Less CPU, more bandwidth',
'With compression': 'More CPU, less bandwidth'
},
algorithms: {
gzip: { compression: '70-80%', speed: 'Medium' },
brotli: { compression: '75-85%', speed: 'Slow' },
deflate: { compression: '70-75%', speed: 'Fast' }
},
decisionMatrix: {
compress: ['Text responses', 'Responses > 1KB', 'Slow clients'],
skipCompression: ['Images/video/audio', 'Very small responses']
},
strategies: {
dynamic: 'Compress on each request (CPU expensive)',
preCompressed: 'Serve prebuilt .gz/.br assets',
hybrid: 'Cache compressed payload in Redis'
}
};
8. Best Practices & Anti-patterns
const antiPatterns = {
syncOperations: {
problem: 'Sync fs/crypto in request handler',
impact: 'Blocks event loop',
solution: 'Use async APIs or worker threads'
},
unboundedCache: {
problem: 'Unlimited Map/Array growth',
impact: 'OOM crashes',
solution: 'Use LRU with max size'
},
nPlusOne: {
problem: 'Queries inside loops',
impact: 'O(n²) DB operations',
solution: 'Batch queries / JOIN'
},
largeSerialization: {
problem: 'JSON.stringify on huge objects',
impact: 'CPU spike and event loop block',
solution: 'Use streaming and pagination'
},
promiseFlood: {
problem: 'Promise.all on huge lists',
impact: 'Resource exhaustion',
solution: 'Batch with concurrency limits'
},
excessiveLogging: {
problem: 'Verbose logging in hot loops',
impact: 'I/O blocking and disk pressure',
solution: 'Structured logging with levels and sampling'
}
};
const performanceChecklist = {
architecture: [
'✓ Use cluster mode or PM2 for multi-core',
'✓ Implement read-replicas for database',
'✓ Add Redis for session/cache',
'✓ Set up CDN for static assets',
'✓ Use message queue for background jobs'
],
codeOptimization: [
'✓ Avoid synchronous functions',
'✓ Use streams for large data',
'✓ Implement pagination for lists',
'✓ Batch database operations',
'✓ Use connection pooling',
'✓ Cache expensive computations',
'✓ Compress HTTP responses'
],
memoryManagement: [
'✓ Set reasonable cache TTLs',
'✓ Implement LRU for caches',
'✓ Clear intervals/timeouts properly',
'✓ Remove event listeners when done',
'✓ Monitor memory usage in production'
],
database: [
'✓ Add indexes for WHERE/ORDER BY',
'✓ Use EXPLAIN to analyze queries',
'✓ Avoid SELECT *',
'✓ Implement connection pooling',
'✓ Set statement timeouts',
'✓ Use batch inserts'
],
productionConfig: [
'✓ NODE_ENV=production',
'✓ Enable gzip/brotli',
'✓ Tune Node memory limits',
'✓ Use PM2/cluster',
'✓ Configure reverse proxy',
'✓ Implement health checks'
],
monitoring: [
'✓ Track event loop lag',
'✓ Monitor memory usage trends',
'✓ Set up APM',
'✓ Log slow queries (>100ms)',
'✓ Alert on error rate spikes'
]
};
const performanceBudgets = {
api: { p50: '< 50ms', p95: '< 200ms', p99: '< 500ms', timeout: '30s' },
throughput: { development: '> 100 RPS', staging: '> 1000 RPS', production: '> 5000 RPS' },
resources: { memory_heap: '< 512MB', memory_rss: '< 1GB', cpu_usage: '< 70%', event_loop_lag: '< 50ms' },
database: { query_time: '< 100ms', connection_pool: '10-50', replica_lag: '< 1s' }
};
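The event_loop_lag budget above can be measured with a simple timer-drift probe; Node also ships `perf_hooks.monitorEventLoopDelay()` when a proper histogram is needed. A sketch:

```javascript
// Schedules a 0ms timer and reports how late it actually fires; the
// difference is the event loop lag at that moment.
function measureEventLoopLag() {
  return new Promise(resolve => {
    const due = Date.now();
    setTimeout(() => resolve(Date.now() - due), 0);
  });
}

// In production this would run on an interval and feed a metric that
// alerts when p95 lag exceeds the budget (e.g. 50ms).
measureEventLoopLag().then(lagMs => console.log(`lag ~${lagMs}ms`));
```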
# Cluster mode
node cluster.js
# PM2 commands
pm2 start app.js -i max
pm2 scale app 4
pm2 reload app
# Increase memory
node --max-old-space-size=4096 app.js
# Enable GC logging
node --trace-gc app.js
# CPU profiling
node --prof app.js
node --prof-process isolate-0x*.log > profile.txt
# Heap snapshot
node --inspect app.js
# Load testing
npx autocannon -c 100 -d 10 http://localhost:3000
10 Interview Questions + 10 MCQs
1. Why is Node.js efficient for I/O-bound workloads? (easy)
Answer: It uses non-blocking I/O with an event loop, so requests do not occupy threads while waiting for I/O.
2. What happens when the event loop is blocked? (easy)
Answer: All requests and async callbacks are delayed, increasing latency for every client.
3. When should you increase `UV_THREADPOOL_SIZE`? (medium)
Answer: When thread-pool-backed work (fs/crypto/zlib/dns) is saturated and measurable queueing occurs.
4. What is the N+1 query problem? (medium)
Answer: Fetching N records then issuing another query per record, causing excessive DB round trips.
5. How does cache-aside work? (easy)
Answer: Read from cache first; on miss, fetch from DB and populate cache.
6. Why use read replicas? (medium)
Answer: To offload read traffic from the primary DB and improve read throughput.
7. What is cache stampede? (medium)
Answer: Many concurrent misses for the same key overwhelm origin systems after expiration.
8. When should sticky sessions be used? (hard)
Answer: When session state is in-process and not externally shared.
9. Why avoid synchronous APIs in request paths? (easy)
Answer: They block the event loop and degrade throughput/latency.
10. What metric indicates event loop pressure? (hard)
Answer: Event loop lag (and its p95/p99 trend) is a direct pressure indicator.
10 Performance & Scaling MCQs
1. Node.js excels at:
Explanation: Event-loop + async I/O is ideal for I/O-heavy workloads.
2. Which blocks the event loop?
Explanation: Sync file I/O blocks the main thread.
3. Default libuv thread pool size:
Explanation: Node defaults to 4 threads for threadpool-backed tasks.
4. N+1 issue is primarily:
Explanation: It causes excessive query counts and latency.
5. Cache-aside means:
Explanation: Read-through manually via cache then source.
6. Best anti-stampede strategy:
Explanation: These strategies smooth recomputation load.
7. Read replicas help with:
Explanation: Replicas offload read traffic from primary.
8. Good pool-size tuning is based on:
Explanation: Tune by system feedback, not guesses.
9. Which is a memory leak pattern?
Explanation: No eviction policy causes unlimited growth.
10. `keepAlive` mainly reduces:
Explanation: It reuses TCP connections across requests.