Complete Node.js Buffers & Streams Tutorial
A comprehensive guide to handling binary data and efficient I/O operations in Node.js.
Table of Contents
1. What are Buffers?
2. Working with Buffers
3. What are Streams?
4. Types of Streams
5. Using Streams Effectively
6. Piping & Error Handling
7. Practical Examples
8. Best Practices
1. What are Buffers?
The Problem Buffers Solve
JavaScript was designed for text, not binary data. When Node.js needs to handle binary data (files, network packets, images), it uses Buffers as temporary raw binary storage.
// Without Buffer - can't handle binary data well
const text = "Hello"; // String is fine for text
// With Buffer - handles any binary data
const binaryData = Buffer.from([0x48, 0x65, 0x6c, 0x6c, 0x6f]); // 'Hello' in hex
What is a Buffer?
A Buffer is a fixed-size chunk of memory allocated outside the V8 JavaScript engine. Think of it as a raw binary container.
// Creating buffers
const buf1 = Buffer.from('Hello World'); // From string
const buf2 = Buffer.alloc(10); // Empty buffer (10 bytes)
const buf3 = Buffer.from([0x48, 0x65]); // From bytes array
console.log(buf1); // <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>
console.log(buf1.toString()); // 'Hello World'
Why Buffers Matter
// Without Buffer - memory inefficient for large data
const largeData = "x".repeat(1000000); // Takes 1MB of JavaScript heap
// With Buffer - more efficient memory management
const efficientBuffer = Buffer.alloc(1000000, 'x'); // Direct memory allocation
- Fixed size (cannot be resized)
- Raw memory allocation (faster)
- Binary data support (images, PDFs, etc.)
- Encoding/decoding capabilities
2. Working with Buffers
Creating Buffers
// Method 1: From string (most common)
const strBuffer = Buffer.from('Node.js');
console.log(strBuffer.length); // 7 (bytes, not characters!)
// Method 2: Empty buffer with size
const emptyBuffer = Buffer.alloc(100); // Safe, initialized to zero
// Method 3: Unsafe but faster (may contain old data)
const unsafeBuffer = Buffer.allocUnsafe(100); // Faster but use carefully
Reading and Writing
const buffer = Buffer.from('ABCDEF');
// Reading
console.log(buffer[0]); // 65 (ASCII code for 'A')
console.log(buffer.toString()); // 'ABCDEF'
console.log(buffer.toString('utf8', 1, 4)); // 'BCD'
console.log(buffer.toString('base64')); // QUJDREVG
// Writing
buffer[0] = 90; // Change 'A' to 'Z'
console.log(buffer.toString()); // 'ZBCDEF'
buffer.write('Hello', 2); // only 4 of the 5 bytes fit in the 6-byte buffer
console.log(buffer.toString()); // 'ZBHell' (write() truncates at the buffer's end)
Common Buffer Operations
// Combining buffers
const hello = Buffer.from('Hello ');
const world = Buffer.from('World');
const combined = Buffer.concat([hello, world]);
console.log(combined.toString()); // 'Hello World'
// Copying buffers
const source = Buffer.from('Source');
const target = Buffer.alloc(10);
source.copy(target, 0);
console.log(target.toString()); // 'Source'
// Comparing buffers
const bufA = Buffer.from('Apple');
const bufB = Buffer.from('Apple');
console.log(bufA.equals(bufB)); // true
// Slicing creates a view over shared memory, not a copy!
const original = Buffer.from('Hello World');
const slice = original.subarray(0, 5); // slice() is a deprecated alias of subarray()
slice[0] = 74; // Changes 'H' to 'J'
console.log(original.toString()); // 'Jello World' (original affected!)
Buffer Encodings
const text = 'Hello 世界';
// Different encodings produce different byte lengths
console.log(Buffer.from(text, 'utf8').length); // 12 bytes
console.log(Buffer.from(text, 'ascii').length); // 8 bytes (loses non-ascii)
console.log(Buffer.from(text, 'utf16le').length); // 16 bytes (2 per UTF-16 code unit)
const encodings = {
utf8: 'Human-readable text',
base64: 'Binary data as ASCII (email attachments)',
hex: 'Binary as hexadecimal string',
ascii: 'English-only text, 1 byte per char'
};
3. What are Streams?
The Problem Streams Solve
Without streams, big files can crash apps by exhausting memory. Streams process data chunk-by-chunk.
// BAD: loads full file into memory
const fs = require('fs');
const data = fs.readFileSync('huge-file.mp4');
// GOOD: stream chunk-by-chunk
const stream = fs.createReadStream('huge-file.mp4');
stream.on('data', (chunk) => {
console.log(`Processing ${chunk.length} bytes`);
});
What are Streams?
A stream is an interface for data that arrives and is processed piece by piece over time, rather than all at once. A minimal readable stream:
const { Readable } = require('stream');
const simpleStream = new Readable({
read() {
this.push('Chunk 1\n');
this.push('Chunk 2\n');
this.push(null); // End of stream
}
});
simpleStream.on('data', (chunk) => {
console.log('Received:', chunk.toString());
});
Why Use Streams?
| Benefit | Explanation |
|---|---|
| Memory Efficiency | Process chunks, not entire payload |
| Time Efficiency | Start processing before full data arrives |
| Composability | Pipe streams together |
| Backpressure Handling | Manage producer/consumer speed mismatch |
4. Types of Streams
Stream Types Overview
const fs = require('fs');
const readable = fs.createReadStream('file.txt'); // Readable
const writable = fs.createWriteStream('output.txt'); // Writable
const net = require('net');
const socket = net.createConnection(80); // Duplex (a TCP socket is both readable and writable)
const { Transform } = require('stream');
const upperCase = new Transform({
transform(chunk, encoding, callback) {
callback(null, chunk.toString().toUpperCase());
}
});
Readable Streams
const { Readable } = require('stream');
class CounterStream extends Readable {
constructor(max = 10) {
super();
this.current = 1;
this.max = max;
}
_read() {
if (this.current <= this.max) {
this.push(`${this.current}\n`);
this.current++;
} else {
this.push(null);
}
}
}
Writable Streams
const { Writable } = require('stream');
class LoggerStream extends Writable {
_write(chunk, encoding, callback) {
const timestamp = new Date().toISOString();
console.log(`[${timestamp}] ${chunk.toString()}`);
callback();
}
}
Transform Streams + Events
const { Transform, Readable } = require('stream');
const uppercase = new Transform({
transform(chunk, encoding, callback) {
callback(null, chunk.toString().toUpperCase());
}
});
const source = Readable.from(['hello', 'world', 'streams']);
source.pipe(uppercase).pipe(process.stdout);
// Important events: data, end, error, close, finish, drain
5. Using Streams Effectively
Stream Modes
const fs = require('fs');
// Flowing mode
const stream = fs.createReadStream('file.txt');
stream.on('data', (chunk) => console.log(chunk));
// Paused mode
const paused = fs.createReadStream('file.txt');
paused.on('readable', () => {
let chunk;
while ((chunk = paused.read()) !== null) {
console.log(chunk);
}
});
// Switching modes explicitly
stream.pause();  // stop 'data' events (paused mode)
stream.resume(); // start them again (flowing mode)
Backpressure
const fastRead = fs.createReadStream('large.txt');
const slowWrite = fs.createWriteStream('output.txt', { highWaterMark: 1024 });
fastRead.on('data', (chunk) => {
if (!slowWrite.write(chunk)) {
fastRead.pause();
slowWrite.once('drain', () => fastRead.resume());
}
});
// Better: let pipe handle it
fastRead.pipe(slowWrite);
Object Mode
const { Readable, Transform } = require('stream');
const objectStream = new Readable({
objectMode: true,
read() {
this.push({ id: 1, name: 'Alice' });
this.push({ id: 2, name: 'Bob' });
this.push(null);
}
});
const filter = new Transform({
objectMode: true,
transform(obj, encoding, callback) {
if (obj.id >= 2) callback(null, obj);
else callback();
}
});
6. Piping & Error Handling
pipe() Basics
source.pipe(destination);
source
.pipe(transform1)
.pipe(transform2)
.pipe(destination);
Safe Error Handling with pipeline()
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream');
pipeline(
fs.createReadStream('input.txt'),
zlib.createGzip(),
fs.createWriteStream('output.gz'),
(err) => {
if (err) console.error('Pipeline failed:', err);
else console.log('Pipeline succeeded');
}
);
Promise-based pipeline()
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream/promises');
async function runPipeline() {
try {
await pipeline(
fs.createReadStream('input.txt'),
zlib.createGzip(),
fs.createWriteStream('output.gz')
);
console.log('Success!');
} catch (err) {
console.error('Failed:', err);
}
}
7. Practical Examples
Example 1: File Copy with Progress
const fs = require('fs');
const { PassThrough } = require('stream');
const { pipeline } = require('stream/promises');
async function copyWithProgress(source, destination) {
const stats = await fs.promises.stat(source);
const totalSize = stats.size;
let copiedSize = 0;
const monitor = new PassThrough();
monitor.on('data', (chunk) => {
copiedSize += chunk.length;
const percent = (copiedSize / totalSize * 100).toFixed(1);
process.stdout.write(`\rProgress: ${percent}%`);
});
await pipeline(
fs.createReadStream(source),
monitor,
fs.createWriteStream(destination)
);
console.log('\nCopy complete!');
}
Example 2: Log Analyzer
const { Transform } = require('stream');
class LogAnalyzer extends Transform {
constructor() {
super({ objectMode: true });
this.errorCount = 0;
this.warningCount = 0;
this.infoCount = 0;
}
_transform(line, encoding, callback) {
if (line.includes('ERROR')) this.errorCount++;
else if (line.includes('WARN')) this.warningCount++;
else if (line.includes('INFO')) this.infoCount++;
callback(null, line);
}
}
Example 3: CSV to JSON Converter
const { Transform } = require('stream');
class CSVParser extends Transform {
  constructor() {
    super({ readableObjectMode: true }); // bytes in, objects out
    this.headers = null;
    this.remainder = ''; // carries a partial line across chunk boundaries
  }
  _transform(chunk, encoding, callback) {
    const lines = (this.remainder + chunk.toString()).split('\n');
    this.remainder = lines.pop(); // the last element may be an incomplete line
    for (const line of lines) {
      this._parseLine(line);
    }
    callback();
  }
  _flush(callback) {
    this._parseLine(this.remainder); // handle the final line, if any
    callback();
  }
  _parseLine(line) {
    if (!line.trim()) return;
    const values = line.split(',');
    if (!this.headers) {
      this.headers = values;
    } else {
      const obj = {};
      this.headers.forEach((header, i) => obj[header.trim()] = values[i]?.trim());
      this.push(obj);
    }
  }
}
Example 4: Real-time Data Processing
const { Duplex } = require('stream');
class DataProcessor extends Duplex {
  constructor() {
    super({ objectMode: true });
    this.queue = [];
  }
  _write(chunk, encoding, callback) {
    this.queue.push({
      original: chunk,
      timestamp: Date.now(),
      processed: chunk.value * 2
    });
    // Push from here too: if _read ran while the queue was empty,
    // Node won't call it again until something is pushed
    this._drainQueue();
    callback();
  }
  _read() {
    this._drainQueue();
  }
  _drainQueue() {
    while (this.queue.length > 0) {
      if (!this.push(this.queue.shift())) break;
    }
  }
}
8. Best Practices
DO's and DON'Ts
// DO: Use streams for large files
const readStream = fs.createReadStream('huge-file.mp4');
// DON'T: Load entire file into memory
// const data = fs.readFileSync('huge-file.mp4');
// DO: Use pipeline() for robust error handling
// await pipeline(source, transform, dest);
// DO: Set highWaterMark appropriately
const stream = fs.createReadStream('file.txt', { highWaterMark: 1024 * 1024 });
// DO: Always handle errors
stream.on('error', (err) => console.error('Stream error:', err));
Performance Tips
// 1) Good chunk size baseline
const optimalChunkSize = 64 * 1024; // 64KB
// 2) Async iteration (processChunk is a placeholder for your own async handler)
async function processStream(stream) {
for await (const chunk of stream) {
await processChunk(chunk);
}
}
// 3) Combine transforms where practical
// 4) Respect backpressure via pipe() or callback flow in _write()
Common Pitfalls
| Pitfall | Solution |
|---|---|
| Not handling errors | Use pipeline() or explicit handlers |
| Unclosed streams | Call end() / destroy() |
| Blocking transforms | Avoid heavy sync work in _transform |
| Ignoring backpressure | Use pipe() or drain-aware flow |
| Mixed stream modes | Don't attach both 'data' and 'readable' handlers to the same stream |
Quick Reference Cheatsheet
// Creating streams
fs.createReadStream(path, options);
fs.createWriteStream(path, options);
new Readable(); new Writable(); new Duplex(); new Transform();
// Reading
stream.on('data', callback);
stream.on('readable', callback);
// for await (const chunk of stream) {}
// Writing
stream.write(data);
stream.end([data]);
// Connecting
stream.pipe(destination);
// await pipeline(stream1, stream2, dest);
// States/events
stream.pause(); stream.resume(); stream.destroy();
// data, end, error, close, finish, drain
When to Use Streams vs Buffers
| Use Case | Recommended |
|---|---|
| Small files (< 100KB) | Buffer (simpler) |
| Large files (> 10MB) | Stream (memory efficient) |
| Real-time data | Stream |
| Network communication | Stream |
| Binary manipulation | Buffer |
| Simple text operations | Buffer/String |
Summary
Buffers are for raw binary manipulation in memory. Streams are for chunked, efficient processing of large or continuous data.
const { Transform } = require('stream');
new Transform({
transform(chunk, encoding, callback) {
// chunk is a Buffer
const modified = Buffer.concat([chunk, Buffer.from(' processed')]);
callback(null, modified);
}
});
- Small data: use Buffers directly
- Large data: use Streams
- Complex pipelines: combine both
10 Interview Questions + 10 MCQs
Interview Q&A (10)
1. What is a Buffer in Node.js? (easy)
Answer: A fixed-size binary memory region used to handle raw bytes outside the V8 heap.
2. Why are streams preferred for large files? (easy)
Answer: They process data in chunks, reducing memory usage and improving scalability.
3. What does Buffer.allocUnsafe() risk? (medium)
Answer: It may contain stale memory content until overwritten.
4. What is backpressure? (medium)
Answer: The condition where the writable side is slower than the readable side, causing internal buffers to build up.
5. How does pipe() help? (easy)
Answer: It connects streams and handles flow control/backpressure automatically.
6. What is the difference between Duplex and Transform streams? (medium)
Answer: A Duplex reads and writes independently; a Transform is a Duplex whose output is derived from its input.
7. When should you use object mode? (medium)
Answer: When stream chunks are JavaScript objects instead of Buffers/strings.
8. Why prefer pipeline() over bare pipe() for production? (hard)
Answer: It centralizes error propagation and cleanup across chained streams.
9. What is a common pitfall of Buffer.slice()? (hard)
Answer: It creates a view (shared memory), so editing the slice can mutate the original buffer.
10. When is a Buffer better than a Stream? (medium)
Answer: For small binary payloads where full in-memory manipulation is simpler.
10 Buffers & Streams MCQs
1. Which method safely allocates a zero-filled buffer?
Explanation: Buffer.alloc() initializes memory to zero for safety.
2. What is the primary benefit of streams for huge files?
Explanation: Streams avoid loading the full data into RAM.
3. Which stream type modifies data in transit?
Explanation: Transform streams change chunks as they pass through.
4. What does a writable's write() returning false indicate?
Explanation: Pause the source until 'drain' fires.
5. Which API is best for chained stream error handling?
Explanation: pipeline() handles propagation and cleanup better.
6. Which is true about Buffer.slice()?
Explanation: Mutating the slice can affect the original buffer.
7. Which encoding is the default for text buffers?
Explanation: UTF-8 is the standard for text in Node.js.
8. When should objectMode be enabled?
Explanation: objectMode allows chunk units to be objects.
9. What is most suitable for simple operations on tiny payloads?
Explanation: Buffers are simpler for small in-memory tasks.
10. What is a stream that is both Readable and Writable called?
Explanation: Duplex supports both read and write capabilities.