Back/Module P-13 JSON Internals, Serialization, and Schema Validation
Module P-13·20 min read

JSON.parse/stringify edge cases, streaming JSON for large payloads, JSON Schema with Ajv, MessagePack as a binary alternative, and the serialization cost at scale.

Module P-13 — JSON Internals, Serialization, and Schema Validation

What this module covers: JSON is everywhere in Node.js — request bodies, response payloads, configuration files, inter-service communication. Most developers treat it as a black box and pay for it later with silent data loss, unexpected type coercions, and performance problems at scale. This module covers what JSON.parse and JSON.stringify actually do under the hood, the edge cases that corrupt data silently, streaming JSON for payloads too large to hold in memory, fast serialization with fast-json-stringify, JSON Schema validation with Ajv, and when MessagePack is worth the switch.


What JSON.parse and JSON.stringify Actually Do

JSON is a text format. JSON.parse converts a string into a JavaScript object. JSON.stringify does the reverse. The key word is string — everything must pass through text.

This matters because JavaScript has types that JSON doesn't:

JavaScript typeJSON resultWhat happens
undefinedomittedproperty disappears
NaN"null"silently becomes null
Infinity"null"silently becomes null
Date"2024-05-22T14:32:01.123Z"becomes a string, not a Date on parse
BigIntthrows TypeErrorcrashes your process
Map, Set{}serialised as empty objects
Symbolomittedproperty disappears
Circular referencethrows TypeErrorcrashes your process
typescript
// Silent data loss const obj = { value: undefined, // omitted notANumber: NaN, // → null infinite: Infinity, // → null }; JSON.stringify(obj); // → '{"notANumber":null,"infinite":null}' // 'value' is gone entirely // Date loses its type const date = new Date('2024-05-22'); const roundTripped = JSON.parse(JSON.stringify({ date })); typeof roundTripped.date; // 'string', not 'object' roundTripped.date instanceof Date; // false // BigInt crashes JSON.stringify({ id: 9007199254740993n }); // TypeError: Do not know how to serialize a BigInt // Circular reference crashes const a: any = {}; const b: any = { a }; a.b = b; JSON.stringify(a); // TypeError: Converting circular structure to JSON

Safe Patterns

Handling BigInt

BigInt appears frequently when working with PostgreSQL's BIGSERIAL primary keys or blockchain IDs that exceed Number.MAX_SAFE_INTEGER (2⁵³ - 1 ≈ 9 quadrillion):

typescript
// Option 1: Serialise BigInt as string JSON.stringify({ id: 9007199254740993n }, (key, value) => typeof value === 'bigint' ? value.toString() : value, ); // → '{"id":"9007199254740993"}' // Option 2: Custom replacer — handles nested BigInts function jsonStringifyBigInt(value: unknown): string { return JSON.stringify(value, (_, v) => typeof v === 'bigint' ? v.toString() : v, ); } // Option 3: Override toJSON on a class class OrderId { constructor(public readonly value: bigint) {} toJSON() { return this.value.toString(); } }

Handling Dates

JSON.stringify serialises Dates as ISO strings. JSON.parse gives you strings back. To restore Dates you need a reviver:

typescript
function parseWithDates(json: string): unknown { const ISO_DATE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/; return JSON.parse(json, (key, value) => { if (typeof value === 'string' && ISO_DATE.test(value)) { return new Date(value); } return value; }); } const obj = parseWithDates('{"createdAt":"2024-05-22T14:32:01.123Z"}'); obj.createdAt instanceof Date; // true

Detecting circular references

typescript
function safeStringify(value: unknown): string | null { const seen = new WeakSet(); try { return JSON.stringify(value, (key, val) => { if (typeof val === 'object' && val !== null) { if (seen.has(val)) return '[Circular]'; seen.add(val); } return val; }); } catch { return null; } }

The replacer and reviver Parameters

JSON.stringify(value, replacer, space)replacer controls what gets serialised:

typescript
const user = { id: 1, name: 'Jatin', passwordHash: '$2b$12$xxx', email: 'j@example.com', internalFlag: true, }; // Array replacer — whitelist fields JSON.stringify(user, ['id', 'name', 'email']); // → '{"id":1,"name":"Jatin","email":"j@example.com"}' // passwordHash and internalFlag never leave the server // Function replacer — transform values JSON.stringify(user, (key, value) => { if (key === 'passwordHash') return undefined; // omit return value; });

JSON.parse(text, reviver)reviver transforms values after parsing:

typescript
// Convert snake_case API responses to camelCase function camelizeReviver(key: string, value: unknown) { if (typeof value === 'object' && value !== null && !Array.isArray(value)) { return Object.fromEntries( Object.entries(value as object).map(([k, v]) => [ k.replace(/_([a-z])/g, (_, c) => c.toUpperCase()), v, ]), ); } return value; } JSON.parse('{"user_id":1,"created_at":"2024-01-01"}', camelizeReviver); // → { userId: 1, createdAt: '2024-01-01' }

JSON.stringify Performance at Scale

At low volume, JSON.stringify is invisible. At 10,000 requests/second returning 50-field objects, it becomes measurable.

typescript
// Benchmark comparison (approximate — varies by payload size) // JSON.stringify (built-in): ~200ns per call for a simple object // fast-json-stringify (precompiled): ~50ns per call — 4× faster

fast-json-stringify pre-compiles a serialiser from a JSON Schema — it knows the shape of the output ahead of time and skips runtime introspection:

bash
npm install fast-json-stringify
typescript
import fastJson from 'fast-json-stringify'; const serializeUser = fastJson({ type: 'object', properties: { id: { type: 'integer' }, name: { type: 'string' }, email: { type: 'string' }, role: { type: 'string', enum: ['user', 'admin'] }, createdAt: { type: 'string' }, }, required: ['id', 'name', 'email', 'role', 'createdAt'], }); // In your controller — faster than JSON.stringify for high-throughput routes res.setHeader('Content-Type', 'application/json'); res.end(serializeUser(user));

Use this on high-traffic endpoints where you know the exact response shape. Not worth the complexity for infrequently called routes.


Streaming JSON for Large Payloads

Loading a 100,000-row query result into memory, calling JSON.stringify, and writing the result is a memory spike waiting to happen. Streaming processes rows as they arrive from the database:

bash
npm install JSONStream
typescript
// src/controllers/export.controller.ts import { asyncHandler } from '../utils/asyncHandler.js'; import JSONStream from 'JSONStream'; import { pipeline } from 'stream/promises'; export const exportUsers = asyncHandler(async (req, res) => { res.setHeader('Content-Type', 'application/json'); res.setHeader('Content-Disposition', 'attachment; filename="users.json"'); // Prisma cursor-based streaming const userStream = await prisma.user.findManyStream({ select: { id: true, name: true, email: true, role: true, createdAt: true }, orderBy: { id: 'asc' }, }); // Wrap individual objects into a JSON array, streamed const jsonStringifier = JSONStream.stringify('[', ',', ']'); await pipeline(userStream, jsonStringifier, res); // Memory usage: constant regardless of result size // Without streaming: O(n) — entire result set in RAM });

For Postgres with raw SQL, use pg-query-stream:

bash
npm install pg-query-stream
typescript
import QueryStream from 'pg-query-stream'; import { pool } from '../db/pool.js'; import JSONStream from 'JSONStream'; import { pipeline } from 'stream/promises'; export const exportOrders = asyncHandler(async (req, res) => { res.setHeader('Content-Type', 'application/json'); const client = await pool.connect(); try { const query = new QueryStream( 'SELECT id, user_id, total, status, created_at FROM orders ORDER BY id', [], { batchSize: 100 }, ); const stream = client.query(query); await pipeline(stream, JSONStream.stringify('[', ',', ']'), res); } finally { client.release(); } });

JSON Schema Validation with Ajv

Ajv (Another JSON Schema validator) is the fastest JSON Schema validator for Node.js. JSON Schema is the standard for describing the structure of JSON data — more portable than Zod (which is TypeScript-specific) and used widely in OpenAPI and form validation.

bash
npm install ajv ajv-formats
typescript
import Ajv from 'ajv'; import addFormats from 'ajv-formats'; const ajv = new Ajv({ allErrors: true }); // collect all errors, not just the first addFormats(ajv); // adds 'email', 'date-time', 'uri', 'uuid' formats const createUserSchema = { type: 'object', properties: { name: { type: 'string', minLength: 1, maxLength: 100 }, email: { type: 'string', format: 'email' }, password: { type: 'string', minLength: 8 }, role: { type: 'string', enum: ['user', 'admin'], default: 'user' }, }, required: ['name', 'email', 'password'], additionalProperties: false, // reject unknown fields }; // Compile once — reuse the compiled validator const validateCreateUser = ajv.compile(createUserSchema); // Use in middleware or directly function validate(schema: object) { const validate = ajv.compile(schema); return (req, res, next) => { if (validate(req.body)) return next(); const errors = validate.errors!.map(err => ({ field: err.instancePath.slice(1) || err.params?.missingProperty, message: err.message, })); res.status(400).json({ error: 'Validation failed', issues: errors }); }; }

Ajv vs Zod: Zod infers TypeScript types and is designed for TypeScript projects. Ajv follows the JSON Schema standard and is faster — it compiles schemas to optimised validation functions. For Node.js APIs already using Zod (as in P-4), stick with Zod. Ajv shines when you need JSON Schema portability (shared schemas between frontend/backend) or raw performance.


MessagePack: Binary Alternative to JSON

MessagePack is a binary serialisation format that is 20–50% smaller than JSON and faster to serialise/deserialise. It supports types that JSON doesn't: integers, floats, binary data, timestamps.

When to consider it:

  • High-frequency inter-service communication (microservices)
  • Real-time applications where payload size affects latency
  • Storing structured data in Redis (smaller footprint)
bash
npm install @msgpack/msgpack
typescript
import { encode, decode } from '@msgpack/msgpack'; // Serialise const data = { userId: 42, timestamp: new Date(), values: [1, 2, 3] }; const packed = encode(data); // Uint8Array — binary, smaller than JSON string // Deserialise const unpacked = decode(packed) as typeof data; // Size comparison (approximate) const json = JSON.stringify(data); console.log('JSON size:', Buffer.byteLength(json), 'bytes'); console.log('MessagePack size:', packed.byteLength, 'bytes'); // JSON: ~60 bytes // MessagePack: ~40 bytes — ~33% smaller

MessagePack in Express

typescript
// Custom middleware to accept and return MessagePack app.use((req, res, next) => { if (req.headers['content-type'] === 'application/msgpack') { const chunks: Buffer[] = []; req.on('data', chunk => chunks.push(chunk)); req.on('end', () => { req.body = decode(Buffer.concat(chunks)); next(); }); } else { next(); } }); // res.msgpack() helper res.msgpack = (data: unknown) => { res.setHeader('Content-Type', 'application/msgpack'); res.end(encode(data)); };

For most REST APIs serving browsers, JSON is the right choice — browsers speak JSON natively, the bandwidth difference is negligible, and debuggability is far better. MessagePack is the right choice for Node-to-Node communication where you control both ends and need to squeeze out latency.


The JSON.parse Cost in Hot Paths

JSON.parse parses the entire string synchronously. On the main thread. A 1MB request body takes ~5ms. At 1,000 req/s, that's 5 seconds of JSON parsing per second on one thread.

Strategies for hot paths:

  1. Validate size before parsing — reject oversized bodies early (express.json({ limit: '10kb' }))
  2. Use fast-json-stringify for serialisation on high-throughput routes
  3. Cache parsed results when the same JSON is parsed repeatedly (e.g., static configuration)
  4. Move heavy parsing to a worker thread for CPU-intensive transformations
typescript
// Worker thread for large JSON transformation import { Worker, isMainThread, parentPort, workerData } from 'worker_threads'; if (!isMainThread) { // Runs in the worker thread const result = expensiveTransformation(workerData.json); parentPort!.postMessage(result); } // In your API function parseInWorker(json: string): Promise<unknown> { return new Promise((resolve, reject) => { const worker = new Worker(__filename, { workerData: { json } }); worker.on('message', resolve); worker.on('error', reject); }); }

Summary

  • undefined, NaN, Infinity, BigInt, Date, Map, Set all behave unexpectedly with JSON.stringify. Know the conversion table.
  • BigInt throws; serialise it as a string. Dates become strings; use a reviver to restore them. Circular references throw; use a replacer with a WeakSet.
  • replacer as an array is the cleanest way to whitelist response fields — prevents sensitive data leaking from the serialisation layer.
  • fast-json-stringify pre-compiles serialisers from schemas — 4× faster than built-in JSON.stringify for high-throughput, fixed-shape responses.
  • Stream JSON for large exports using JSONStream.stringify piped through Prisma/pg stream results — constant memory regardless of result set size.
  • Ajv is fastest for JSON Schema validation; portable across languages and tools. Use it when you need JSON Schema portability. Use Zod when you want TypeScript inference.
  • MessagePack is 20–50% smaller and faster than JSON. Worth it for Node-to-Node communication. Not worth it for browser-facing APIs.

Next: Dockerizing Node.js applications for production — multi-stage builds, Alpine vs slim images, non-root users, health checks, and the production readiness checklist for containerised deployments.

Discussion