Module P-13 · 20 min read

JSON.parse/stringify edge cases, streaming JSON for large payloads, JSON Schema with Ajv, MessagePack as a binary alternative, and the serialization cost at scale.

Module P-13 — JSON Internals, Serialization, and Schema Validation

What this module covers: JSON is everywhere in Node.js — request bodies, response payloads, configuration files, inter-service communication. Most developers treat it as a black box and pay for it later with silent data loss, unexpected type coercions, and performance problems at scale. This module covers what JSON.parse and JSON.stringify actually do under the hood, the edge cases that corrupt data silently, streaming JSON for payloads too large to hold in memory, fast serialization with fast-json-stringify, JSON Schema validation with Ajv, and when MessagePack is worth the switch.

What JSON.parse and JSON.stringify Actually Do

JSON is a text format. JSON.parse converts a string into a JavaScript object. JSON.stringify does the reverse. The key word is string — everything must pass through text.

This matters because JavaScript has types that JSON doesn't:

JavaScript type	JSON result	What happens
`undefined`	omitted (`null` inside arrays)	property disappears
`NaN`	`null`	silently becomes null
`Infinity`	`null`	silently becomes null
`Date`	`"2024-05-22T14:32:01.123Z"`	becomes a string, not a Date on parse
`BigInt`	throws `TypeError`	crashes your process if uncaught
`Map`, `Set`	`{}`	serialised as empty objects
`Symbol`	omitted	property disappears
Circular reference	throws `TypeError`	crashes your process if uncaught

typescript
// Silent data loss
const obj = {
  value: undefined,     // omitted
  notANumber: NaN,      // → null
  infinite: Infinity,   // → null
};
JSON.stringify(obj);
// → '{"notANumber":null,"infinite":null}'
// 'value' is gone entirely

// Date loses its type
const date = new Date('2024-05-22');
const roundTripped = JSON.parse(JSON.stringify({ date }));
typeof roundTripped.date; // 'string', not 'object'
roundTripped.date instanceof Date; // false

// BigInt crashes
JSON.stringify({ id: 9007199254740993n });
// TypeError: Do not know how to serialize a BigInt

// Circular reference crashes
const a: any = {};
const b: any = { a };
a.b = b;
JSON.stringify(a);
// TypeError: Converting circular structure to JSON
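
The table also lists Map and Set, which the snippet above does not exercise. Here is a minimal sketch of that behaviour and one possible workaround, assuming a plain object or array is an acceptable wire representation. The `collectionReplacer` name is made up for this example.

```typescript
// Map and Set have no enumerable own properties, so they serialise as {}
const settings = new Map<string, number>([['retries', 3]]);
const tags = new Set<string>(['a', 'b']);

const naive = JSON.stringify({ settings, tags });
// naive === '{"settings":{},"tags":{}}'

// Hypothetical helper: convert a Map to a plain object and a Set to an array.
// Assumes Map keys are strings; other key types need a different encoding.
function collectionReplacer(_key: string, value: unknown): unknown {
  if (value instanceof Map) return Object.fromEntries(value);
  if (value instanceof Set) return Array.from(value);
  return value;
}

const converted = JSON.stringify({ settings, tags }, collectionReplacer);
// converted === '{"settings":{"retries":3},"tags":["a","b"]}'
```

The conversion is one-way: after parsing you get a plain object and an array back, so the receiving side has to rebuild the Map or Set itself if it needs one.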

Safe Patterns

Handling BigInt

BigInt appears frequently when working with PostgreSQL's BIGSERIAL primary keys or blockchain IDs that exceed Number.MAX_SAFE_INTEGER (2⁵³ - 1 ≈ 9 quadrillion):

typescript
// Option 1: Inline replacer that serialises every BigInt as a string.
// The replacer runs for every value, so nested BigInts are handled too.
JSON.stringify({ id: 9007199254740993n }, (key, value) =>
  typeof value === 'bigint' ? value.toString() : value,
);
// → '{"id":"9007199254740993"}'

// Option 2: The same replacer wrapped in a reusable helper
function jsonStringifyBigInt(value: unknown): string {
  return JSON.stringify(value, (_, v) =>
    typeof v === 'bigint' ? v.toString() : v,
  );
}

// Option 3: Override toJSON on a class
class OrderId {
  constructor(public readonly value: bigint) {}
  toJSON() { return this.value.toString(); }
}
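
Serialising BigInt as a string is only half of the round trip: the parsing side has to turn the string back into a BigInt. A minimal sketch of a reviver that does this, assuming you know in advance which fields carry BigInt values. `BIGINT_FIELDS` and `parseWithBigInt` are hypothetical names for this example.

```typescript
// Why strings are needed at all: a bare JSON number above 2^53 - 1 loses precision
const lossy = JSON.parse('{"id":9007199254740993}');
// lossy.id === 9007199254740992 (off by one)

// Hypothetical list of fields known to carry BigInt values as decimal strings
const BIGINT_FIELDS = new Set(['id', 'orderId']);

function parseWithBigInt(json: string): unknown {
  return JSON.parse(json, (key, value) =>
    BIGINT_FIELDS.has(key) && typeof value === 'string' && /^-?\d+$/.test(value)
      ? BigInt(value)
      : value,
  );
}

const restored = parseWithBigInt('{"id":"9007199254740993","name":"42"}') as {
  id: bigint;
  name: string;
};
// restored.id is the BigInt 9007199254740993; restored.name stays the string "42"
```

Restricting the conversion to named fields avoids turning every numeric-looking string (postal codes, phone numbers) into a BigInt.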

Handling Dates

JSON.stringify serialises Dates as ISO strings. JSON.parse gives you strings back. To restore Dates you need a reviver:

typescript
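function parseWithDates(json: string): unknown {
  // Anchored at both ends so that only complete ISO 8601 timestamps match
  const ISO_DATE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

  return JSON.parse(json, (key, value) => {
    if (typeof value === 'string' && ISO_DATE.test(value)) {
      return new Date(value);
    }
    return value;
  });
}

// parseWithDates returns unknown, so narrow the type before accessing fields
const obj = parseWithDates('{"createdAt":"2024-05-22T14:32:01.123Z"}') as { createdAt: Date };
obj.createdAt instanceof Date; // true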
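
A pattern-based reviver converts every string that looks like a timestamp, including user-supplied text that merely happens to match. One possible tightening is to revive by field name instead. The sketch below assumes a naming convention in which timestamp fields end in `At`; that convention and the `parseDatesByKey` name are assumptions for this example.

```typescript
// Assumption for this sketch: timestamp fields are named like createdAt / updatedAt
const DATE_KEY = /At$/;

function parseDatesByKey(json: string): unknown {
  return JSON.parse(json, (key, value) => {
    if (typeof value !== 'string' || !DATE_KEY.test(key)) return value;
    const parsed = new Date(value);
    // Leave the original string in place if it is not a valid date
    return Number.isNaN(parsed.getTime()) ? value : parsed;
  });
}

const parsedEvent = parseDatesByKey(
  '{"createdAt":"2024-05-22T14:32:01.123Z","note":"2024-05-22T14:32:01.123Z"}',
) as { createdAt: Date; note: string };
// parsedEvent.createdAt is a Date; parsedEvent.note is still a string
```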

Detecting circular references

typescript
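function safeStringify(value: unknown): string | null {
  // Caveat: the WeakSet remembers every object seen so far, so an object that is
  // referenced twice without forming a cycle is also replaced with '[Circular]'.
  const seen = new WeakSet<object>();
  try {
    return JSON.stringify(value, (key, val) => {
      if (typeof val === 'object' && val !== null) {
        if (seen.has(val)) return '[Circular]';
        seen.add(val);
      }
      return val;
    });
  } catch {
    // Anything else JSON.stringify cannot handle (for example a BigInt) ends up here
    return null;
  }
}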
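
A WeakSet of every visited object cannot tell a true cycle from an object that is simply referenced twice. If that distinction matters, one option is to track only the current chain of ancestors. The sketch below follows a commonly used pattern and relies on JSON.stringify calling the replacer with `this` bound to the parent object; `circularReplacer` is a name chosen for this example.

```typescript
// Returns a replacer that flags only true cycles. JSON.stringify calls the
// replacer with `this` bound to the object that holds the current value.
function circularReplacer() {
  const ancestors: object[] = [];
  return function (this: unknown, _key: string, value: unknown): unknown {
    if (typeof value !== 'object' || value === null) return value;
    // Unwind the stack until its top is the current parent
    while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) {
      ancestors.pop();
    }
    if (ancestors.includes(value)) return '[Circular]';
    ancestors.push(value);
    return value;
  };
}

// A shared (but not circular) reference is serialised normally
const shared = { n: 1 };
const repeated = JSON.stringify({ a: shared, b: shared }, circularReplacer());
// repeated === '{"a":{"n":1},"b":{"n":1}}'

// A real cycle is replaced with a marker instead of throwing
const node: { self?: unknown } = {};
node.self = node;
const cyclic = JSON.stringify(node, circularReplacer());
// cyclic === '{"self":"[Circular]"}'
```

The replacer must be a regular function, not an arrow function, because it depends on `this`. Create a fresh replacer for every call, since the ancestor stack is stateful.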

The replacer and reviver Parameters

JSON.stringify(value, replacer, space) — replacer controls what gets serialised:

typescript
const user = {
  id: 1,
  name: 'Jatin',
  passwordHash: '$2b$12$xxx',
  email: 'j@example.com',
  internalFlag: true,
};

// Array replacer — whitelist fields
JSON.stringify(user, ['id', 'name', 'email']);
// → '{"id":1,"name":"Jatin","email":"j@example.com"}'
// passwordHash and internalFlag never leave the server

// Function replacer — transform values
JSON.stringify(user, (key, value) => {
  if (key === 'passwordHash') return undefined; // omit
  return value;
});
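
One detail of the array form is easy to miss: the whitelist applies at every nesting level, not only to the top-level object. A short sketch with a made-up `account` object:

```typescript
const account = {
  id: 1,
  name: 'Jatin',
  profile: { id: 7, bio: 'hello', name: 'public name' },
};

// 'profile' is missing from the list, so the whole nested object is dropped
const topLevelOnly = JSON.stringify(account, ['id', 'name']);
// topLevelOnly === '{"id":1,"name":"Jatin"}'

// Listing 'profile' keeps it, but its own keys are filtered by the same list:
// 'bio' is not listed and disappears, while 'id' and 'name' pass at both levels
const withProfile = JSON.stringify(account, ['id', 'name', 'profile']);
// withProfile === '{"id":1,"name":"Jatin","profile":{"id":7,"name":"public name"}}'
```

Because one list governs all levels, a key that is safe at the top level but sensitive in a nested object cannot be expressed with the array form; a function replacer or an explicit mapping to a response object is needed in that case.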

JSON.parse(text, reviver) — reviver transforms values after parsing:

typescript
// Convert snake_case API responses to camelCase
function camelizeReviver(key: string, value: unknown) {
  if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
    return Object.fromEntries(
      Object.entries(value as object).map(([k, v]) => [
        k.replace(/_([a-z])/g, (_, c) => c.toUpperCase()),
        v,
      ]),
    );
  }
  return value;
}

JSON.parse('{"user_id":1,"created_at":"2024-01-01"}', camelizeReviver);
// → { userId: 1, createdAt: '2024-01-01' }

JSON.stringify Performance at Scale

At low volume, JSON.stringify is invisible. At 10,000 requests/second returning 50-field objects, it becomes measurable.

typescript
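// Illustrative orders of magnitude, not measurements. Actual numbers depend on
// payload shape, Node.js version, and hardware, so benchmark your own payloads.
// JSON.stringify (built-in):          ~200ns per call for a simple object
// fast-json-stringify (precompiled):  ~50ns per call, roughly 4× faster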
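
Per-call figures like these are best checked on your own payloads. A rough timing helper, using only the global `performance.now()`, might look like this. `nsPerCall` and the sample object are assumptions for this sketch, and a micro-benchmark of this kind gives an order of magnitude at best.

```typescript
// Rough per-call timing. Results vary between runs, machines, and Node.js versions.
function nsPerCall(fn: () => unknown, iterations = 100000): number {
  // Warm-up so the JIT has had a chance to optimise the function before measuring
  for (let i = 0; i < 10000; i++) fn();

  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedMs = performance.now() - start;

  // Convert milliseconds to nanoseconds, then average over all calls
  return (elapsedMs * 1e6) / iterations;
}

const sample = { id: 1, name: 'Jatin', email: 'j@example.com', role: 'user' };
const builtIn = nsPerCall(() => JSON.stringify(sample));
console.log(`JSON.stringify: ~${builtIn.toFixed(0)}ns per call`);
```

Passing a precompiled serialiser as `fn` with the same payload gives a like-for-like comparison.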

fast-json-stringify pre-compiles a serialiser from a JSON Schema — it knows the shape of the output ahead of time and skips runtime introspection:

bash
npm install fast-json-stringify

typescript
import fastJson from 'fast-json-stringify';

const serializeUser = fastJson({
  type: 'object',
  properties: {
    id: { type: 'integer' },
    name: { type: 'string' },
    email: { type: 'string' },
    role: { type: 'string', enum: ['user', 'admin'] },
    createdAt: { type: 'string' },
  },
  required: ['id', 'name', 'email', 'role', 'createdAt'],
});

// In your controller — faster than JSON.stringify for high-throughput routes
res.setHeader('Content-Type', 'application/json');
res.end(serializeUser(user));

Use this on high-traffic endpoints where you know the exact response shape. Not worth the complexity for infrequently called routes.
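
To make "knows the shape ahead of time" concrete, here is a hand-written serialiser for one fixed shape. It only illustrates the idea and is not how fast-json-stringify is implemented internally. `UserDto` and `serializeUserByHand` are hypothetical, and the sketch assumes `id` is always a finite number.

```typescript
interface UserDto {
  id: number;
  name: string;
  email: string;
}

// The key names and their order are fixed in the source, so nothing has to be
// discovered at runtime. Strings still go through JSON.stringify for escaping.
function serializeUserByHand(u: UserDto): string {
  return (
    '{"id":' + u.id +
    ',"name":' + JSON.stringify(u.name) +
    ',"email":' + JSON.stringify(u.email) +
    '}'
  );
}

const dto: UserDto = { id: 1, name: 'Jatin "J"', email: 'j@example.com' };
const byHand = serializeUserByHand(dto);
// byHand === JSON.stringify(dto)
```

A schema-driven compiler generates this kind of function for you, which is why the schema has to describe the response exactly.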

Streaming JSON for Large Payloads

Loading a 100,000-row query result into memory, calling JSON.stringify, and writing the result is a memory spike waiting to happen. Streaming processes rows as they arrive from the database:

bash
npm install JSONStream

typescript
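// src/controllers/export.controller.ts
import { asyncHandler } from '../utils/asyncHandler.js';
import { prisma } from '../db/prisma.js'; // assumed location of your PrismaClient instance
import JSONStream from 'JSONStream';
import { Readable } from 'stream';
import { pipeline } from 'stream/promises';

// Prisma Client does not ship a findManyStream method, so page through the
// table with a cursor and expose the rows as an async generator.
// Assumes a numeric, unique id column.
async function* iterateUsers(batchSize = 1000) {
  let cursor: number | undefined;
  while (true) {
    const batch = await prisma.user.findMany({
      take: batchSize,
      // After the first page, start just past the last row already emitted
      ...(cursor !== undefined ? { skip: 1, cursor: { id: cursor } } : {}),
      select: { id: true, name: true, email: true, role: true, createdAt: true },
      orderBy: { id: 'asc' },
    });
    if (batch.length === 0) return;
    yield* batch;
    cursor = batch[batch.length - 1].id;
  }
}

export const exportUsers = asyncHandler(async (req, res) => {
  res.setHeader('Content-Type', 'application/json');
  res.setHeader('Content-Disposition', 'attachment; filename="users.json"');

  // Wrap individual objects into a JSON array, streamed
  const jsonStringifier = JSONStream.stringify('[', ',', ']');

  await pipeline(Readable.from(iterateUsers()), jsonStringifier, res);
  // Memory usage: bounded by batchSize, regardless of result size
  // Without streaming: O(n), the entire result set in RAM
});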

For Postgres with raw SQL, use pg-query-stream:

bash
npm install pg-query-stream

typescript
import QueryStream from 'pg-query-stream';
import { asyncHandler } from '../utils/asyncHandler.js';
import { pool } from '../db/pool.js';
import JSONStream from 'JSONStream';
import { pipeline } from 'stream/promises';

export const exportOrders = asyncHandler(async (req, res) => {
  res.setHeader('Content-Type', 'application/json');

  const client = await pool.connect();

  try {
    const query = new QueryStream(
      'SELECT id, user_id, total, status, created_at FROM orders ORDER BY id',
      [],
      { batchSize: 100 },
    );

    const stream = client.query(query);
    await pipeline(stream, JSONStream.stringify('[', ',', ']'), res);
  } finally {
    client.release();
  }
});
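
Both examples lean on JSONStream to frame individual rows as one JSON array. What that framing amounts to can be sketched without any dependency. This is an illustration of the technique, not JSONStream's implementation, and `jsonArrayChunks` and `fakeRows` are names invented for the example.

```typescript
// Emits the pieces of one JSON array, holding only one row in memory at a time
async function* jsonArrayChunks<T>(rows: AsyncIterable<T> | Iterable<T>) {
  let first = true;
  yield '[';
  for await (const row of rows) {
    // Every row except the first is preceded by a separator
    yield (first ? '' : ',') + JSON.stringify(row);
    first = false;
  }
  yield ']';
}

// Stand-in for a database cursor (hypothetical data)
async function* fakeRows() {
  yield { id: 1 };
  yield { id: 2 };
}

// Concatenates the chunks only to show the final result
async function collect(): Promise<string> {
  let out = '';
  for await (const chunk of jsonArrayChunks(fakeRows())) out += chunk;
  return out; // '[{"id":1},{"id":2}]'
}
```

In Node.js, `Readable.from(jsonArrayChunks(rows))` turns the generator into a stream that can be piped to the response, and backpressure from a slow client pauses the generator between rows.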

JSON Schema Validation with Ajv

Ajv (Another JSON Schema Validator) is one of the fastest JSON Schema validators for Node.js. JSON Schema is the standard for describing the structure of JSON data. It is more portable than Zod (which is TypeScript-specific) and is widely used in OpenAPI and form validation.

bash
npm install ajv ajv-formats

typescript
import Ajv from 'ajv';
import addFormats from 'ajv-formats';
import type { Request, Response, NextFunction } from 'express';

const ajv = new Ajv({
  allErrors: true,   // collect all errors, not just the first
  useDefaults: true, // without this, `default` keywords in the schema are ignored
});
addFormats(ajv); // adds 'email', 'date-time', 'uri', 'uuid' formats

const createUserSchema = {
  type: 'object',
  properties: {
    name: { type: 'string', minLength: 1, maxLength: 100 },
    email: { type: 'string', format: 'email' },
    password: { type: 'string', minLength: 8 },
    role: { type: 'string', enum: ['user', 'admin'], default: 'user' },
  },
  required: ['name', 'email', 'password'],
  additionalProperties: false,  // reject unknown fields
};

// Middleware factory: the schema is compiled once, when the factory is called,
// and the compiled validator is reused on every request
function validateBody(schema: object) {
  const check = ajv.compile(schema);
  return (req: Request, res: Response, next: NextFunction) => {
    if (check(req.body)) return next();

    const errors = (check.errors ?? []).map(err => ({
      field: err.instancePath.slice(1) || err.params?.missingProperty,
      message: err.message,
    }));
    res.status(400).json({ error: 'Validation failed', issues: errors });
  };
}

// Create the middleware at startup and mount it on the route that needs it
const validateCreateUser = validateBody(createUserSchema);

Ajv vs Zod: Zod infers TypeScript types and is designed for TypeScript projects. Ajv follows the JSON Schema standard and is faster — it compiles schemas to optimised validation functions. For Node.js APIs already using Zod (as in P-4), stick with Zod. Ajv shines when you need JSON Schema portability (shared schemas between frontend/backend) or raw performance.

MessagePack: Binary Alternative to JSON

MessagePack is a binary serialisation format that is typically 20–50% smaller than the equivalent JSON and often faster to serialise and deserialise. It has types that JSON lacks: distinct integer and float types, raw binary data, and timestamps (through an extension type).

When to consider it:

High-frequency inter-service communication (microservices)
Real-time applications where payload size affects latency
Storing structured data in Redis (smaller footprint)

bash
npm install @msgpack/msgpack

typescript
import { encode, decode } from '@msgpack/msgpack';

// Serialise
const data = { userId: 42, timestamp: new Date(), values: [1, 2, 3] };
const packed = encode(data);   // Uint8Array — binary, smaller than JSON string

// Deserialise
const unpacked = decode(packed) as typeof data;

// Size comparison (approximate)
const json = JSON.stringify(data);
console.log('JSON size:', Buffer.byteLength(json), 'bytes');
console.log('MessagePack size:', packed.byteLength, 'bytes');
// JSON:        ~69 bytes
// MessagePack: ~40 bytes, roughly 40% smaller
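
Where the saving comes from is easier to see at the byte level. The sketch below hand-encodes a tiny subset of the MessagePack format (small non-negative integers, short strings, short arrays and maps) purely as an illustration. It is not a substitute for @msgpack/msgpack, and `Tiny` and `encodeTiny` are names invented for this example.

```typescript
// Supported subset: ints 0-127, strings under 32 bytes,
// arrays and maps with fewer than 16 entries
type Tiny = number | string | Tiny[] | { [key: string]: Tiny };

function encodeTiny(value: Tiny): number[] {
  if (typeof value === 'number') {
    if (!Number.isInteger(value) || value < 0 || value > 127) {
      throw new RangeError('only integers 0-127 are supported in this sketch');
    }
    return [value]; // positive fixint: the byte is the value itself
  }
  if (typeof value === 'string') {
    const bytes = Array.from(new TextEncoder().encode(value));
    if (bytes.length > 31) throw new RangeError('string too long for fixstr');
    return [0xa0 | bytes.length, ...bytes]; // fixstr: 101xxxxx, then UTF-8 bytes
  }
  if (Array.isArray(value)) {
    if (value.length > 15) throw new RangeError('array too long for fixarray');
    return [0x90 | value.length, ...value.flatMap(item => encodeTiny(item))]; // fixarray: 1001xxxx
  }
  const entries = Object.entries(value);
  if (entries.length > 15) throw new RangeError('too many keys for fixmap');
  return [
    0x80 | entries.length, // fixmap: 1000xxxx, then key/value pairs
    ...entries.flatMap(([k, v]) => [...encodeTiny(k), ...encodeTiny(v)]),
  ];
}

const payload = { userId: 42, values: [1, 2, 3] };
const tinyBytes = encodeTiny(payload);
const jsonBytes = new TextEncoder().encode(JSON.stringify(payload)).length;
// tinyBytes.length === 20, jsonBytes === 30
```

The binary form has no quotes, colons, commas, or brackets. Lengths are folded into a single type byte, and small integers take one byte instead of one byte per digit.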

MessagePack in Express

typescript
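// Custom middleware to accept and return MessagePack
app.use((req, res, next) => {
  // res.msgpack() helper. It must be attached inside a handler, where `res` exists.
  // In TypeScript, add `msgpack` to Express's Response type via declaration merging.
  res.msgpack = (data: unknown) => {
    res.setHeader('Content-Type', 'application/msgpack');
    res.end(encode(data));
  };

  // startsWith tolerates parameters after the media type
  if (!req.headers['content-type']?.startsWith('application/msgpack')) {
    return next();
  }

  const chunks: Buffer[] = [];
  req.on('data', chunk => chunks.push(chunk));
  req.on('error', next);
  req.on('end', () => {
    let body: unknown;
    try {
      body = decode(Buffer.concat(chunks));
    } catch {
      // A malformed payload must not throw inside the event handler
      return res.status(400).json({ error: 'Invalid MessagePack body' });
    }
    req.body = body;
    next();
  });
});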

For most REST APIs serving browsers, JSON is the right choice — browsers speak JSON natively, the bandwidth difference is negligible, and debuggability is far better. MessagePack is the right choice for Node-to-Node communication where you control both ends and need to squeeze out latency.

The JSON.parse Cost in Hot Paths

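JSON.parse parses the entire string synchronously, on the main thread. If a 1MB request body takes roughly 5ms to parse (the exact figure depends on hardware and payload structure), then 1,000 such requests per second need 5 seconds of JSON parsing per wall-clock second, five times more than a single thread can deliver.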

Strategies for hot paths:

Validate size before parsing — reject oversized bodies early (express.json({ limit: '10kb' }))
Use fast-json-stringify for serialisation on high-throughput routes
Cache parsed results when the same JSON is parsed repeatedly (e.g., static configuration)
Move heavy parsing to a worker thread for CPU-intensive transformations

typescript
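// Worker thread for large JSON transformation
// __filename exists only in CommonJS; in an ES module pass new URL(import.meta.url) instead
import { Worker, isMainThread, parentPort, workerData } from 'worker_threads';

// Placeholder: parse the JSON, then apply your own CPU-heavy logic
function expensiveTransformation(json: string): unknown {
  return JSON.parse(json);
}

if (!isMainThread) {
  // Runs in the worker thread
  const result = expensiveTransformation(workerData.json);
  // The result is copied back to the main thread, so keep it small
  parentPort!.postMessage(result);
}

// In your API
function parseInWorker(json: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(__filename, { workerData: { json } });
    worker.once('message', resolve);
    worker.once('error', reject);
    // A worker that dies without posting a message would otherwise hang the promise
    worker.once('exit', code => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}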
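
The caching strategy from the list above, for JSON text that is parsed repeatedly, can be sketched as follows. `parseCache` and `parseCached` are hypothetical names, and the sketch assumes a small, fixed set of inputs such as static configuration, because the Map is never evicted.

```typescript
// Caches parse results keyed by the exact JSON text
const parseCache = new Map<string, unknown>();

function parseCached(json: string): unknown {
  if (parseCache.has(json)) return parseCache.get(json);
  const parsed: unknown = JSON.parse(json);
  // All callers share one object, so freeze the top level to catch accidental mutation
  if (typeof parsed === 'object' && parsed !== null) Object.freeze(parsed);
  parseCache.set(json, parsed);
  return parsed;
}

const configText = '{"featureFlags":{"beta":true}}';
const firstRead = parseCached(configText);
const secondRead = parseCached(configText);
// firstRead === secondRead: the second call skips JSON.parse entirely
```

Object.freeze is shallow, so nested objects remain mutable. For request bodies, which rarely repeat byte for byte, a cache like this adds memory pressure without saving work.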

Summary

undefined, NaN, Infinity, BigInt, Date, Map, Set all behave unexpectedly with JSON.stringify. Know the conversion table.
BigInt throws; serialise it as a string. Dates become strings; use a reviver to restore them. Circular references throw; use a replacer with a WeakSet.
replacer as an array is the cleanest way to whitelist response fields — prevents sensitive data leaking from the serialisation layer.
fast-json-stringify pre-compiles serialisers from schemas and is often several times faster than built-in JSON.stringify for high-throughput, fixed-shape responses.
Stream JSON for large exports by piping Prisma or pg row streams through JSONStream.stringify, which keeps memory bounded regardless of result set size.
Ajv is among the fastest JSON Schema validators, and JSON Schema itself is portable across languages and tools. Use Ajv when you need that portability. Use Zod when you want TypeScript inference.
MessagePack is typically 20–50% smaller than JSON and often faster to encode and decode. Worth it for Node-to-Node communication. Not worth it for browser-facing APIs.

Next: Dockerizing Node.js applications for production — multi-stage builds, Alpine vs slim images, non-root users, health checks, and the production readiness checklist for containerised deployments.
