Module P-8·25 min read

Structured logging with Pino, correlation IDs, multi-stage Docker builds, docker-compose for local dev, PM2 cluster mode, and GitHub Actions CI/CD.

Module P-8 — Logging, Observability Basics, and Deployment

What this module covers: console.log doesn't scale. When your API handles thousands of requests per second across multiple servers, you need structured logs you can query, correlation IDs that link a request across every log line, and a deployment pipeline that gets code to production without manual steps. This module covers structured logging with Pino, correlation IDs, multi-stage Docker builds that produce lean production images, docker-compose for local development, PM2 for running Node.js processes in production, and a GitHub Actions pipeline that tests, builds, and deploys on every push to main.

Why Structured Logging

console.log('User logged in: ' + userId) produces a string. You can't query it, you can't filter it, you can't aggregate it. Structured logging produces JSON:

json
{
  "level": "info",
  "time": "2024-05-22T14:32:01.123Z",
  "msg": "User logged in",
  "userId": 42,
  "requestId": "f8a2e9c1-4b5d-4f3a-8e6c-2d1a7b3f9e4c",
  "ip": "203.0.113.42",
  "durationMs": 183
}

Now you can run WHERE msg = 'User logged in' AND durationMs > 500 to find slow logins. You can group by userId to see all activity for a user. You can set up alerts on level = 'error'.

Pino: Fast Structured Logging

Pino is the fastest JSON logger for Node.js. It is designed to minimise work on the hot path — log entries are serialised asynchronously.

bash
npm install pino pino-pretty
npm install -D @types/pino

Logger singleton

typescript
// src/utils/logger.ts
import pino from 'pino';
import { env } from '../config/env.js';

const logger = pino({
  level: env.LOG_LEVEL ?? 'info',

  // In development: pretty-print for readability
  // In production: raw JSON for log aggregators (Datadog, Loki, CloudWatch)
  transport: env.NODE_ENV === 'development'
    ? {
        target: 'pino-pretty',
        options: {
          colorize: true,
          translateTime: 'HH:MM:ss.l',
          ignore: 'pid,hostname',
        },
      }
    : undefined,

  // Redact sensitive fields from all log entries
  redact: {
    paths: ['req.headers.authorization', 'body.password', 'body.passwordHash'],
    censor: '[REDACTED]',
  },

  // Base fields added to every log entry
  base: {
    env: env.NODE_ENV,
    version: process.env.npm_package_version,
  },

  // Serialisers — control how objects are formatted
  serializers: {
    err: pino.stdSerializers.err,  // properly serialise Error objects
    req: pino.stdSerializers.req,
    res: pino.stdSerializers.res,
  },
});

export default logger;

Using the logger

typescript
import logger from '../utils/logger.js';

// Info — normal operations
logger.info({ userId: 42, orderId: 7 }, 'Order created');

// Warn — unexpected but recoverable
logger.warn({ attemptCount: 3, ip: req.ip }, 'Failed login attempt');

// Error — errors that need attention
logger.error({ err, orderId: 7 }, 'Failed to send order confirmation email');

// Debug — verbose, off in production
logger.debug({ query: sql, params }, 'Executing query');

// Child loggers — add context for a specific scope
const requestLogger = logger.child({ requestId: req.requestId });
requestLogger.info('Processing payment');
requestLogger.info({ chargeId: 'ch_xxx' }, 'Payment succeeded');
// Both entries have requestId set automatically

HTTP Request Logging Middleware

Log every incoming request with timing:

typescript
// src/middleware/httpLogger.ts
import { Request, Response, NextFunction } from 'express';
import logger from '../utils/logger.js';

export function httpLogger(req: Request, res: Response, next: NextFunction) {
  const start = Date.now();

  // Log when response finishes — not when request arrives
  res.on('finish', () => {
    const durationMs = Date.now() - start;
    const level = res.statusCode >= 500 ? 'error' : res.statusCode >= 400 ? 'warn' : 'info';

    logger[level]({
      requestId: req.requestId,
      method: req.method,
      url: req.originalUrl,
      statusCode: res.statusCode,
      durationMs,
      ip: req.ip,
      userAgent: req.headers['user-agent'],
      userId: req.user?.id,
    }, 'HTTP request');
  });

  next();
}

typescript
// src/app.ts
app.use(requestId);    // set req.requestId first
app.use(httpLogger);   // then log with the ID

Development output (pino-pretty):

14:32:01.123 INFO  HTTP request
    requestId: "f8a2e9c1"
    method: "POST"
    url: "/orders"
    statusCode: 201
    durationMs: 183
    userId: 42

Correlation IDs Across Services

When a request triggers calls to multiple services (or microservices), correlation IDs let you trace the full journey in logs:

typescript
// src/middleware/requestId.ts
import { randomUUID } from 'crypto';

export function requestId(req, res, next) {
  // Accept correlation ID from upstream (API gateway, frontend) or generate one
  req.requestId = req.headers['x-request-id'] as string ?? randomUUID();
  res.setHeader('x-request-id', req.requestId);
  next();
}

Pass the correlation ID when making outbound service calls:

typescript
export async function callPaymentService(orderId: number, requestId: string) {
  const response = await fetch(`${env.PAYMENT_SERVICE_URL}/charges`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Request-ID': requestId,  // propagate downstream
    },
    body: JSON.stringify({ orderId }),
  });
  return response.json();
}

Now grep requestId=f8a2e9c1 in your log aggregator shows every log line from every service for a single user request.

Multi-Stage Docker Build

A naive Docker build copies node_modules including all dev dependencies, making the image hundreds of megabytes unnecessarily. Multi-stage builds fix this.

dockerfile
# Dockerfile

# ─── Stage 1: Build ──────────────────────────────────────────────────────────
FROM node:22-alpine AS builder

WORKDIR /app

# Install ALL dependencies (including devDependencies for TypeScript)
COPY package*.json ./
RUN npm ci

# Copy source and compile
COPY tsconfig.json ./
COPY src ./src
RUN npm run build   # outputs to /app/dist


# ─── Stage 2: Production image ───────────────────────────────────────────────
FROM node:22-alpine AS production

WORKDIR /app

# Create non-root user — never run Node.js as root
RUN addgroup -g 1001 nodejs && \
    adduser -S -u 1001 -G nodejs nodeuser

# Install only production dependencies
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force

# Copy compiled output from builder stage — no TypeScript source
COPY --from=builder /app/dist ./dist

# Set ownership
RUN chown -R nodeuser:nodejs /app
USER nodeuser

# Document the port (doesn't actually publish it)
EXPOSE 3000

# Use exec form — PID 1 gets SIGTERM, enables graceful shutdown
CMD ["node", "dist/index.js"]

Build and run:

bash
docker build -t myapp:latest .
docker run -p 3000:3000 --env-file .env myapp:latest

Image size comparison:

Without multi-stage: ~800MB (with TypeScript, ts-node, test deps)
With multi-stage: ~150MB (only compiled JS + production deps)

Graceful shutdown in Node.js

typescript
// src/index.ts
import app from './app.js';
import prisma from './db/prisma.js';
import redis from './db/redis.js';
import logger from './utils/logger.js';
import { env } from './config/env.js';

const server = app.listen(env.PORT, () => {
  logger.info({ port: env.PORT }, 'Server started');
});

async function shutdown(signal: string) {
  logger.info({ signal }, 'Shutdown signal received');

  // Stop accepting new connections
  server.close(async () => {
    try {
      // Close database connections cleanly
      await prisma.$disconnect();
      redis.disconnect();
      logger.info('Graceful shutdown complete');
      process.exit(0);
    } catch (err) {
      logger.error({ err }, 'Error during shutdown');
      process.exit(1);
    }
  });

  // Force exit if graceful shutdown hangs
  setTimeout(() => {
    logger.error('Forced shutdown after timeout');
    process.exit(1);
  }, 10_000);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
process.on('uncaughtException', (err) => {
  logger.fatal({ err }, 'Uncaught exception');
  process.exit(1);
});
process.on('unhandledRejection', (reason) => {
  logger.fatal({ reason }, 'Unhandled promise rejection');
  process.exit(1);
});

docker-compose for Local Development

docker-compose runs all your local dependencies (Postgres, Redis) with one command:

yaml
# docker-compose.yml
version: '3.9'

services:
  app:
    build:
      context: .
      target: builder   # use the builder stage — includes ts-node for dev
    ports:
      - '3000:3000'
    environment:
      NODE_ENV: development
      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/myapp_dev
      REDIS_URL: redis://redis:6379
      JWT_ACCESS_SECRET: local-dev-secret-32-characters-minimum
      JWT_REFRESH_SECRET: local-dev-refresh-32-characters-min
    volumes:
      - ./src:/app/src   # hot reload — changes reflected instantly
    command: npm run dev
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp_dev
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - '5432:5432'
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U postgres']
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - '6379:6379'
    healthcheck:
      test: ['CMD', 'redis-cli', 'ping']
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:

bash
docker-compose up -d          # start in background
docker-compose logs -f app    # tail logs
docker-compose down           # stop
docker-compose down -v        # stop and wipe data

PM2 for Production Process Management

When deploying to a bare VM or VPS (rather than a container orchestrator), PM2 manages your Node.js process:

bash
npm install -g pm2

javascript
// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'myapp',
      script: 'dist/index.js',
      instances: 'max',     // one process per CPU core
      exec_mode: 'cluster', // cluster mode — all processes share port 3000
      max_memory_restart: '500M',

      env_production: {
        NODE_ENV: 'production',
        PORT: 3000,
      },

      // Logging
      out_file: '/var/log/myapp/out.log',
      error_file: '/var/log/myapp/error.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss',
    },
  ],
};

bash
pm2 start ecosystem.config.js --env production
pm2 save          # persist config — restart on system reboot
pm2 startup       # generate startup script for your OS
pm2 status        # check running processes
pm2 logs myapp    # tail logs
pm2 reload myapp  # zero-downtime reload (cluster mode)
pm2 monit         # real-time CPU/memory dashboard

Cluster mode vs worker threads: PM2 cluster mode forks multiple Node.js processes, each with their own event loop. This saturates all CPU cores. For CPU-bound work within a single request, worker threads (covered in the Architect phase) are the right tool. For most APIs (I/O bound), PM2 cluster mode is sufficient.

GitHub Actions CI/CD Pipeline

A pipeline that runs on every push: type-check → test → build Docker image → push to registry → deploy:

yaml
# .github/workflows/deploy.yml
name: Test, Build, Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: myapp_test
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
        ports: ['5432:5432']
        options: --health-cmd pg_isready --health-interval 10s

      redis:
        image: redis:7
        ports: ['6379:6379']
        options: --health-cmd "redis-cli ping" --health-interval 10s

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'

      - run: npm ci

      - name: Type check
        run: npx tsc --noEmit

      - name: Run tests
        run: npm run test:ci
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/myapp_test
          REDIS_URL: redis://localhost:6379
          JWT_ACCESS_SECRET: ci-test-secret-at-least-32-characters
          JWT_REFRESH_SECRET: ci-refresh-secret-at-least-32-characters
          NODE_ENV: test

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'  # only on main branch pushes

    permissions:
      contents: read
      packages: write

    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}

    steps:
      - uses: actions/checkout@v4

      - name: Log in to container registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=sha-
            type=raw,value=latest,enable=true

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          target: production
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'

    steps:
      - name: Deploy to production
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          key: ${{ secrets.DEPLOY_SSH_KEY }}
          script: |
            docker pull ghcr.io/${{ github.repository }}:latest
            docker stop myapp || true
            docker rm myapp || true
            docker run -d \
              --name myapp \
              --restart unless-stopped \
              -p 3000:3000 \
              --env-file /etc/myapp/.env \
              ghcr.io/${{ github.repository }}:latest

Repository secrets to configure

In GitHub → Settings → Secrets and variables → Actions:

DEPLOY_HOST — your server IP or hostname
DEPLOY_USER — SSH username (ubuntu, deploy, etc.)
DEPLOY_SSH_KEY — private SSH key (the public key must be in ~/.ssh/authorized_keys on the server)

.dockerignore

Prevent unnecessary files from being copied into the build context — this speeds up builds and prevents secrets from leaking into images:

node_modules
dist
.git
.env
.env.*
*.log
*.test.ts
__tests__
coverage
.github
README.md
docker-compose.yml

Summary

Pino produces JSON logs with near-zero overhead. Never use console.log in production code.
Child loggers (logger.child({ requestId })) attach context to every log line in a scope without repetition.
Correlation IDs set on every request, propagated to outbound calls, and included in every log — make distributed debugging possible.
Multi-stage Docker builds separate the TypeScript compile step from the production image. Result: ~150MB images instead of ~800MB.
Graceful shutdown — handle SIGTERM, close the server, drain connections, disconnect from databases. Required for zero-downtime deployments.
docker-compose — one command starts Postgres, Redis, and your app for local development with hot reload.
PM2 cluster mode — saturates all CPU cores with independent Node.js processes. Zero-downtime reload with pm2 reload.
GitHub Actions — test → build Docker image → push to registry → SSH deploy on every push to main. Gate the deploy step behind test success.

Next: WebSockets and real-time communication — the upgrade from HTTP to WebSocket, broadcasting events to connected clients, presence tracking, and scaling WebSockets across multiple servers with Redis Pub/Sub.

OpenTelemetry — Distributed Traces From Minute One

Pino logs with correlation IDs are necessary. They're not sufficient. When a request to your Express API takes 800ms and you don't know if the time was spent in a database query, an external HTTP call, or application logic, you're debugging blind. Distributed traces tell you exactly where the time went.

OpenTelemetry (OTel) is the industry-standard observability framework — vendor-neutral, supported by every major APM tool (Datadog, New Relic, Jaeger, Grafana Tempo, Honeycomb). Set it up once, export to any backend.

Auto-Instrumentation — Zero Code Changes

The @opentelemetry/auto-instrumentations-node package automatically instruments:

Express (request spans, route attributes)
node-postgres / pg (query spans with SQL text)
ioredis (command spans)
http/https (outbound request spans)
dns (resolution spans)
fs (file operation spans — optional, noisy)

bash
npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http

javascript
// instrumentation.js — must be loaded BEFORE anything else
import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const sdk = new NodeSDK({
  serviceName: process.env.SERVICE_NAME ?? 'api-server',
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318/v1/traces',
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-fs': { enabled: false }, // too noisy
      '@opentelemetry/instrumentation-pg': {
        enhancedDatabaseReporting: true, // includes SQL in span attributes
      },
    }),
  ],
})

sdk.start()

// Graceful shutdown — flush pending spans before process exits
process.on('SIGTERM', () => sdk.shutdown())

Start your application with the instrumentation file loaded:

bash
node --import ./instrumentation.js src/index.js
# or in package.json scripts:
"start": "node --import ./instrumentation.js src/index.js"

After this single change, every Express request gets a trace with child spans for every database query, Redis operation, and outbound HTTP call. No application code changes.

Correlating Trace IDs with Pino Logs

Logs and traces are separate. Connect them by injecting the active trace ID into every log line:

javascript
import { trace, context } from '@opentelemetry/api'
import pino from 'pino'

const baseLogger = pino({ level: 'info' })

// Middleware that creates a request-scoped logger with trace context
app.use((req, res, next) => {
  const span = trace.getActiveSpan()
  const spanContext = span?.spanContext()
  
  req.log = baseLogger.child({
    traceId: spanContext?.traceId,
    spanId: spanContext?.spanId,
    requestId: req.headers['x-request-id'],
  })
  
  next()
})

// Use req.log instead of console.log in route handlers
app.get('/api/orders', async (req, res) => {
  req.log.info('Fetching orders')
  const orders = await getOrders()
  req.log.info({ count: orders.length }, 'Orders fetched')
  res.json(orders)
})

Now when you search for a trace ID in your log aggregator, you see every log line from that exact request alongside the spans.

Custom Spans for Business Logic

Auto-instrumentation covers I/O. For business logic timing (payment processing, report generation, complex calculations), add manual spans:

javascript
import { trace } from '@opentelemetry/api'

const tracer = trace.getTracer('order-service', '1.0.0')

async function processPayment(order) {
  return tracer.startActiveSpan('process-payment', async (span) => {
    span.setAttribute('order.id', order.id)
    span.setAttribute('order.amount_cents', order.amountCents)
    
    try {
      const result = await stripeClient.charges.create({ ... })
      span.setAttribute('stripe.charge_id', result.id)
      span.setStatus({ code: SpanStatusCode.OK })
      return result
    } catch (error) {
      span.recordException(error)
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message })
      throw error
    } finally {
      span.end()
    }
  })
}

The span appears in your trace waterfall between the database query that loaded the order and the Redis operation that caches the result. You can see exactly how long payment processing takes, independently of everything around it.

PreviousModule P-7: Connecting External Services and Caching Next Module P-9: WebSockets and Real-Time Communication