Structured logging with Pino, correlation IDs, multi-stage Docker builds, docker-compose for local dev, PM2 cluster mode, and GitHub Actions CI/CD.
Module P-8 — Logging, Observability Basics, and Deployment
What this module covers: console.log doesn't scale. When your API handles thousands of requests per second across multiple servers, you need structured logs you can query, correlation IDs that link a request across every log line, and a deployment pipeline that gets code to production without manual steps. This module covers structured logging with Pino, correlation IDs, multi-stage Docker builds that produce lean production images, docker-compose for local development, PM2 for running Node.js processes in production, and a GitHub Actions pipeline that tests, builds, and deploys on every push to main.
Why Structured Logging
console.log('User logged in: ' + userId) produces a string. You can't query it, you can't filter it, you can't aggregate it. Structured logging produces JSON:
json{ "level": "info", "time": "2024-05-22T14:32:01.123Z", "msg": "User logged in", "userId": 42, "requestId": "f8a2e9c1-4b5d-4f3a-8e6c-2d1a7b3f9e4c", "ip": "203.0.113.42", "durationMs": 183 }
Now you can run WHERE msg = 'User logged in' AND durationMs > 500 to find slow logins. You can group by userId to see all activity for a user. You can set up alerts on level = 'error'.
Pino: Fast Structured Logging
Pino is the fastest JSON logger for Node.js. It is designed to minimise work on the hot path — log entries are serialised asynchronously.
bashnpm install pino pino-pretty npm install -D @types/pino
Logger singleton
typescript// src/utils/logger.ts import pino from 'pino'; import { env } from '../config/env.js'; const logger = pino({ level: env.LOG_LEVEL ?? 'info', // In development: pretty-print for readability // In production: raw JSON for log aggregators (Datadog, Loki, CloudWatch) transport: env.NODE_ENV === 'development' ? { target: 'pino-pretty', options: { colorize: true, translateTime: 'HH:MM:ss.l', ignore: 'pid,hostname', }, } : undefined, // Redact sensitive fields from all log entries redact: { paths: ['req.headers.authorization', 'body.password', 'body.passwordHash'], censor: '[REDACTED]', }, // Base fields added to every log entry base: { env: env.NODE_ENV, version: process.env.npm_package_version, }, // Serialisers — control how objects are formatted serializers: { err: pino.stdSerializers.err, // properly serialise Error objects req: pino.stdSerializers.req, res: pino.stdSerializers.res, }, }); export default logger;
Using the logger
typescriptimport logger from '../utils/logger.js'; // Info — normal operations logger.info({ userId: 42, orderId: 7 }, 'Order created'); // Warn — unexpected but recoverable logger.warn({ attemptCount: 3, ip: req.ip }, 'Failed login attempt'); // Error — errors that need attention logger.error({ err, orderId: 7 }, 'Failed to send order confirmation email'); // Debug — verbose, off in production logger.debug({ query: sql, params }, 'Executing query'); // Child loggers — add context for a specific scope const requestLogger = logger.child({ requestId: req.requestId }); requestLogger.info('Processing payment'); requestLogger.info({ chargeId: 'ch_xxx' }, 'Payment succeeded'); // Both entries have requestId set automatically
HTTP Request Logging Middleware
Log every incoming request with timing:
typescript// src/middleware/httpLogger.ts import { Request, Response, NextFunction } from 'express'; import logger from '../utils/logger.js'; export function httpLogger(req: Request, res: Response, next: NextFunction) { const start = Date.now(); // Log when response finishes — not when request arrives res.on('finish', () => { const durationMs = Date.now() - start; const level = res.statusCode >= 500 ? 'error' : res.statusCode >= 400 ? 'warn' : 'info'; logger[level]({ requestId: req.requestId, method: req.method, url: req.originalUrl, statusCode: res.statusCode, durationMs, ip: req.ip, userAgent: req.headers['user-agent'], userId: req.user?.id, }, 'HTTP request'); }); next(); }
typescript// src/app.ts app.use(requestId); // set req.requestId first app.use(httpLogger); // then log with the ID
Development output (pino-pretty):
14:32:01.123 INFO HTTP request
requestId: "f8a2e9c1"
method: "POST"
url: "/orders"
statusCode: 201
durationMs: 183
userId: 42
Correlation IDs Across Services
When a request triggers calls to multiple services (or microservices), correlation IDs let you trace the full journey in logs:
typescript// src/middleware/requestId.ts import { randomUUID } from 'crypto'; export function requestId(req, res, next) { // Accept correlation ID from upstream (API gateway, frontend) or generate one req.requestId = req.headers['x-request-id'] as string ?? randomUUID(); res.setHeader('x-request-id', req.requestId); next(); }
Pass the correlation ID when making outbound service calls:
typescriptexport async function callPaymentService(orderId: number, requestId: string) { const response = await fetch(`${env.PAYMENT_SERVICE_URL}/charges`, { method: 'POST', headers: { 'Content-Type': 'application/json', 'X-Request-ID': requestId, // propagate downstream }, body: JSON.stringify({ orderId }), }); return response.json(); }
Now grep requestId=f8a2e9c1 in your log aggregator shows every log line from every service for a single user request.
Multi-Stage Docker Build
A naive Docker build copies node_modules including all dev dependencies, making the image hundreds of megabytes unnecessarily. Multi-stage builds fix this.
dockerfile# Dockerfile # ─── Stage 1: Build ────────────────────────────────────────────────────────── FROM node:22-alpine AS builder WORKDIR /app # Install ALL dependencies (including devDependencies for TypeScript) COPY package*.json ./ RUN npm ci # Copy source and compile COPY tsconfig.json ./ COPY src ./src RUN npm run build # outputs to /app/dist # ─── Stage 2: Production image ─────────────────────────────────────────────── FROM node:22-alpine AS production WORKDIR /app # Create non-root user — never run Node.js as root RUN addgroup -g 1001 nodejs && \ adduser -S -u 1001 -G nodejs nodeuser # Install only production dependencies COPY package*.json ./ RUN npm ci --omit=dev && npm cache clean --force # Copy compiled output from builder stage — no TypeScript source COPY /app/dist ./dist # Set ownership RUN chown -R nodeuser:nodejs /app USER nodeuser # Document the port (doesn't actually publish it) EXPOSE 3000 # Use exec form — PID 1 gets SIGTERM, enables graceful shutdown CMD ["node", "dist/index.js"]
Build and run:
bashdocker build -t myapp:latest . docker run -p 3000:3000 --env-file .env myapp:latest
Image size comparison:
- Without multi-stage: ~800MB (with TypeScript, ts-node, test deps)
- With multi-stage: ~150MB (only compiled JS + production deps)
Graceful shutdown in Node.js
typescript// src/index.ts import app from './app.js'; import prisma from './db/prisma.js'; import redis from './db/redis.js'; import logger from './utils/logger.js'; import { env } from './config/env.js'; const server = app.listen(env.PORT, () => { logger.info({ port: env.PORT }, 'Server started'); }); async function shutdown(signal: string) { logger.info({ signal }, 'Shutdown signal received'); // Stop accepting new connections server.close(async () => { try { // Close database connections cleanly await prisma.$disconnect(); redis.disconnect(); logger.info('Graceful shutdown complete'); process.exit(0); } catch (err) { logger.error({ err }, 'Error during shutdown'); process.exit(1); } }); // Force exit if graceful shutdown hangs setTimeout(() => { logger.error('Forced shutdown after timeout'); process.exit(1); }, 10_000); } process.on('SIGTERM', () => shutdown('SIGTERM')); process.on('SIGINT', () => shutdown('SIGINT')); process.on('uncaughtException', (err) => { logger.fatal({ err }, 'Uncaught exception'); process.exit(1); }); process.on('unhandledRejection', (reason) => { logger.fatal({ reason }, 'Unhandled promise rejection'); process.exit(1); });
docker-compose for Local Development
docker-compose runs all your local dependencies (Postgres, Redis) with one command:
yaml# docker-compose.yml version: '3.9' services: app: build: context: . target: builder # use the builder stage — includes ts-node for dev ports: - '3000:3000' environment: NODE_ENV: development DATABASE_URL: postgresql://postgres:postgres@postgres:5432/myapp_dev REDIS_URL: redis://redis:6379 JWT_ACCESS_SECRET: local-dev-secret-32-characters-minimum JWT_REFRESH_SECRET: local-dev-refresh-32-characters-min volumes: - ./src:/app/src # hot reload — changes reflected instantly command: npm run dev depends_on: postgres: condition: service_healthy redis: condition: service_healthy postgres: image: postgres:16-alpine environment: POSTGRES_DB: myapp_dev POSTGRES_USER: postgres POSTGRES_PASSWORD: postgres ports: - '5432:5432' volumes: - postgres_data:/var/lib/postgresql/data healthcheck: test: ['CMD-SHELL', 'pg_isready -U postgres'] interval: 5s timeout: 5s retries: 5 redis: image: redis:7-alpine ports: - '6379:6379' healthcheck: test: ['CMD', 'redis-cli', 'ping'] interval: 5s timeout: 5s retries: 5 volumes: postgres_data:
bashdocker-compose up -d # start in background docker-compose logs -f app # tail logs docker-compose down # stop docker-compose down -v # stop and wipe data
PM2 for Production Process Management
When deploying to a bare VM or VPS (rather than a container orchestrator), PM2 manages your Node.js process:
bashnpm install -g pm2
javascript// ecosystem.config.js module.exports = { apps: [ { name: 'myapp', script: 'dist/index.js', instances: 'max', // one process per CPU core exec_mode: 'cluster', // cluster mode — all processes share port 3000 max_memory_restart: '500M', env_production: { NODE_ENV: 'production', PORT: 3000, }, // Logging out_file: '/var/log/myapp/out.log', error_file: '/var/log/myapp/error.log', merge_logs: true, log_date_format: 'YYYY-MM-DD HH:mm:ss', }, ], };
bashpm2 start ecosystem.config.js --env production pm2 save # persist config — restart on system reboot pm2 startup # generate startup script for your OS pm2 status # check running processes pm2 logs myapp # tail logs pm2 reload myapp # zero-downtime reload (cluster mode) pm2 monit # real-time CPU/memory dashboard
Cluster mode vs worker threads: PM2 cluster mode forks multiple Node.js processes, each with their own event loop. This saturates all CPU cores. For CPU-bound work within a single request, worker threads (covered in the Architect phase) are the right tool. For most APIs (I/O bound), PM2 cluster mode is sufficient.
GitHub Actions CI/CD Pipeline
A pipeline that runs on every push: type-check → test → build Docker image → push to registry → deploy:
yaml# .github/workflows/deploy.yml name: Test, Build, Deploy on: push: branches: [main] pull_request: branches: [main] env: REGISTRY: ghcr.io IMAGE_NAME: ${{ github.repository }} jobs: test: runs-on: ubuntu-latest services: postgres: image: postgres:16 env: POSTGRES_DB: myapp_test POSTGRES_USER: postgres POSTGRES_PASSWORD: postgres ports: ['5432:5432'] options: --health-cmd pg_isready --health-interval 10s redis: image: redis:7 ports: ['6379:6379'] options: --health-cmd "redis-cli ping" --health-interval 10s steps: - uses: actions/checkout@v4 - uses: actions/setup-node@v4 with: node-version: '22' cache: 'npm' - run: npm ci - name: Type check run: npx tsc --noEmit - name: Run tests run: npm run test:ci env: DATABASE_URL: postgresql://postgres:postgres@localhost:5432/myapp_test REDIS_URL: redis://localhost:6379 JWT_ACCESS_SECRET: ci-test-secret-at-least-32-characters JWT_REFRESH_SECRET: ci-refresh-secret-at-least-32-characters NODE_ENV: test build-and-push: needs: test runs-on: ubuntu-latest if: github.ref == 'refs/heads/main' # only on main branch pushes permissions: contents: read packages: write outputs: image-tag: ${{ steps.meta.outputs.tags }} steps: - uses: actions/checkout@v4 - name: Log in to container registry uses: docker/login-action@v3 with: registry: ${{ env.REGISTRY }} username: ${{ github.actor }} password: ${{ secrets.GITHUB_TOKEN }} - name: Extract metadata id: meta uses: docker/metadata-action@v5 with: images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} tags: | type=sha,prefix=sha- type=raw,value=latest,enable=true - name: Build and push Docker image uses: docker/build-push-action@v5 with: context: . target: production push: true tags: ${{ steps.meta.outputs.tags }} cache-from: type=gha cache-to: type=gha,mode=max deploy: needs: build-and-push runs-on: ubuntu-latest if: github.ref == 'refs/heads/main' steps: - name: Deploy to production uses: appleboy/ssh-action@v1 with: host: ${{ secrets.DEPLOY_HOST }} username: ${{ secrets.DEPLOY_USER }} key: ${{ secrets.DEPLOY_SSH_KEY }} script: | docker pull ghcr.io/${{ github.repository }}:latest docker stop myapp || true docker rm myapp || true docker run -d \ --name myapp \ --restart unless-stopped \ -p 3000:3000 \ --env-file /etc/myapp/.env \ ghcr.io/${{ github.repository }}:latest
Repository secrets to configure
In GitHub → Settings → Secrets and variables → Actions:
DEPLOY_HOST— your server IP or hostnameDEPLOY_USER— SSH username (ubuntu,deploy, etc.)DEPLOY_SSH_KEY— private SSH key (the public key must be in~/.ssh/authorized_keyson the server)
.dockerignore
Prevent unnecessary files from being copied into the build context — this speeds up builds and prevents secrets from leaking into images:
node_modules
dist
.git
.env
.env.*
*.log
*.test.ts
__tests__
coverage
.github
README.md
docker-compose.yml
Summary
- Pino produces JSON logs with near-zero overhead. Never use
console.login production code. - Child loggers (
logger.child({ requestId })) attach context to every log line in a scope without repetition. - Correlation IDs set on every request, propagated to outbound calls, and included in every log — make distributed debugging possible.
- Multi-stage Docker builds separate the TypeScript compile step from the production image. Result: ~150MB images instead of ~800MB.
- Graceful shutdown — handle
SIGTERM, close the server, drain connections, disconnect from databases. Required for zero-downtime deployments. - docker-compose — one command starts Postgres, Redis, and your app for local development with hot reload.
- PM2 cluster mode — saturates all CPU cores with independent Node.js processes. Zero-downtime reload with
pm2 reload. - GitHub Actions — test → build Docker image → push to registry → SSH deploy on every push to main. Gate the deploy step behind test success.
Next: WebSockets and real-time communication — the upgrade from HTTP to WebSocket, broadcasting events to connected clients, presence tracking, and scaling WebSockets across multiple servers with Redis Pub/Sub.
OpenTelemetry — Distributed Traces From Minute One
Pino logs with correlation IDs are necessary. They're not sufficient. When a request to your Express API takes 800ms and you don't know if the time was spent in a database query, an external HTTP call, or application logic, you're debugging blind. Distributed traces tell you exactly where the time went.
OpenTelemetry (OTel) is the industry-standard observability framework — vendor-neutral, supported by every major APM tool (Datadog, New Relic, Jaeger, Grafana Tempo, Honeycomb). Set it up once, export to any backend.
Auto-Instrumentation — Zero Code Changes
The @opentelemetry/auto-instrumentations-node package automatically instruments:
- Express (request spans, route attributes)
- node-postgres / pg (query spans with SQL text)
- ioredis (command spans)
- http/https (outbound request spans)
- dns (resolution spans)
- fs (file operation spans — optional, noisy)
bashnpm install @opentelemetry/sdk-node \ @opentelemetry/auto-instrumentations-node \ @opentelemetry/exporter-trace-otlp-http
javascript// instrumentation.js — must be loaded BEFORE anything else import { NodeSDK } from '@opentelemetry/sdk-node' import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node' import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http' const sdk = new NodeSDK({ serviceName: process.env.SERVICE_NAME ?? 'api-server', traceExporter: new OTLPTraceExporter({ url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318/v1/traces', }), instrumentations: [ getNodeAutoInstrumentations({ '@opentelemetry/instrumentation-fs': { enabled: false }, // too noisy '@opentelemetry/instrumentation-pg': { enhancedDatabaseReporting: true, // includes SQL in span attributes }, }), ], }) sdk.start() // Graceful shutdown — flush pending spans before process exits process.on('SIGTERM', () => sdk.shutdown())
Start your application with the instrumentation file loaded:
bashnode --import ./instrumentation.js src/index.js # or in package.json scripts: "start": "node --import ./instrumentation.js src/index.js"
After this single change, every Express request gets a trace with child spans for every database query, Redis operation, and outbound HTTP call. No application code changes.
Correlating Trace IDs with Pino Logs
Logs and traces are separate. Connect them by injecting the active trace ID into every log line:
javascriptimport { trace, context } from '@opentelemetry/api' import pino from 'pino' const baseLogger = pino({ level: 'info' }) // Middleware that creates a request-scoped logger with trace context app.use((req, res, next) => { const span = trace.getActiveSpan() const spanContext = span?.spanContext() req.log = baseLogger.child({ traceId: spanContext?.traceId, spanId: spanContext?.spanId, requestId: req.headers['x-request-id'], }) next() }) // Use req.log instead of console.log in route handlers app.get('/api/orders', async (req, res) => { req.log.info('Fetching orders') const orders = await getOrders() req.log.info({ count: orders.length }, 'Orders fetched') res.json(orders) })
Now when you search for a trace ID in your log aggregator, you see every log line from that exact request alongside the spans.
Custom Spans for Business Logic
Auto-instrumentation covers I/O. For business logic timing (payment processing, report generation, complex calculations), add manual spans:
javascriptimport { trace } from '@opentelemetry/api' const tracer = trace.getTracer('order-service', '1.0.0') async function processPayment(order) { return tracer.startActiveSpan('process-payment', async (span) => { span.setAttribute('order.id', order.id) span.setAttribute('order.amount_cents', order.amountCents) try { const result = await stripeClient.charges.create({ ... }) span.setAttribute('stripe.charge_id', result.id) span.setStatus({ code: SpanStatusCode.OK }) return result } catch (error) { span.recordException(error) span.setStatus({ code: SpanStatusCode.ERROR, message: error.message }) throw error } finally { span.end() } }) }
The span appears in your trace waterfall between the database query that loaded the order and the Redis operation that caches the result. You can see exactly how long payment processing takes, independently of everything around it.