Module P-5·20 min read

INCR/INCRBY as lock-free counters, fixed-window rate limiter with INCR+EXPIRE, sliding window using Sorted Sets with timestamp scores, token bucket algorithm, and the off-by-one race in fixed-window that Lua eliminates.

P-5 — Atomic Counters, Rate Limiters, and Sliding Windows

Who this module is for: You need to limit how often a user can call your API, track how many times an event has occurred, or implement a quota system. The naive approach — read the counter, check it, increment it — has a race condition. This module covers lock-free atomic counters, the three rate limiting algorithms, and how to choose between them.

The Race Condition in Non-Atomic Counters

Before Redis, a typical rate limiter looked like this in pseudocode:

javascript
const count = await db.query('SELECT count FROM rate_limits WHERE user_id = $1', [userId]);
if (count >= 100) return rateLimitError();
await db.query('UPDATE rate_limits SET count = count + 1 WHERE user_id = $1', [userId]);

Under concurrent requests, two requests can both read count = 99, both pass the check, and both increment — resulting in count = 101, exceeding the limit. This is a classic TOCTOU (Time Of Check, Time Of Use) race.

Redis's INCR command is atomic. The read and increment happen in a single operation that cannot be interleaved with another client's commands.

Atomic Counters with INCR

INCR key             → atomically increment by 1, return new value
INCRBY key n         → atomically increment by n
DECR key             → atomically decrement by 1
DECRBY key n         → atomically decrement by n
INCRBYFLOAT key f    → atomically increment by float f

127.0.0.1:6379> SET page:views:homepage 0
OK
127.0.0.1:6379> INCR page:views:homepage
(integer) 1
127.0.0.1:6379> INCR page:views:homepage
(integer) 2
127.0.0.1:6379> INCRBY page:views:homepage 50
(integer) 52

If the key does not exist, INCR creates it with value 0 and increments to 1. This makes initialization implicit — no SET ... 0 required.

Use case: Event counters, page view tracking, API call totals. Safe under any level of concurrency.

Rate Limiter Pattern 1: Fixed Window

The simplest rate limiter: count requests in a fixed time window (per minute, per hour). Allow up to limit requests per window.

typescript
async function isAllowed(userId: string, limit: number, windowSeconds: number): Promise<boolean> {
  const key = `rate:${userId}:${Math.floor(Date.now() / 1000 / windowSeconds)}`;

  const count = await redis.incr(key);

  if (count === 1) {
    // First request in this window — set expiry
    await redis.expire(key, windowSeconds);
  }

  return count <= limit;
}

The key includes the window index (Math.floor(Date.now() / windowSeconds)) — a new key is automatically created for each window.

The INCR + EXPIRE race: If the process crashes between INCR and EXPIRE, the key has no TTL and persists indefinitely. Fix with SET ... NX EX:

typescript
async function isAllowed(userId: string, limit: number, windowSeconds: number): Promise<boolean> {
  const key = `rate:${userId}:${Math.floor(Date.now() / 1000 / windowSeconds)}`;

  const pipeline = redis.pipeline();
  pipeline.incr(key);
  pipeline.expire(key, windowSeconds, 'NX');  // NX: only set if no TTL exists
  const [[, count]] = await pipeline.exec() as [[null, number], [null, number]];

  return count <= limit;
}

EXPIRE key seconds NX (Redis 7.0+) sets the TTL only if the key does not already have one — safe for the first request, no-op for subsequent ones in the same window.

The boundary problem: At the boundary between windows (e.g., at exactly 12:00:00), a user can make limit requests at 11:59:59 and limit more requests at 12:00:00 — effectively 2×limit requests in 2 seconds. This is the fixed window's fundamental flaw.

Rate Limiter Pattern 2: Sliding Window (Log-Based)

Record a timestamp for every request in a Sorted Set, using the timestamp as the score. The window is the last windowSeconds seconds of timestamps.

typescript
async function isAllowed(userId: string, limit: number, windowMs: number): Promise<boolean> {
  const key = `rate:log:${userId}`;
  const now = Date.now();
  const windowStart = now - windowMs;

  const pipeline = redis.pipeline();
  // Remove requests older than the window
  pipeline.zremrangebyscore(key, 0, windowStart);
  // Add this request
  pipeline.zadd(key, now, `${now}-${Math.random()}`);  // unique member
  // Count requests in the window
  pipeline.zcard(key);
  // Reset TTL
  pipeline.expire(key, Math.ceil(windowMs / 1000) + 1);

  const results = await pipeline.exec() as [null, unknown][];
  const count = results[2][1] as number;

  return count <= limit;
}

Accuracy: The sliding window is exact — it counts precisely the number of requests in the last windowMs milliseconds. No boundary problem.

Memory cost: Each request adds one member to the Sorted Set. At 100 requests/minute/user with a 1-minute window, each user's key holds up to 100 members. At 1 million active users, this is 100 million Sorted Set members — significant but manageable.

Optimization: Instead of storing one member per request, you can use per-second (or per-minute) buckets:

typescript
// Store count per second bucket instead of per request
async function isAllowedBucket(userId: string, limit: number, windowSeconds: number): Promise<boolean> {
  const key = `rate:bucket:${userId}`;
  const now = Math.floor(Date.now() / 1000);  // current second
  const windowStart = now - windowSeconds;

  const pipeline = redis.pipeline();
  pipeline.zremrangebyscore(key, 0, windowStart);
  pipeline.zincrby(key, 1, String(now));  // increment count for current second
  pipeline.zrangebyscore(key, windowStart + 1, '+inf', 'WITHSCORES');
  pipeline.expire(key, windowSeconds + 1);

  const results = await pipeline.exec() as [null, unknown][];
  const entries = results[2][1] as string[];

  let total = 0;
  for (let i = 1; i < entries.length; i += 2) {
    total += parseInt(entries[i], 10);
  }

  return total <= limit;
}

This stores one Sorted Set member per active second instead of per request — up to windowSeconds members regardless of traffic.

Rate Limiter Pattern 3: Token Bucket

The token bucket is the most flexible algorithm. A bucket holds up to capacity tokens. Tokens replenish at refillRate tokens per second. Each request consumes one token. If the bucket is empty, the request is rejected.

This allows short bursts (up to capacity requests instantly) while enforcing a long-term average rate (refillRate per second).

The clean implementation requires reading the last refill time and current token count, computing how many tokens have accumulated, and updating both — a read-modify-write that is a natural fit for Lua scripts:

lua
-- KEYS[1] = token bucket key
-- ARGV[1] = capacity (max tokens)
-- ARGV[2] = refill rate (tokens per second)
-- ARGV[3] = current time (Unix ms)
-- ARGV[4] = tokens requested (usually 1)

local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local data = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(data[1]) or capacity
local last_refill = tonumber(data[2]) or now

-- Calculate tokens earned since last refill
local elapsed_ms = now - last_refill
local new_tokens = (elapsed_ms / 1000) * rate
tokens = math.min(capacity, tokens + new_tokens)

if tokens >= requested then
  tokens = tokens - requested
  redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
  redis.call('PEXPIRE', key, math.ceil(capacity / rate * 1000) + 1000)
  return 1  -- allowed
else
  redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
  redis.call('PEXPIRE', key, math.ceil(capacity / rate * 1000) + 1000)
  return 0  -- rejected
end

typescript
const tokenBucketScript = `...lua script above...`;

// Load the script once
const sha = await redis.script('LOAD', tokenBucketScript);

// Use on every request
async function isAllowed(userId: string): Promise<boolean> {
  const result = await redis.evalsha(
    sha, 1,
    `rate:bucket:${userId}`,
    '100',           // capacity: 100 tokens
    '10',            // refill rate: 10 tokens/sec
    String(Date.now()),
    '1'              // consume 1 token
  );
  return result === 1;
}

Comparing the Three Algorithms

Algorithm	Memory per user	Boundary-safe?	Burst handling	Complexity
Fixed window	O(1) — one String key	No	Full window burst	Simplest
Sliding window log	O(requests in window)	Yes	Precise	Moderate
Token bucket	O(1) — one Hash	Yes	Configurable burst	Lua required

Choose fixed window when:

Simplicity is paramount
Boundary bursts are acceptable (API consumers are unlikely to exploit them)
Very high throughput (the O(1) memory and single INCR command is fastest)

Choose sliding window when:

Accuracy is critical (financial transactions, security-sensitive endpoints)
Request rate per user is moderate (not millions/sec)

Choose token bucket when:

You need burst allowance separate from long-term rate
Consumers legitimately need to make many requests at once occasionally

Distributed Rate Limiting (Multiple App Servers)

All three algorithms work naturally across multiple application servers because the state lives in Redis, not in-process memory. Any server can increment the counter or consume from the bucket.

Caution with multiple Redis nodes (Cluster): The rate limit key for a user must always hit the same Redis node. Use hash tags to ensure this:

rate:{user:1001}:window  → hash tag {user:1001} forces same slot

Without hash tags, a MULTI-key operation (pipeline with ZREMRANGEBYSCORE + ZADD + ZCARD) would fail in Cluster if the keys land on different slots.

Summary

INCR is atomic — read and increment happen together, no race condition possible
INCR + EXPIRE NX is the correct fixed-window rate limiter: creates key and sets TTL atomically
Fixed window: simplest, O(1) memory, but allows 2× limit at window boundaries
Sliding window: exact, O(requests-in-window) memory, no boundary problem
Token bucket: allows configured bursts, long-term rate enforcement, requires Lua for atomicity
In Redis Cluster: use hash tags on rate limit keys to ensure all operations hit the same shard
Load Lua scripts with SCRIPT LOAD + EVALSHA to avoid retransmitting the script on every call

Next: P-6 — BullMQ Internals: The Redis Data Structures Behind the Job Queue — how BullMQ maps job lifecycle to Sorted Sets, Lists, and Hashes, and what knowing the internals tells you about tuning your queue.

PreviousModule P-4: Memory Profiling and Optimization Next Module P-6: BullMQ Internals: The Redis Data Structures Behind the Job Queue