INCR/INCRBY as lock-free counters, fixed-window rate limiter with INCR+EXPIRE, sliding window using Sorted Sets with timestamp scores, token bucket algorithm, and the off-by-one race in fixed-window that Lua eliminates.
P-5 — Atomic Counters, Rate Limiters, and Sliding Windows
Who this module is for: You need to limit how often a user can call your API, track how many times an event has occurred, or implement a quota system. The naive approach — read the counter, check it, increment it — has a race condition. This module covers lock-free atomic counters, the three rate limiting algorithms, and how to choose between them.
The Race Condition in Non-Atomic Counters
Before Redis, a typical rate limiter looked like this in pseudocode:
javascriptconst count = await db.query('SELECT count FROM rate_limits WHERE user_id = $1', [userId]); if (count >= 100) return rateLimitError(); await db.query('UPDATE rate_limits SET count = count + 1 WHERE user_id = $1', [userId]);
Under concurrent requests, two requests can both read count = 99, both pass the check, and both increment — resulting in count = 101, exceeding the limit. This is a classic TOCTOU (Time Of Check, Time Of Use) race.
Redis's INCR command is atomic. The read and increment happen in a single operation that cannot be interleaved with another client's commands.
Atomic Counters with INCR
INCR key → atomically increment by 1, return new value
INCRBY key n → atomically increment by n
DECR key → atomically decrement by 1
DECRBY key n → atomically decrement by n
INCRBYFLOAT key f → atomically increment by float f
127.0.0.1:6379> SET page:views:homepage 0
OK
127.0.0.1:6379> INCR page:views:homepage
(integer) 1
127.0.0.1:6379> INCR page:views:homepage
(integer) 2
127.0.0.1:6379> INCRBY page:views:homepage 50
(integer) 52
If the key does not exist, INCR creates it with value 0 and increments to 1. This makes initialization implicit — no SET ... 0 required.
Use case: Event counters, page view tracking, API call totals. Safe under any level of concurrency.
Rate Limiter Pattern 1: Fixed Window
The simplest rate limiter: count requests in a fixed time window (per minute, per hour). Allow up to limit requests per window.
typescriptasync function isAllowed(userId: string, limit: number, windowSeconds: number): Promise<boolean> { const key = `rate:${userId}:${Math.floor(Date.now() / 1000 / windowSeconds)}`; const count = await redis.incr(key); if (count === 1) { // First request in this window — set expiry await redis.expire(key, windowSeconds); } return count <= limit; }
The key includes the window index (Math.floor(Date.now() / windowSeconds)) — a new key is automatically created for each window.
The INCR + EXPIRE race: If the process crashes between INCR and EXPIRE, the key has no TTL and persists indefinitely. Fix with SET ... NX EX:
typescriptasync function isAllowed(userId: string, limit: number, windowSeconds: number): Promise<boolean> { const key = `rate:${userId}:${Math.floor(Date.now() / 1000 / windowSeconds)}`; const pipeline = redis.pipeline(); pipeline.incr(key); pipeline.expire(key, windowSeconds, 'NX'); // NX: only set if no TTL exists const [[, count]] = await pipeline.exec() as [[null, number], [null, number]]; return count <= limit; }
EXPIRE key seconds NX (Redis 7.0+) sets the TTL only if the key does not already have one — safe for the first request, no-op for subsequent ones in the same window.
The boundary problem: At the boundary between windows (e.g., at exactly 12:00:00), a user can make limit requests at 11:59:59 and limit more requests at 12:00:00 — effectively 2×limit requests in 2 seconds. This is the fixed window's fundamental flaw.
Rate Limiter Pattern 2: Sliding Window (Log-Based)
Record a timestamp for every request in a Sorted Set, using the timestamp as the score. The window is the last windowSeconds seconds of timestamps.
typescriptasync function isAllowed(userId: string, limit: number, windowMs: number): Promise<boolean> { const key = `rate:log:${userId}`; const now = Date.now(); const windowStart = now - windowMs; const pipeline = redis.pipeline(); // Remove requests older than the window pipeline.zremrangebyscore(key, 0, windowStart); // Add this request pipeline.zadd(key, now, `${now}-${Math.random()}`); // unique member // Count requests in the window pipeline.zcard(key); // Reset TTL pipeline.expire(key, Math.ceil(windowMs / 1000) + 1); const results = await pipeline.exec() as [null, unknown][]; const count = results[2][1] as number; return count <= limit; }
Accuracy: The sliding window is exact — it counts precisely the number of requests in the last windowMs milliseconds. No boundary problem.
Memory cost: Each request adds one member to the Sorted Set. At 100 requests/minute/user with a 1-minute window, each user's key holds up to 100 members. At 1 million active users, this is 100 million Sorted Set members — significant but manageable.
Optimization: Instead of storing one member per request, you can use per-second (or per-minute) buckets:
typescript// Store count per second bucket instead of per request async function isAllowedBucket(userId: string, limit: number, windowSeconds: number): Promise<boolean> { const key = `rate:bucket:${userId}`; const now = Math.floor(Date.now() / 1000); // current second const windowStart = now - windowSeconds; const pipeline = redis.pipeline(); pipeline.zremrangebyscore(key, 0, windowStart); pipeline.zincrby(key, 1, String(now)); // increment count for current second pipeline.zrangebyscore(key, windowStart + 1, '+inf', 'WITHSCORES'); pipeline.expire(key, windowSeconds + 1); const results = await pipeline.exec() as [null, unknown][]; const entries = results[2][1] as string[]; let total = 0; for (let i = 1; i < entries.length; i += 2) { total += parseInt(entries[i], 10); } return total <= limit; }
This stores one Sorted Set member per active second instead of per request — up to windowSeconds members regardless of traffic.
Rate Limiter Pattern 3: Token Bucket
The token bucket is the most flexible algorithm. A bucket holds up to capacity tokens. Tokens replenish at refillRate tokens per second. Each request consumes one token. If the bucket is empty, the request is rejected.
This allows short bursts (up to capacity requests instantly) while enforcing a long-term average rate (refillRate per second).
The clean implementation requires reading the last refill time and current token count, computing how many tokens have accumulated, and updating both — a read-modify-write that is a natural fit for Lua scripts:
lua-- KEYS[1] = token bucket key -- ARGV[1] = capacity (max tokens) -- ARGV[2] = refill rate (tokens per second) -- ARGV[3] = current time (Unix ms) -- ARGV[4] = tokens requested (usually 1) local key = KEYS[1] local capacity = tonumber(ARGV[1]) local rate = tonumber(ARGV[2]) local now = tonumber(ARGV[3]) local requested = tonumber(ARGV[4]) local data = redis.call('HMGET', key, 'tokens', 'last_refill') local tokens = tonumber(data[1]) or capacity local last_refill = tonumber(data[2]) or now -- Calculate tokens earned since last refill local elapsed_ms = now - last_refill local new_tokens = (elapsed_ms / 1000) * rate tokens = math.min(capacity, tokens + new_tokens) if tokens >= requested then tokens = tokens - requested redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now) redis.call('PEXPIRE', key, math.ceil(capacity / rate * 1000) + 1000) return 1 -- allowed else redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now) redis.call('PEXPIRE', key, math.ceil(capacity / rate * 1000) + 1000) return 0 -- rejected end
typescriptconst tokenBucketScript = `...lua script above...`; // Load the script once const sha = await redis.script('LOAD', tokenBucketScript); // Use on every request async function isAllowed(userId: string): Promise<boolean> { const result = await redis.evalsha( sha, 1, `rate:bucket:${userId}`, '100', // capacity: 100 tokens '10', // refill rate: 10 tokens/sec String(Date.now()), '1' // consume 1 token ); return result === 1; }
Comparing the Three Algorithms
| Algorithm | Memory per user | Boundary-safe? | Burst handling | Complexity |
|---|---|---|---|---|
| Fixed window | O(1) — one String key | No | Full window burst | Simplest |
| Sliding window log | O(requests in window) | Yes | Precise | Moderate |
| Token bucket | O(1) — one Hash | Yes | Configurable burst | Lua required |
Choose fixed window when:
- Simplicity is paramount
- Boundary bursts are acceptable (API consumers are unlikely to exploit them)
- Very high throughput (the O(1) memory and single INCR command is fastest)
Choose sliding window when:
- Accuracy is critical (financial transactions, security-sensitive endpoints)
- Request rate per user is moderate (not millions/sec)
Choose token bucket when:
- You need burst allowance separate from long-term rate
- Consumers legitimately need to make many requests at once occasionally
Distributed Rate Limiting (Multiple App Servers)
All three algorithms work naturally across multiple application servers because the state lives in Redis, not in-process memory. Any server can increment the counter or consume from the bucket.
Caution with multiple Redis nodes (Cluster): The rate limit key for a user must always hit the same Redis node. Use hash tags to ensure this:
rate:{user:1001}:window → hash tag {user:1001} forces same slot
Without hash tags, a MULTI-key operation (pipeline with ZREMRANGEBYSCORE + ZADD + ZCARD) would fail in Cluster if the keys land on different slots.
Summary
INCRis atomic — read and increment happen together, no race condition possibleINCR+EXPIRE NXis the correct fixed-window rate limiter: creates key and sets TTL atomically- Fixed window: simplest, O(1) memory, but allows 2× limit at window boundaries
- Sliding window: exact, O(requests-in-window) memory, no boundary problem
- Token bucket: allows configured bursts, long-term rate enforcement, requires Lua for atomicity
- In Redis Cluster: use hash tags on rate limit keys to ensure all operations hit the same shard
- Load Lua scripts with
SCRIPT LOAD+EVALSHAto avoid retransmitting the script on every call
Next: P-6 — BullMQ Internals: The Redis Data Structures Behind the Job Queue — how BullMQ maps job lifecycle to Sorted Sets, Lists, and Hashes, and what knowing the internals tells you about tuning your queue.