Module A-12 · 20 min read

Active-active (CRDT-based) vs active-passive multi-region Redis, Redis Enterprise geo-replication, conflict resolution strategies, latency-vs-consistency trade-offs, and when global distribution is the wrong answer.

A-12 — Multi-Region Redis: Active-Active and Geo-Replication

Who this module is for: You serve users across multiple geographic regions and need Redis to be close to each user — low-latency reads and writes from any region. This module covers the two multi-region Redis architectures (active-passive and active-active), the CRDT-based conflict resolution that enables active-active, and when global Redis distribution creates more problems than it solves.


Why Multi-Region Redis

A Redis instance in us-east-1 adds roughly 150–250ms of round-trip time for a user in ap-southeast-1, which completely negates Redis's sub-millisecond advantage over a database. For globally distributed applications, you need Redis deployed close to each user group.

Two architectural options:

Active-Passive:  One region writes, other regions read (with replication lag)
Active-Active:   All regions write; changes propagate and conflicts are resolved

Active-Passive: Regional Read Replicas

The simplest multi-region setup: the primary Redis is in one region; replicas in other regions serve reads with some lag.

us-east-1 (Primary): accepts all writes
      │
      │ async replication (100–200ms cross-region latency)
      ▼
eu-west-1 (Replica): serves reads (100–200ms behind primary)
ap-southeast-1 (Replica): serves reads (150–250ms behind primary)

Read path: Applications in eu-west-1 read from the local replica with < 1ms latency.

Write path: All writes go to us-east-1. A write issued by an API server in ap-southeast-1 pays 150–250ms of round-trip time to reach the primary, which is unacceptable for write-heavy workloads.

Failover: If the primary fails, you manually promote a replica with REPLICAOF NO ONE and update your application's connection strings. Sentinel can automate this within a single region, but it is a poor fit across regions: WAN latency and transient partitions make cross-region failure detection unreliable.
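The manual promotion step can be planned in code before anyone touches a server. The sketch below is a minimal illustration under stated assumptions: it picks the reachable replica that has applied the most of the replication stream (the `slave_repl_offset` value reported by `INFO replication`) and lists the commands an operator would run. The `ReplicaStatus` shape, the host names, and the `planFailover` helper are hypothetical; only `REPLICAOF NO ONE` and `REPLICAOF <host> <port>` are real Redis commands.

```typescript
// Hypothetical snapshot of one replica, e.g. parsed from `INFO replication`.
interface ReplicaStatus {
  host: string;
  port: number;
  replOffset: number; // slave_repl_offset: how much of the stream was applied
  reachable: boolean;
}

interface FailoverPlan {
  promote: ReplicaStatus;
  commands: string[]; // commands to run in order, prefixed by the target host
}

// Choose the reachable replica with the highest offset (least data loss),
// then repoint every other reachable replica at the new primary.
function planFailover(replicas: ReplicaStatus[]): FailoverPlan | null {
  const candidates = replicas.filter((r) => r.reachable);
  if (candidates.length === 0) return null;

  const promote = candidates.reduce((best, r) =>
    r.replOffset > best.replOffset ? r : best
  );

  const commands = [`${promote.host}: REPLICAOF NO ONE`];
  for (const r of candidates) {
    if (r !== promote) {
      commands.push(`${r.host}: REPLICAOF ${promote.host} ${promote.port}`);
    }
  }
  return { promote, commands };
}
```

Writes that the failed primary accepted but had not yet replicated are lost even with the best candidate, so the plan reduces data loss rather than eliminating it.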

When active-passive is appropriate:

  • Read-heavy workloads where most data is read many times and written rarely
  • Data that is written in one primary region (user-generated content flow is unidirectional)
  • Cache workloads where cross-region write latency is acceptable

Active-Active: Every Region Writes Locally

In active-active mode, each regional Redis instance accepts writes. Changes propagate to other regions asynchronously. When two regions write to the same key simultaneously, conflict resolution determines the outcome.

us-east-1 (Primary 1): accepts writes from US users
eu-west-1 (Primary 2): accepts writes from EU users
ap-southeast-1 (Primary 3): accepts writes from APAC users
          │           │           │
          └───────────┴───────────┘
              bidirectional replication
              (100–300ms cross-region)

Write path: Each region writes locally with < 1ms latency. The write propagates to other regions asynchronously.

Conflict scenario:

T=0ms:  US user sets key "product:999:stock" to 5
T=0ms:  EU user simultaneously sets key "product:999:stock" to 3
T=200ms: US change arrives in EU, EU change arrives in US
→ What is the final value? 5? 3? Something else?
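To see why plain replication is not enough, the scenario above can be replayed in a few lines. This is an in-memory sketch, not Redis code: the `Store` type and the `simulateNaiveReplication` helper are hypothetical, and it assumes each region simply applies writes in the order they arrive.

```typescript
type Store = Map<string, string>;

// Apply a SET exactly as received, with no conflict resolution.
function applySet(store: Store, key: string, value: string): void {
  store.set(key, value);
}

function simulateNaiveReplication(): { us: string; eu: string } {
  const us: Store = new Map();
  const eu: Store = new Map();
  const key = "product:999:stock";

  // T=0ms: each region applies its own local write.
  applySet(us, key, "5");
  applySet(eu, key, "3");

  // T=200ms: each region receives and applies the other region's write.
  applySet(us, key, "3"); // EU's write arrives in US
  applySet(eu, key, "5"); // US's write arrives in EU

  return { us: us.get(key) ?? "", eu: eu.get(key) ?? "" };
}
```

Each region ends up holding the other's value (US has "3", EU has "5"), so the replicas never converge. Conflict resolution exists to prevent exactly this.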

CRDTs: Conflict-Free Replicated Data Types

CRDTs (Conflict-Free Replicated Data Types) are data structures designed so that concurrent operations from different replicas can be merged without conflicts — regardless of order.

Redis Enterprise Geo-Distribution (and Redis Cloud) implements CRDTs for Redis data types:

Redis Type       CRDT Semantics
String           Last-Write-Wins (LWW) based on logical timestamp
Counter (INCR)   All increments are summed across regions
Set              All SADDs are merged; SREMs remove only the adds they have observed
Sorted Set       All ZADDs are merged; conflicts resolved by LWW
Hash             Field-level LWW; different fields can come from different regions
List             Append-only merging; ordering by logical timestamp
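The Hash row is worth a concrete illustration, because field-level merging is what lets two regions edit different fields of the same key without losing either edit. The following is a minimal in-memory sketch of that idea, not Redis Enterprise's implementation; the `StampedField` shape, the region-name tie-break, and the function names are assumptions made for the example.

```typescript
interface StampedField {
  value: string;
  ts: number;     // logical timestamp of the write
  region: string; // region that issued the write, used only to break ties
}

type LwwHash = Map<string, StampedField>;

// Pick the later write; on equal timestamps fall back to the region name
// so that every replica makes the same choice.
function newerField(a: StampedField, b: StampedField): StampedField {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.region >= b.region ? a : b;
}

// Merge two replicas of the same hash, field by field.
function mergeHash(a: LwwHash, b: LwwHash): LwwHash {
  const out: LwwHash = new Map(a);
  b.forEach((incoming, field) => {
    const existing = out.get(field);
    out.set(field, existing ? newerField(existing, incoming) : incoming);
  });
  return out;
}
```

If the US replica holds `name` and an older `email`, and the EU replica holds only a newer `email`, the merged hash keeps the US `name` and the EU `email`, whichever side performs the merge.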

Last-Write-Wins (LWW)

The write with the highest logical timestamp (Lamport clock or hybrid logical clock) wins. For String types:

US at T=1000: SET product:999:stock "5"
EU at T=1001: SET product:999:stock "3"   (slightly later timestamp)
→ EU's write wins: final value = "3"

LWW is simple but has a failure mode: if two writes land within clock synchronisation tolerance of each other, the winner is decided by clock drift rather than by which write actually happened last. From the application's perspective the outcome is arbitrary.
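Both the rule and its failure mode fit in a short sketch. The code below is an illustrative in-memory register, assuming timestamps come from each region's own clock; the `LwwRegister` shape, the `stamp` helper, and the skew values are hypothetical.

```typescript
interface LwwRegister {
  value: string;
  ts: number;     // timestamp assigned by the writing region
  region: string; // tie-breaker so all replicas agree on exact ties
}

// Keep the write with the higher timestamp; break exact ties by region name.
function mergeLww(a: LwwRegister, b: LwwRegister): LwwRegister {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.region >= b.region ? a : b;
}

// Stamp a write using the region's local clock: true time plus that region's skew.
function stamp(
  value: string,
  trueTimeMs: number,
  skewMs: number,
  region: string
): LwwRegister {
  return { value, ts: trueTimeMs + skewMs, region };
}
```

With perfectly synchronised clocks, `stamp("5", 1000, 0, "us")` merged with `stamp("3", 1001, 0, "eu")` yields "3", matching the example above. If the US clock runs 5ms slow, a US write made at true time 1002 is stamped 997 and loses to an EU write made at true time 1000, even though the US write was really the later one.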

Counter Convergence

For INCR/DECR, CRDTs sum all increments from all regions:

US: INCRBY inventory 100
EU: INCRBY inventory 50
APAC: DECRBY inventory 30
→ Convergent value: 100 + 50 - 30 = 120 (regardless of order or timing)

This is the correct semantics for distributed counters — page views, inventory adjustments, score increments.
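One textbook way to get this behaviour is a state-based PN-counter: each region tracks only its own running totals, and merging takes the per-region maximum. The sketch below is an assumption about how such a counter can be built, not a description of Redis Enterprise internals; all names are hypothetical.

```typescript
interface PnCounter {
  incs: Record<string, number>; // total increments issued by each region
  decs: Record<string, number>; // total decrements issued by each region
}

function emptyCounter(): PnCounter {
  return { incs: {}, decs: {} };
}

// Record a local INCRBY (positive n) or DECRBY (negative n) for one region.
function incrBy(c: PnCounter, region: string, n: number): PnCounter {
  const next: PnCounter = { incs: { ...c.incs }, decs: { ...c.decs } };
  const side = n >= 0 ? next.incs : next.decs;
  side[region] = (side[region] ?? 0) + Math.abs(n);
  return next;
}

// Per-region totals only ever grow, so the larger one is the more recent.
function mergeMax(
  a: Record<string, number>,
  b: Record<string, number>
): Record<string, number> {
  const out: Record<string, number> = { ...a };
  for (const region of Object.keys(b)) {
    out[region] = Math.max(out[region] ?? 0, b[region] ?? 0);
  }
  return out;
}

function mergeCounters(a: PnCounter, b: PnCounter): PnCounter {
  return { incs: mergeMax(a.incs, b.incs), decs: mergeMax(a.decs, b.decs) };
}

function counterValue(c: PnCounter): number {
  const sum = (m: Record<string, number>) =>
    Object.keys(m).reduce((s, k) => s + (m[k] ?? 0), 0);
  return sum(c.incs) - sum(c.decs);
}
```

Merging the three regional states from the example in any order gives 120, and merging the same state twice changes nothing, which is what makes redelivered replication messages harmless.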

Set Merge Semantics

For Sets, the CRDT merges all adds and respects the "observed-remove" rule — a delete only removes adds that the deleting replica has observed:

US: SADD tags "redis"       → {redis}
EU: SADD tags "postgres"    → {postgres}
(changes propagate)
Final: {redis, postgres}    ← union

US: SREM tags "redis"       → removes the "redis" add that US observed
EU: SADD tags "redis"       → adds "redis" again simultaneously
(changes propagate)
Final: {redis, postgres}    ← EU's concurrent add survives because US had not observed it when it removed "redis"

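This can produce unintuitive results: a concurrent add overrides a delete, so an element you just removed can reappear once replication catches up. The trade-off is that every replica is guaranteed to converge on the same set.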
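The observed-remove rule is easier to follow with the bookkeeping made explicit. In the sketch below, every add carries a unique tag and a remove tombstones only the tags the removing replica has seen. This is a generic observed-remove set written for illustration, with hypothetical names and hand-picked tags, and is not claimed to match Redis Enterprise's internal representation.

```typescript
interface OrSet {
  adds: Map<string, Set<string>>; // element -> unique tags, one per add
  removed: Set<string>;           // tags whose adds have been removed
}

function emptySet(): OrSet {
  return { adds: new Map(), removed: new Set() };
}

function cloneSet(s: OrSet): OrSet {
  const adds = new Map<string, Set<string>>();
  s.adds.forEach((tags, el) => { adds.set(el, new Set(tags)); });
  return { adds, removed: new Set(s.removed) };
}

// `tag` must be globally unique, e.g. region name plus a local sequence number.
function sadd(s: OrSet, element: string, tag: string): OrSet {
  const next = cloneSet(s);
  const tags = next.adds.get(element) ?? new Set<string>();
  tags.add(tag);
  next.adds.set(element, tags);
  return next;
}

// Tombstone only the adds this replica has observed so far.
function srem(s: OrSet, element: string): OrSet {
  const next = cloneSet(s);
  (next.adds.get(element) ?? new Set<string>()).forEach((t) => {
    next.removed.add(t);
  });
  return next;
}

// Merging is a plain union of both the adds and the tombstones.
function mergeSets(a: OrSet, b: OrSet): OrSet {
  const out = cloneSet(a);
  b.adds.forEach((tags, el) => {
    const merged = out.adds.get(el) ?? new Set<string>();
    tags.forEach((t) => { merged.add(t); });
    out.adds.set(el, merged);
  });
  b.removed.forEach((t) => { out.removed.add(t); });
  return out;
}

// An element is present if at least one of its adds has not been tombstoned.
function members(s: OrSet): string[] {
  const live: string[] = [];
  s.adds.forEach((tags, el) => {
    if (Array.from(tags).some((t) => !s.removed.has(t))) live.push(el);
  });
  return live.sort();
}
```

Replaying the example: starting from a shared state of "redis" (tag `us-1`) and "postgres" (tag `eu-1`), the US remove tombstones `us-1` while the EU add introduces `eu-2`. After merging, "redis" still has the live tag `eu-2`, so it remains in the set.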


Redis Enterprise vs Redis OSS for Active-Active

Open-source Redis does not natively support active-active geo-replication. The replication system is designed for primary-replica, not peer-to-peer.

Redis Enterprise (commercial product from Redis Ltd.) provides:

  • Active-Active geo-distribution with CRDT semantics
  • Automatic conflict resolution
  • Global keyspace — all regions share the same logical database
  • WAN-optimised replication (delta compression, bandwidth throttling)

Redis Cloud (managed Redis Enterprise) is the SaaS offering.

Alternatives for OSS Redis:

  • Roshi (SoundCloud's approach): application-layer CRDT using Sorted Sets with timestamp scores; reads merge the results from multiple Redis clusters
  • Application-level coordination: accept that writes go to one authoritative region and reads may be stale in other regions
  • Consistent hashing + per-region primary: different keys are "owned" by different regions; cross-region reads accept the latency
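The per-region ownership option in the list above needs a function that maps each key to exactly one owning region and that moves as few keys as possible when a region is added or removed. The sketch below uses rendezvous (highest-random-weight) hashing, which is one simple way to get consistent-hashing behaviour; the choice of FNV-1a and the function names are assumptions for the example, not something the options above prescribe.

```typescript
// FNV-1a, 32-bit, over UTF-16 code units. Non-cryptographic; chosen for brevity.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Rendezvous hashing: score every region against the key; the highest score owns it.
function ownerRegion(key: string, regions: string[]): string {
  let best: string | undefined;
  let bestScore = -1;
  for (const region of regions) {
    const score = fnv1a(`${region}:${key}`);
    if (score > bestScore) {
      best = region;
      bestScore = score;
    }
  }
  if (best === undefined) throw new Error("no regions configured");
  return best;
}
```

Writes for a key go to `ownerRegion(key, regions)`; every other region either forwards the write or reads from a replica of the owner. Removing a region reassigns only the keys that region owned, because every other key's highest-scoring region is unchanged.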

The Consistency Trade-off

Active-active replication is eventually consistent. Between when a write is applied in one region and when it propagates to all others (100–300ms for cross-region), different users see different values:

US user writes: SET username:1001 "Jatin"
150ms later, EU user reads: GET username:1001
→ EU replica: "OldName"  (replication hasn't arrived yet)
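The stale read above can be reproduced with a simulated clock. The class below is a toy model written for illustration: `LaggedReplica`, its method names, and the fixed 200ms lag are hypothetical, and real replication lag varies from write to write.

```typescript
interface PendingWrite {
  key: string;
  value: string;
  arrivesAtMs: number;
}

class LaggedReplica {
  private data = new Map<string, string>();
  private inbox: PendingWrite[] = [];

  // Queue a write accepted in another region at `sentAtMs`; it becomes
  // visible here only after `lagMs` has elapsed.
  receive(key: string, value: string, sentAtMs: number, lagMs: number): void {
    this.inbox.push({ key, value, arrivesAtMs: sentAtMs + lagMs });
  }

  // Read at simulated time `nowMs`: apply every write that has arrived, then look up.
  get(key: string, nowMs: number): string | undefined {
    const arrived = this.inbox.filter((w) => w.arrivesAtMs <= nowMs);
    this.inbox = this.inbox.filter((w) => w.arrivesAtMs > nowMs);
    arrived
      .sort((a, b) => a.arrivesAtMs - b.arrivesAtMs)
      .forEach((w) => { this.data.set(w.key, w.value); });
    return this.data.get(key);
  }
}
```

If "OldName" was replicated long ago and the US write of "Jatin" is sent at T=0 with 200ms of lag, a read from the EU replica at T=150 still returns "OldName", and a read at T=200 returns "Jatin".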

For some use cases this is acceptable:

  • Product recommendations (briefly stale is fine)
  • Feature flags (a user in EU and a user in US seeing different states for 200ms is acceptable)
  • Leaderboards (eventual consistency is expected)

For some it is not:

  • Account balances (EU and US must agree on the balance)
  • Inventory counts (two regions cannot both sell the last item)
  • Sessions (a session created in the US must be immediately valid in EU)

For strong consistency across regions: use a database with cross-region transactions (CockroachDB, Spanner, YugabyteDB), not Redis.


Practical Patterns for Multi-Region Redis Without Active-Active

Regional Primary with Global Read Replicas

typescript
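import Redis from "ioredis";

// Each region has its own Redis primary for local writes,
// plus read replicas of all other regions' primaries.
// Hypothetical endpoints: the US primary and its replica in the EU.
const usRedis = new Redis({ host: "redis-primary.us-east-1.example.internal" });
const euRedis = new Redis({ host: "redis-replica.eu-west-1.example.internal" });

// When a US user writes: write to the local primary.
async function saveProfile(id: string, data: string): Promise<void> {
  await usRedis.set(`user:${id}:profile`, data);
}

// When the user reads in the EU: read the local replica and accept a small lag.
async function loadProfile(id: string): Promise<string | null> {
  const profile = await euRedis.get(`user:${id}:profile`);
  if (profile !== null) return profile;
  // Fall back to the US primary if the EU replica has not received the write yet.
  // This catches only missing keys; a stale value is still returned as-is.
  return usRedis.get(`user:${id}:profile`);
}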

Sticky Sessions by Region

Route each user's requests to their "home region" for consistent reads and writes. Use geolocation at the CDN/load balancer layer:

US users → us-east-1 Redis (reads and writes)
EU users → eu-west-1 Redis (reads and writes)
# Cross-region migration: replicate with lag; brief inconsistency on region change
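The routing decision itself is small. The sketch below assumes the user's home region is pinned at sign-up and that the CDN or load balancer supplies a country code for users who have no pinned region yet. The endpoint host names, the country table, and the `routeUser` helper are all hypothetical.

```typescript
type Region = "us-east-1" | "eu-west-1" | "ap-southeast-1";

// Hypothetical Redis endpoints, one per region.
const REDIS_ENDPOINTS: Record<Region, string> = {
  "us-east-1": "redis.us-east-1.example.internal",
  "eu-west-1": "redis.eu-west-1.example.internal",
  "ap-southeast-1": "redis.ap-southeast-1.example.internal",
};

// Hypothetical and deliberately incomplete country -> region table.
const HOME_REGION_BY_COUNTRY: Record<string, Region | undefined> = {
  US: "us-east-1",
  DE: "eu-west-1",
  FR: "eu-west-1",
  SG: "ap-southeast-1",
};

const DEFAULT_REGION: Region = "us-east-1";

interface UserRoute {
  region: Region;
  endpoint: string;
}

// Prefer the pinned home region over current geolocation, so a travelling
// user keeps reading and writing in the region that holds their data.
function routeUser(
  pinnedRegion: Region | undefined,
  countryCode: string | undefined
): UserRoute {
  const geoRegion =
    countryCode !== undefined
      ? HOME_REGION_BY_COUNTRY[countryCode.toUpperCase()]
      : undefined;
  const region = pinnedRegion ?? geoRegion ?? DEFAULT_REGION;
  return { region, endpoint: REDIS_ENDPOINTS[region] };
}
```

Pinning matters because geolocation alone would move a travelling EU user to the US instance mid-session, which is exactly the region change that causes the brief inconsistency noted above.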

Read-Your-Own-Writes via Sticky Routing

After a write, route subsequent reads to the same region (where the write is definitely present) for the next few seconds:

typescript
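import Redis from "ioredis";

// Hypothetical endpoints. `localRedis` is assumed to be a writable instance in
// the caller's own region that stores only routing tags; a read-only replica
// cannot hold them, and tagging via the primary would itself be subject to lag.
const primaryRedis = new Redis({ host: "redis-primary.us-east-1.example.internal" });
const localReplicaRedis = new Redis({ host: "redis-replica.eu-west-1.example.internal" });
const localRedis = new Redis({ host: "redis-local.eu-west-1.example.internal" });

async function write(key: string, value: string, userId: string): Promise<void> {
  await primaryRedis.set(key, value);
  // Tag this user as "just wrote" for 5 seconds.
  await localRedis.set(`write-tag:${userId}`, "1", "EX", 5);
}

async function read(key: string, userId: string): Promise<string | null> {
  // EXISTS returns the number of matching keys: 1 if the tag is still live.
  const justWrote = (await localRedis.exists(`write-tag:${userId}`)) === 1;
  if (justWrote) {
    // Route to the primary to read our own write.
    return primaryRedis.get(key);
  }
  // Route to the local replica (possibly stale, but the user has not written recently).
  return localReplicaRedis.get(key);
}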

Summary

  • Active-passive: one primary region, read replicas in other regions — low write latency only in the primary region
  • Active-active: every region writes locally — requires CRDT conflict resolution; supported by Redis Enterprise/Cloud, not OSS Redis
  • CRDTs resolve conflicts via: LWW for Strings, sum-all-increments for counters, merge for Sets, field-level LWW for Hashes
  • Active-active is eventually consistent — writes propagate with 100–300ms cross-region lag
  • Do not use Redis active-active for strong consistency requirements (account balances, inventory) — use a transactional database
  • OSS Redis alternatives: regional primaries + cross-region read replicas, sticky user routing, application-layer CRDT patterns
  • The question "should I use multi-region Redis?" often has the answer "no" — regional cache with fallback is simpler and correct for most use cases

Next: A-13 — Disaster Recovery, Backup, and Point-in-Time Restore — RDB backup scheduling, AOF log shipping, recovery runbooks, and testing restore procedures before you need them.

© 2026 Jatin Jain Saraf (JJS). All rights reserved.