System Design | 12 min read | May 05, 2026

Rate Limiter Deep Dive

Design fault-tolerant rate limiting microservices capable of scaling to support millions of client requests.

Rate limiters protect API gateways from Denial of Service (DoS) attacks, brute-force attempts, and downstream service starvation. At high throughput, the limiter's counters are usually kept in a low-latency in-memory store such as Redis, so that every request can be checked without adding noticeable delay.

Algorithms Comparison

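Token Bucket: Tokens are added at a fixed rate up to the bucket capacity, and each request spends one. Allows traffic bursts up to the bucket capacity. Memory-efficient, since it stores only a token count and a timestamp per client.
Leaky Bucket: Requests are queued and processed at a constant leak rate. Smooths traffic spikes but adds queuing latency.
Sliding Window Log: Logs a timestamp in a sorted set (ZSET) for every request and counts the entries inside the window. High precision, but memory grows with the number of requests in the window.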


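// Token Bucket Rate Limiting implementation in Go
import (
    "math"
    "sync"
    "time"
)

// TokenBucket holds the limiter state; mu guards tokens and lastRefilled.
type TokenBucket struct {
    rate         float64 // tokens added per second
    capacity     float64 // maximum tokens the bucket can hold
    tokens       float64 // tokens currently available
    lastRefilled time.Time
    mu           sync.Mutex
}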

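// Allow reports whether a request may proceed, spending one token if so.
func (tb *TokenBucket) Allow() bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    // Refill lazily: credit the tokens earned since the last call,
    // capped at the bucket capacity.
    now := time.Now()
    elapsed := now.Sub(tb.lastRefilled).Seconds()
    tb.tokens = math.Min(tb.capacity, tb.tokens+(elapsed*tb.rate))
    tb.lastRefilled = now

    // Spend one token if at least one is available.
    if tb.tokens >= 1.0 {
        tb.tokens -= 1.0
        return true
    }
    return false
}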

Distributed Bottlenecks

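In a clustered environment, several limiter instances share one counter, so a race condition can occur between checking the token count and decrementing it: two instances may both read one remaining token and both admit a request. To make the check and the decrement atomic, put them in a Redis Lua script. Redis runs each script to completion inside its single-threaded event loop, so no other command can interleave with it.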
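
The atomicity requirement can be illustrated in-process before involving Redis at all. The sketch below is an analogy, not Redis code: the hypothetical `SharedCounter` plays the shared token count, and its mutex plays the role of Redis running one script at a time. `TryTake` and `AdmitConcurrently` are illustrative names.

```go
package main

import "sync"

// SharedCounter is a hypothetical stand-in for a token count held in a
// shared store. The mutex models the store executing one script at a time.
type SharedCounter struct {
	mu     sync.Mutex
	tokens int
}

// TryTake performs the check and the decrement as one indivisible step,
// which is the guarantee a Lua script provides on the Redis side.
func (c *SharedCounter) TryTake() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.tokens <= 0 {
		return false
	}
	c.tokens--
	return true
}

// AdmitConcurrently fires n simultaneous requests at the counter and
// returns how many were admitted.
func AdmitConcurrently(c *SharedCounter, n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex // guards admitted
	admitted := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if c.TryTake() {
				mu.Lock()
				admitted++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return admitted
}
```

With 10 tokens and 100 concurrent callers, exactly 10 are admitted. If the check and the decrement were two separate calls, two callers could both observe the last token and both proceed, which is the over-admission the Lua script is meant to prevent.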
