Authors: Souvik Sarkar, Professor Sanchita Ghosh
Abstract: In contemporary large-scale distributed systems, limiting user request rates across multiple servers without a centralized bottleneck is a core problem. This project presents the design and implementation of a scalable, decentralized, distributed rate limiter based on the Token Bucket algorithm and Conflict-free Replicated Data Type (CRDT) principles, which provide eventual consistency between nodes. The system uniquely identifies users, applies configurable rate limits, and synchronizes token states across multiple server instances without depending on a central database or coordinator. Kafka, running in KRaft (Kafka Raft metadata) mode, serves as the decentralized message bus for state propagation between services, providing low synchronization latency while handling millions of concurrent users. To provide high availability and fault tolerance, several instances of the rate limiter service run in Docker containers behind an NGINX load balancer, supporting dynamic scaling and automatic traffic routing. The architecture tolerates temporary divergence in token values, but CRDT merging guarantees that the system automatically corrects itself without permitting requests above the specified rate limits. A stress testing suite is also implemented to verify the system's performance under high concurrency. The project demonstrates that decentralized rate limiting with eventual consistency guarantees can be achieved at scale using contemporary concepts in distributed systems, containerization, and message-driven architecture, making it suitable for real-world deployments such as API rate limiting, distributed authentication throttling, and multi-region request control systems.
DOI: http://doi.org/
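To make the combination of Token Bucket and CRDT merging described in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each node keeps a local token bucket per user and replicates a grow-only counter (G-Counter) of tokens consumed per node. The class name `ReplicatedBucket`, its parameters, and the `merge` method are hypothetical names chosen for illustration, and the Kafka transport that would carry the counters between nodes is omitted.

```python
from typing import Dict


class ReplicatedBucket:
    """One user's token bucket on one node (illustrative sketch only).

    `consumed` is a G-Counter: each node increments only its own slot,
    and replicas are merged by taking the per-node maximum.
    """

    def __init__(self, node_id: str, capacity: float, refill_rate: float) -> None:
        self.node_id = node_id
        self.capacity = capacity          # maximum burst size (assumed parameter)
        self.refill_rate = refill_rate    # tokens added per second (assumed parameter)
        self.tokens = capacity            # start with a full bucket
        self.last_refill = 0.0            # timestamp of the last refill, in seconds
        self.consumed: Dict[str, int] = {}

    def _refill(self, now: float) -> None:
        # Standard token bucket refill, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = max(self.last_refill, now)

    def try_acquire(self, now: float) -> bool:
        """Admit one request if a token is available at time `now`."""
        self._refill(now)
        if self.tokens < 1.0:
            return False
        self.tokens -= 1.0
        self.consumed[self.node_id] = self.consumed.get(self.node_id, 0) + 1
        return True

    def merge(self, remote_consumed: Dict[str, int]) -> None:
        """Merge a remote G-Counter and deduct newly learned consumption.

        Taking the per-node maximum makes the merge commutative,
        associative, and idempotent, so duplicated or reordered
        messages do not deduct the same tokens twice.
        """
        for node, count in remote_consumed.items():
            if node == self.node_id:
                continue  # this node is the only writer of its own slot
            known = self.consumed.get(node, 0)
            if count > known:
                self.tokens = max(0.0, self.tokens - (count - known))
                self.consumed[node] = count
```

In this sketch, two nodes that each admit requests independently converge on the same consumed totals once they exchange counters in either order. Between merges each node sees only its own consumption, so the sketch by itself can briefly admit more than the configured limit. How the described system bounds that window is not specified in the abstract.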