The gateway is the traffic entry point of a microservices architecture, and rate limiting and circuit breaking are the two lines of defense that protect downstream services. This article records how I actually configured these two features in production. It is not a translation of the documentation: it covers only the parts I actually used and the pitfalls I ran into.
Environment
- Spring Cloud Gateway 4.x
- Spring Boot 3.x
- Resilience4j (Circuit Breaker)
- Redis (Rate Limiting Storage)
Rate Limiting: Redis Token Bucket
Gateway ships with a Redis-based token bucket rate limiter. The principle: each route configures a token bucket (tracked in Redis per resolved key), requests consume tokens, and tokens are replenished at a fixed rate.
Dependencies
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-gateway</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>
Configuration
spring:
  cloud:
    gateway:
      routes:
        - id: user-service
          uri: lb://user-service
          predicates:
            - Path=/api/user/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100   # Replenish 100 tokens per second
                redis-rate-limiter.burstCapacity: 200   # Bucket capacity (allows bursts)
                redis-rate-limiter.requestedTokens: 1   # Each request consumes 1 token
                key-resolver: "#{@ipKeyResolver}"       # Rate limit by IP
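To make the three parameters concrete, here is a minimal in-memory sketch of a token bucket using the same values (100 tokens per second, capacity 200, 1 token per request). It is only an illustration of the principle, not Gateway's implementation: the real limiter keeps its state in Redis, and the class name `TokenBucketSketch` and the explicit `nowSecond` argument are assumptions made for this example.

```java
// Minimal in-memory token bucket, for illustration only.
// Gateway's real limiter stores this state in Redis; this is not its implementation.
final class TokenBucketSketch {
    private final long replenishRate;  // tokens added per second
    private final long burstCapacity;  // maximum tokens the bucket can hold
    private long tokens;               // tokens currently available
    private long lastRefillSecond;     // time of the last refill, in seconds

    TokenBucketSketch(long replenishRate, long burstCapacity, long nowSecond) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;   // start with a full bucket
        this.lastRefillSecond = nowSecond;
    }

    // Returns true if a request needing `requestedTokens` is allowed at `nowSecond`.
    boolean tryAcquire(long requestedTokens, long nowSecond) {
        long elapsed = Math.max(0, nowSecond - lastRefillSecond);
        // Refill at a fixed rate, but never above the bucket capacity.
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSecond = Math.max(lastRefillSecond, nowSecond);
        if (tokens < requestedTokens) {
            return false;              // the gateway would answer 429 here
        }
        tokens -= requestedTokens;
        return true;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(100, 200, 0);
        int allowed = 0;
        for (int i = 0; i < 250; i++) {
            if (bucket.tryAcquire(1, 0)) {
                allowed++;
            }
        }
        System.out.println("allowed in the initial burst: " + allowed);       // 200
        System.out.println("allowed one second later: " + bucket.tryAcquire(1, 1)); // true
    }
}
```

With these values a client can burst up to 200 requests at once, but its sustained rate settles at 100 requests per second.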
Key Resolver: By IP or By User
The configuration above limits by client IP, but real business endpoints usually need to be limited by user ID:
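import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import reactor.core.publisher.Mono;

import java.net.InetSocketAddress;

@Configuration
public class RateLimiterConfig {

    // IP-based rate limiting (suitable for endpoints that do not require login)
    @Bean
    public KeyResolver ipKeyResolver() {
        return exchange -> {
            // getRemoteAddress() can return null, so guard against an NPE
            InetSocketAddress remote = exchange.getRequest().getRemoteAddress();
            return Mono.just(remote != null && remote.getAddress() != null
                    ? remote.getAddress().getHostAddress()
                    : "unknown");
        };
    }

    // User ID-based rate limiting (suitable for logged-in endpoints; read from a header or JWT)
    @Bean
    @Primary
    public KeyResolver userKeyResolver() {
        return exchange -> {
            String userId = exchange.getRequest().getHeaders().getFirst("X-User-Id");
            // Requests without the header all share the single "anonymous" bucket
            return Mono.just(userId != null ? userId : "anonymous");
        };
    }
}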
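One consequence of the `userKeyResolver` above is that every request without an `X-User-Id` header lands in the same `"anonymous"` bucket. A possible alternative, sketched below as plain Java so the decision logic is easy to test, is to fall back to the client IP before giving up. The helper name `resolveKey` and the `user:` / `ip:` prefixes are assumptions for this example, not part of Gateway's API.

```java
// Illustration only: choosing a rate-limit key with a fallback chain.
final class RateLimitKeySketch {

    // Hypothetical helper: prefer the user ID, then the client IP, then a shared bucket.
    static String resolveKey(String userIdHeader, String remoteIp) {
        if (userIdHeader != null && !userIdHeader.isBlank()) {
            return "user:" + userIdHeader.trim();
        }
        if (remoteIp != null && !remoteIp.isBlank()) {
            return "ip:" + remoteIp.trim();
        }
        return "anonymous"; // last resort: all such requests share one bucket
    }

    public static void main(String[] args) {
        System.out.println(resolveKey("42", "10.0.0.1"));  // user:42
        System.out.println(resolveKey(null, "10.0.0.1"));  // ip:10.0.0.1
        System.out.println(resolveKey(" ", null));         // anonymous
    }
}
```

The prefixes keep a user ID that happens to look like an IP address from colliding with a real IP key. Inside a `KeyResolver`, the same logic would be wrapped in `Mono.just(...)`.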
Response When Rate Limit Triggered
By default the client receives an empty 429 response, which is a poor experience. Customize it:
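import org.springframework.boot.web.reactive.error.ErrorWebExceptionHandler;
import org.springframework.core.annotation.Order;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ResponseStatusException;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

import java.nio.charset.StandardCharsets;

@Component
@Order(-2) // Run before Spring Boot's default error handler, which is registered at -1
public class CustomRateLimitErrorHandler implements ErrorWebExceptionHandler {

    // Note: this only handles 429s that reach the handler as a ResponseStatusException
    @Override
    public Mono<Void> handle(ServerWebExchange exchange, Throwable ex) {
        // Compare numeric values: getStatusCode() returns HttpStatusCode in Spring 6
        if (ex instanceof ResponseStatusException rse
                && rse.getStatusCode().value() == HttpStatus.TOO_MANY_REQUESTS.value()) {
            exchange.getResponse().setStatusCode(HttpStatus.TOO_MANY_REQUESTS);
            exchange.getResponse().getHeaders()
                    .add("Content-Type", "application/json;charset=UTF-8");
            String body = """
                    {"code": 429, "message": "Too many requests, please try again later"}
                    """;
            DataBuffer buffer = exchange.getResponse().bufferFactory()
                    .wrap(body.getBytes(StandardCharsets.UTF_8));
            return exchange.getResponse().writeWith(Mono.just(buffer));
        }
        // Anything else is passed on to the next error handler
        return Mono.error(ex);
    }
}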
Circuit Breaker: Resilience4j
Rate limiting keeps excess traffic from overwhelming downstream services. A circuit breaker fails fast when a downstream service is already in trouble, so that requests do not pile up and drag down the entire call chain.
Dependencies
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
</dependency>
Configure Circuit Breaker
spring:
  cloud:
    gateway:
      routes:
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/api/order/**
          filters:
            - name: CircuitBreaker
              args:
                name: orderServiceCB
                fallbackUri: forward:/fallback/order
resilience4j:
  circuitbreaker:
    instances:
      orderServiceCB:
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 20                       # Evaluate the most recent 20 calls
        failureRateThreshold: 50                    # Open the circuit when the failure rate reaches 50%
        waitDurationInOpenState: 30s                # Stay open for 30s before moving to half-open
        permittedNumberOfCallsInHalfOpenState: 5    # Allow 5 probe calls in half-open state
        minimumNumberOfCalls: 10                    # Need at least 10 calls before the failure rate is calculated
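How `slidingWindowSize`, `minimumNumberOfCalls` and `failureRateThreshold` interact is easier to see in a small simulation. The sketch below is a simplified count-based breaker written for this article, not Resilience4j's implementation: it leaves out the half-open state and `waitDurationInOpenState`, and it assumes the breaker opens once the failure rate is greater than or equal to the threshold. The class name `CountBasedBreakerSketch` is made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified count-based circuit breaker, for illustration only.
// Half-open state and the open-state wait duration are deliberately omitted.
final class CountBasedBreakerSketch {
    enum State { CLOSED, OPEN }

    private final int slidingWindowSize;
    private final int minimumNumberOfCalls;
    private final double failureRateThreshold;                // in percent
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private int failures = 0;
    private State state = State.CLOSED;

    CountBasedBreakerSketch(int slidingWindowSize, int minimumNumberOfCalls,
                            double failureRateThreshold) {
        this.slidingWindowSize = slidingWindowSize;
        this.minimumNumberOfCalls = minimumNumberOfCalls;
        this.failureRateThreshold = failureRateThreshold;
    }

    // Records the outcome of one call and returns the resulting state.
    State record(boolean failed) {
        if (state == State.OPEN) {
            return state;             // an open breaker rejects calls without recording them
        }
        window.addLast(failed);
        if (failed) {
            failures++;
        }
        // Keep only the most recent slidingWindowSize outcomes.
        if (window.size() > slidingWindowSize && window.removeFirst()) {
            failures--;
        }
        // The failure rate is only evaluated once enough calls have been seen.
        if (window.size() >= minimumNumberOfCalls) {
            double failureRate = 100.0 * failures / window.size();
            if (failureRate >= failureRateThreshold) {
                state = State.OPEN;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        CountBasedBreakerSketch cb = new CountBasedBreakerSketch(20, 10, 50);
        for (int i = 0; i < 4; i++) cb.record(true);   // 4 failures
        for (int i = 0; i < 5; i++) cb.record(false);  // 5 successes, 9 calls in total
        System.out.println("after 9 calls: " + cb.record(false).name().equals("CLOSED"));
    }
}
```

In this model, nothing happens during the first nine calls no matter how many fail, because the minimum of 10 has not been reached. On the tenth call the rate is computed for the first time, and 5 failures out of 10 are enough to open the circuit.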
Fallback Endpoint
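import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

import java.util.Map;

@RestController
@RequestMapping("/fallback")
public class FallbackController {

    private static final Logger log = LoggerFactory.getLogger(FallbackController.class);

    @RequestMapping("/order")
    public Mono<Map<String, Object>> orderFallback(ServerWebExchange exchange) {
        // Log the circuit breaker event for debugging
        log.warn("Circuit breaker triggered for order-service, path: {}",
                exchange.getRequest().getPath());
        return Mono.just(Map.of(
                "code", 503,
                "message", "Service temporarily unavailable, please try again later",
                "fallback", true
        ));
    }
}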
Pitfall Records
Pitfall 1: Rate-Limit Keys Collide in Redis, All Endpoints Share One Bucket
Cause: the bean name in key-resolver was misspelled, so the intended resolver was not injected. Gateway used the default resolver instead, and all routes ended up sharing the same key.
How to debug: look in Redis for keys matching request_rate_limiter.*. Under normal conditions there should be separate keys per IP or per user ID.
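Pitfall 2: Fallback Returns 200 After the Circuit Opens, So Monitoring Sees No Errors
The fallback endpoint above puts 503 in the JSON body, but the HTTP status is still 200. From a business standpoint the fallback should return 503, not 200. Change the fallback's response code: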
@RequestMapping("/order")
public Mono<ResponseEntity<Map<String, Object>>> orderFallback() {
    return Mono.just(ResponseEntity
            .status(HttpStatus.SERVICE_UNAVAILABLE)
            .body(Map.of("code", 503, "message", "Service temporarily unavailable")));
}
Pitfall 3: minimumNumberOfCalls Set Too Low, Newly Deployed Services Trip the Breaker by Mistake
During low-traffic periods, a handful of occasional timeouts is enough to trip the circuit breaker. Increase minimumNumberOfCalls (20-50 is recommended for production), and raise slidingWindowSize along with it so that the window is at least as large as the minimum.
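The arithmetic behind this pitfall can be sketched in a few lines. The helper below is hypothetical (the name `failuresToOpen` is made up for this example), and it assumes that the breaker opens when the failure rate is greater than or equal to the threshold and that exactly `calls` outcomes have been recorded when the rate is first evaluated.

```java
// Illustration only: how many failures are enough to open the circuit
// at the moment the failure rate is first evaluated.
final class TripThresholdSketch {

    // Smallest number of failures whose rate is >= the threshold, given `calls` recorded calls.
    static int failuresToOpen(int calls, double failureRateThresholdPercent) {
        return (int) Math.ceil(calls * failureRateThresholdPercent / 100.0);
    }

    public static void main(String[] args) {
        for (int minCalls : new int[] {10, 20, 50}) {
            System.out.println("minimumNumberOfCalls=" + minCalls
                    + " -> failures needed at 50%: " + failuresToOpen(minCalls, 50));
        }
    }
}
```

With a 50% threshold, a minimum of 10 calls means 5 timeouts can open the circuit, while a minimum of 50 raises that to 25. This is why a larger value is less sensitive to a few stray timeouts on a quiet service.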
Monitoring: View Circuit Breaker Status
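Resilience4j exposes an Actuator endpoint that shows the current state of each circuit breaker (Spring Boot Actuator must be on the classpath and the endpoint exposed over HTTP):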
# View all circuit breaker status
curl http://gateway:8080/actuator/circuitbreakers
# Output example (state: CLOSED=normal, OPEN=circuit open, HALF_OPEN=probing)
{
  "circuitBreakers": {
    "orderServiceCB": {
      "state": "CLOSED",
      "failureRate": "5.0%",
      "slowCallRate": "0.0%"
    }
  }
}
Once this data is integrated with Grafana, you can build a real-time dashboard and alert whenever a circuit breaker event fires.
Summary
| Feature | Component | Core Configuration |
|---|---|---|
| Rate Limiting | Redis RequestRateLimiter | replenishRate (rate) + burstCapacity (burst) |
| Per-User Rate Limiting | Custom KeyResolver | Get user ID from Header/JWT |
| Circuit Breaker | Resilience4j CircuitBreaker | failureRateThreshold + slidingWindowSize |
| Degraded Response | Fallback Controller | Return 503, log events |
Used together, these two features cover most of the stability protection needed at the gateway level. The actual parameters have to be tuned to your business traffic patterns, and there is no universal "best value". I recommend running everything through a stress test environment before going to production.