The gateway is the traffic entry point of a microservices architecture, and rate limiting and circuit breaking are its two lines of defense for protecting downstream services. This article records my actual process of configuring both features in production. It is not a translation of the documentation; it covers only the parts I actually used and the pitfalls I hit.

Environment

  • Spring Cloud Gateway 4.x
  • Spring Boot 3.x
  • Resilience4j (Circuit Breaker)
  • Redis (Rate Limiting Storage)

Rate Limiting: Redis Token Bucket

Gateway ships with a Redis-backed token-bucket rate limiter. The principle: each rate-limit key maintains a token bucket, each request consumes tokens, and tokens are replenished at a fixed rate.
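The principle can be sketched in a few lines of plain Java. This is a simplified illustration only, not Gateway's actual implementation (the real limiter runs this logic atomically inside Redis via a Lua script):

```java
// Simplified token bucket: `replenishRate` tokens per second, up to `burstCapacity`.
// Gateway's RedisRateLimiter does the equivalent atomically in a Redis Lua script.
class TokenBucket {
    private final double replenishRate;  // tokens added per second
    private final double burstCapacity;  // maximum bucket size
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double replenishRate, double burstCapacity, long nowNanos) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;     // start full, so an initial burst passes
        this.lastRefillNanos = nowNanos;
    }

    synchronized boolean tryAcquire(int requestedTokens, long nowNanos) {
        // Refill based on time elapsed since the last call, capped at burstCapacity
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
        lastRefillNanos = nowNanos;
        if (tokens >= requestedTokens) {
            tokens -= requestedTokens;
            return true;                 // request allowed
        }
        return false;                    // out of tokens: caller responds 429
    }
}
```

With replenishRate=100 and burstCapacity=200, a cold bucket absorbs a burst of 200 requests, then sustains 100 requests per second.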

Dependencies

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-gateway</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>

Configuration

spring:
  cloud:
    gateway:
      routes:
        - id: user-service
          uri: lb://user-service
          predicates:
            - Path=/api/user/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100    # Replenish 100 tokens per second
                redis-rate-limiter.burstCapacity: 200    # Bucket capacity (allows bursts)
                redis-rate-limiter.requestedTokens: 1    # Each request consumes 1 token
                key-resolver: "#{@ipKeyResolver}"        # Rate limit by IP

Key Resolver: By IP or By User

IP-based limiting is the simplest starting point, but real business usually needs to limit by user ID:

@Configuration
public class RateLimiterConfig {

    // IP-based rate limiting (suitable for non-login endpoints).
    // Note: behind a reverse proxy or LB, getRemoteAddress() is the proxy's
    // address; resolve the real client IP from X-Forwarded-For instead.
    @Bean
    public KeyResolver ipKeyResolver() {
        return exchange -> Mono.justOrEmpty(exchange.getRequest().getRemoteAddress())
                .map(addr -> addr.getAddress().getHostAddress());
    }

    // User ID-based rate limiting (suitable for logged-in endpoints, get from Header or JWT)
    @Bean
    @Primary
    public KeyResolver userKeyResolver() {
        return exchange -> {
            String userId = exchange.getRequest().getHeaders().getFirst("X-User-Id");
            return Mono.just(userId != null ? userId : "anonymous");
        };
    }
}
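With both beans registered, each route selects its resolver by bean name through the SpEL expression; @Primary only decides which bean wins when a route omits key-resolver entirely. A sketch of binding the user resolver to a logged-in route (values are illustrative):

```yaml
filters:
  - name: RequestRateLimiter
    args:
      redis-rate-limiter.replenishRate: 50
      redis-rate-limiter.burstCapacity: 100
      key-resolver: "#{@userKeyResolver}"   # per-user bucket for this route
```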

Response When Rate Limit Triggered

By default a rejected request gets a bare 429 with an empty body, which is a poor client experience. Customize it:

@Component
@Order(-2) // must run before Spring's default error handler, which is @Order(-1)
public class CustomRateLimitErrorHandler implements ErrorWebExceptionHandler {

    @Override
    public Mono<Void> handle(ServerWebExchange exchange, Throwable ex) {
        if (ex instanceof ResponseStatusException rse
                && rse.getStatusCode() == HttpStatus.TOO_MANY_REQUESTS) {

            exchange.getResponse().setStatusCode(HttpStatus.TOO_MANY_REQUESTS);
            exchange.getResponse().getHeaders()
                    .add("Content-Type", "application/json;charset=UTF-8");

            String body = """
                {"code": 429, "message": "Too many requests, please try again later"}
                """;
            DataBuffer buffer = exchange.getResponse().bufferFactory()
                    .wrap(body.getBytes(StandardCharsets.UTF_8));
            return exchange.getResponse().writeWith(Mono.just(buffer));
        }
        return Mono.error(ex);
    }
}

Circuit Breaker: Resilience4j

Rate limiting stops excessive traffic from overwhelming downstream services; a circuit breaker fails fast once a downstream service is already in trouble, preventing queued-up requests from dragging down the entire call chain.
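The breaker itself is a small state machine (CLOSED → OPEN → HALF_OPEN). A stripped-down sketch of the idea in plain Java, not Resilience4j's actual code (the half-open timer is omitted for brevity):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal count-based circuit breaker: opens when the failure rate over the
// last `windowSize` calls exceeds `failureRateThreshold` percent, provided at
// least `minimumCalls` calls have been recorded.
class MiniCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;
    private final int minimumCalls;
    private final double failureRateThreshold; // percent
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failure
    private State state = State.CLOSED;

    MiniCircuitBreaker(int windowSize, int minimumCalls, double failureRateThreshold) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.failureRateThreshold = failureRateThreshold;
    }

    boolean allowRequest() {
        // OPEN = fail fast and route to the fallback; the wait-duration timer
        // that moves OPEN -> HALF_OPEN is omitted here
        return state != State.OPEN;
    }

    void record(boolean failure) {
        window.addLast(failure);
        if (window.size() > windowSize) window.removeFirst();
        if (window.size() >= minimumCalls) {
            long failures = window.stream().filter(f -> f).count();
            double rate = 100.0 * failures / window.size();
            if (rate > failureRateThreshold) state = State.OPEN;
        }
    }

    State state() { return state; }
}
```

This also shows why minimumNumberOfCalls matters: below that count the failure rate is simply not evaluated, no matter how many calls fail.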

Dependencies

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <!-- Gateway is reactive, so use the reactor variant of the starter -->
    <artifactId>spring-cloud-starter-circuitbreaker-reactor-resilience4j</artifactId>
</dependency>

Configure Circuit Breaker

spring:
  cloud:
    gateway:
      routes:
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/api/order/**
          filters:
            - name: CircuitBreaker
              args:
                name: orderServiceCB
                fallbackUri: forward:/fallback/order

resilience4j:
  circuitbreaker:
    instances:
      orderServiceCB:
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 20          # Count recent 20 requests
        failureRateThreshold: 50       # Open circuit when failure rate exceeds 50%
        waitDurationInOpenState: 30s   # Wait 30s after circuit opens before half-open
        permittedNumberOfCallsInHalfOpenState: 5  # Allow 5 probe calls in half-open state
        minimumNumberOfCalls: 10       # Evaluate failure rate only after at least 10 calls
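One related setting worth knowing: the Spring Cloud CircuitBreaker integration wraps each call in a Resilience4j TimeLimiter whose default timeout is 1s, so a slow-but-healthy downstream can trip the breaker through timeouts alone. A sketch, assuming the starter's property-based configuration picks up the same instance name:

```yaml
resilience4j:
  timelimiter:
    instances:
      orderServiceCB:
        timeoutDuration: 3s   # default is 1s, often too tight for slow downstreams
```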

Fallback Endpoint

@RestController
@RequestMapping("/fallback")
public class FallbackController {

    private static final Logger log = LoggerFactory.getLogger(FallbackController.class);

    @RequestMapping("/order")
    public Mono<Map<String, Object>> orderFallback(ServerWebExchange exchange) {
        // Log the circuit-breaker event for debugging
        log.warn("Circuit breaker triggered for order-service, path: {}",
            exchange.getRequest().getPath());

        return Mono.just(Map.of(
            "code", 503,
            "message", "Service temporarily unavailable, please try again later",
            "fallback", true
        ));
    }
}

Pitfall Records

Pitfall 1: Rate-Limiting Keys Collided in Redis, All Endpoints Shared One Bucket

Cause: the key-resolver bean name was misspelled, so the reference was not resolved and Gateway fell back to the default resolver; every route ended up computing the same key.

Debug method: inspect Redis with redis-cli --scan --pattern 'request_rate_limiter*'; the keys should normally be separated by IP or user ID.

Pitfall 2: Fallback Returned 200 After the Circuit Opened, So Monitoring Saw No Errors

From a business standpoint a fallback should return 503, not 200; otherwise error-rate dashboards stay green while the breaker is open. Change the fallback response code:

@RequestMapping("/order")
public Mono<ResponseEntity<Map<String, Object>>> orderFallback() {
    return Mono.just(ResponseEntity
        .status(HttpStatus.SERVICE_UNAVAILABLE)
        .body(Map.of("code", 503, "message", "Service temporarily unavailable")));
}

Pitfall 3: minimumNumberOfCalls Set Too Small, Newly Deployed Services Tripped the Breaker Spuriously

During low-traffic periods, a handful of occasional timeouts was enough to open the circuit. Increase minimumNumberOfCalls (20-50 is a reasonable production range) and adjust slidingWindowSize to match.
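A sketch of more conservative production values (the numbers are illustrative; tune them to your own traffic):

```yaml
resilience4j:
  circuitbreaker:
    instances:
      orderServiceCB:
        slidingWindowSize: 50
        minimumNumberOfCalls: 30
        failureRateThreshold: 50
```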


Monitoring: View Circuit Breaker Status

Resilience4j exposes an Actuator endpoint that shows the current state of each circuit breaker:

# View all circuit breaker status
curl http://gateway:8080/actuator/circuitbreakers

# Example output; state is CLOSED (normal), OPEN (circuit open), or HALF_OPEN (probing)
{
  "circuitBreakers": {
    "orderServiceCB": {
      "state": "CLOSED",
      "failureRate": "5.0%",
      "slowCallRate": "0.0%"
    }
  }
}
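If the endpoint returns 404, it is probably not exposed. A minimal sketch of the Actuator settings (assuming the resilience4j-spring-boot module that the starter pulls in):

```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,circuitbreakers
  health:
    circuitbreakers:
      enabled: true   # surface breaker state in /actuator/health as well
```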

Integrated with Grafana, this makes a real-time dashboard; alert whenever a breaker transitions to OPEN.


Summary

Feature                | Component                    | Core Configuration
Rate Limiting          | Redis RequestRateLimiter     | replenishRate (rate) + burstCapacity (burst)
Per-User Rate Limiting | Custom KeyResolver           | Get user ID from Header/JWT
Circuit Breaker        | Resilience4j CircuitBreaker  | failureRateThreshold + slidingWindowSize
Degraded Response      | Fallback Controller          | Return 503 and log events

Used together, these two features cover most stability protection at the gateway level. Parameter tuning has to follow your actual traffic patterns; there is no universal "best value". Validate the settings in a stress-test environment before rolling them out to production.