Dynamic Degradation
Basic Introduction
Service degradation means that when a server faces a sudden traffic surge or runs short of system resources, it deliberately lowers the service level of some non-core or secondary features, based on current business conditions and traffic characteristics, to free server resources and keep core business functions running normally. It is a proactive system protection mechanism.
Detailed Explanation
1. Trigger Conditions:
- System resources reach preset thresholds (for example, CPU usage exceeds 80%)
- Request response time exceeds the warning value
- The system error rate rises suddenly
- Specific business metrics fluctuate abnormally
2. Degradation Strategies:
- Function masking: Temporarily close non-core functions
- Service simplification: Return simplified data
- Request rejection: Reject low-priority requests and return a degradation notice
- Delayed processing: Put non-urgent requests in queue for later processing
3. Implementation Methods:
- Manual degradation: Operations staff trigger degradation proactively based on monitoring data
- Automatic degradation: The system applies degradation automatically based on preset rules
- Tiered degradation: Different levels of degradation are applied depending on how severe the pressure is
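The trigger conditions and the tiered approach above can be sketched as a small policy class. This is a minimal illustration, not a production rule engine: the class name, the feature names, and every threshold other than the 80% CPU example are assumptions made up for this sketch.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical tiered-degradation policy; thresholds and feature names are illustrative. */
class DegradationPolicy {

    /** Degradation levels, from normal operation to the most aggressive shedding. */
    enum Level { NONE, LIGHT, HEAVY }

    /** Non-core features that may be switched off (hypothetical names). */
    enum Feature { REVIEWS, RECOMMENDATIONS, ORDER_HISTORY }

    /** Picks a level from three trigger signals: CPU usage (0..1), latency, and error rate (0..1). */
    static Level levelFor(double cpuUsage, long latencyMillis, double errorRate) {
        if (cpuUsage >= 0.90 || latencyMillis >= 2000 || errorRate >= 0.20) {
            return Level.HEAVY;
        }
        if (cpuUsage >= 0.80 || latencyMillis >= 800 || errorRate >= 0.05) {
            return Level.LIGHT;
        }
        return Level.NONE;
    }

    /** Features to mask at a given level; a higher level masks a superset of a lower one. */
    static Set<Feature> maskedFeatures(Level level) {
        switch (level) {
            case HEAVY:
                return EnumSet.of(Feature.REVIEWS, Feature.RECOMMENDATIONS, Feature.ORDER_HISTORY);
            case LIGHT:
                return EnumSet.of(Feature.RECOMMENDATIONS);
            default:
                return EnumSet.noneOf(Feature.class);
        }
    }

    public static void main(String[] args) {
        // CPU at 85% crosses the light threshold while latency and error rate are healthy.
        Level level = levelFor(0.85, 300, 0.01);
        System.out.println(level + " -> " + maskedFeatures(level));
    }
}
```

In a real system the inputs would come from a monitoring component, and a manual switch could override the computed level.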
Typical Application Scenarios
- E-commerce promotions: During Double 11 and other major promotions, product review features may be temporarily closed
- Flash sales (seckill): The product detail page can be simplified
- System failures: When a third-party dependency has problems, locally cached data can be served instead
- Sudden traffic: Computation-intensive features such as personalized recommendations can be temporarily closed
Why Service Degradation is Needed
In distributed systems, service degradation is an important fault tolerance mechanism. Its core purpose is to prevent the “avalanche effect.”
Definition and Principle of Avalanche Effect
The avalanche effect is named after avalanches in nature: at first only a small patch of snow slides from the mountaintop, but through a chain reaction it grows into a large-scale slide. In distributed systems, this phenomenon unfolds as follows:
- Initial failure: A certain service node starts responding slowly or failing due to overload
- Request accumulation: Callers continuously wait for responses, occupying a large number of thread/connection resources
- Resource exhaustion: Caller’s own resources are also exhausted
- Cascading failure: Failure scope spreads to the entire system like dominoes
How Service Degradation Works
Service degradation prevents an avalanche in the following ways:
- Fail-fast: When a service abnormality is detected, return the degradation result immediately
- Resource protection: Release the threads and connections that would otherwise stay occupied
- Fault isolation: Prevent a single service failure from spreading to the entire system
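The fail-fast idea can be sketched in plain Java, independent of any RPC framework. The class and method names below are hypothetical; the sketch bounds how long a caller waits and substitutes a fallback value when the call is slow or throws. It assumes Java 9 or later for `completeOnTimeout`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical fail-fast wrapper; not part of Dubbo. */
class FailFastCall {

    /**
     * Runs remoteCall asynchronously and returns fallback if the call throws
     * or does not finish within timeoutMillis, so the caller never waits
     * longer than the timeout.
     */
    static <T> T callOrFallback(Supplier<T> remoteCall, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS) // Java 9+
                .exceptionally(error -> fallback)
                .join();
    }

    /** Simulates an overloaded dependency that takes about one second to answer. */
    static String slowCall() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "real result";
    }

    public static void main(String[] args) {
        // The slow call exceeds the 100 ms budget, so the fallback is returned instead.
        System.out.println(callOrFallback(FailFastCall::slowCall, 100, "degraded result"));
    }
}
```

Note that the timeout only releases the waiting caller; the background task itself keeps running until it finishes. A framework-level mechanism such as Dubbo's mock avoids issuing the call at all in force mode.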
Implementation Methods
Masking and Fault Tolerance
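Dubbo provides two commonly used mock strategies for handling service exceptions:
1. Force masking mode (mock=force:return+null): the consumer does not issue the remote call at all and returns null directly. This shields the caller from a non-core service that is unavailable.
<dubbo:reference interface="com.example.UserService" mock="force:return+null" />
2. Fail tolerance mode (mock=fail:return+null): the consumer issues the remote call and returns null only if the call fails, instead of throwing an exception. This tolerates an unstable non-core service.
<dubbo:reference interface="com.example.RecommendService" mock="fail:return+null" />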
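The difference between the two modes can be modeled with a toy wrapper. This is not Dubbo's implementation: the class, enum, and counter are hypothetical, and the sketch treats any RuntimeException as a failed call, which simplifies how Dubbo decides that a call has failed.

```java
import java.util.function.Supplier;

/** Toy model of the force and fail mock modes; not Dubbo's implementation. */
class MockModes {

    enum Mode { FORCE, FAIL }

    /** Counts how many times the simulated remote call was actually attempted. */
    static int remoteCalls = 0;

    static <T> T invoke(Mode mode, Supplier<T> remoteCall, T mockValue) {
        if (mode == Mode.FORCE) {
            return mockValue;            // force: skip the remote call entirely
        }
        try {
            remoteCalls++;
            return remoteCall.get();     // fail: try the real call first
        } catch (RuntimeException e) {
            return mockValue;            // fall back only when the call fails
        }
    }

    public static void main(String[] args) {
        Supplier<String> healthy = () -> "real";
        Supplier<String> broken = () -> { throw new IllegalStateException("provider down"); };

        String forced = invoke(Mode.FORCE, healthy, null);   // remote call skipped
        String passed = invoke(Mode.FAIL, healthy, null);    // remote call succeeds
        String rescued = invoke(Mode.FAIL, broken, null);    // remote call fails, mock value used

        System.out.println(forced + ", " + passed + ", " + rescued);
        System.out.println("remote calls attempted: " + remoteCalls);
    }
}
```

Force mode saves the provider from any load, while fail mode still sends every request and only hides the errors from the caller.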
Direct Return Value
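The mock attribute can also specify the return value directly. Without a force: prefix, the value is used in fail mode, that is, only after the remote call fails:
<dubbo:reference id="xxService" timeout="3000" mock="return null" />
<dubbo:reference id="xxService2" timeout="3000" mock="return 1234" />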
Configuration Center Implementation
registry.register(URL.valueOf("override://0.0.0.0/icu.wzk.service.WzkHelloService?&mock=force:return+null"));
The configuration uses URL format:
- override:// indicates this is an override rule
- 0.0.0.0 means the rule is effective for all IPs
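To make the parts of the rule visible, the URL can be taken apart with the JDK's `java.net.URI`. This is only an illustration of the format: Dubbo parses the rule with its own URL class, and the helper class below is hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that splits an override rule into its parts using the JDK only. */
class OverrideUrlParts {

    /** Splits the query string into key/value pairs, skipping empty segments such as a leading '&'. */
    static Map<String, String> queryParams(URI uri) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = uri.getRawQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(key, value.replace('+', ' ')); // '+' stands for a space in form-style encoding
        }
        return params;
    }

    public static void main(String[] args) {
        URI uri = URI.create(
                "override://0.0.0.0/icu.wzk.service.WzkHelloService?&mock=force:return+null");
        System.out.println("protocol = " + uri.getScheme());            // rule type
        System.out.println("host     = " + uri.getHost());              // which IPs the rule applies to
        System.out.println("service  = " + uri.getPath().substring(1)); // target interface
        System.out.println("mock     = " + queryParams(uri).get("mock"));
    }
}
```

The path segment names the service interface the rule targets, and the mock parameter carries the same value that the mock attribute takes in XML.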
Complete Code
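// Imports assume Apache Dubbo 2.7+ package names (org.apache.dubbo); Dubbo 2.6 uses com.alibaba.dubbo.
// ConsumerConfiguration and ConsumerComponent are this project's consumer-side classes (same package assumed).
import org.apache.dubbo.common.URL;
import org.apache.dubbo.common.extension.ExtensionLoader;
import org.apache.dubbo.registry.Registry;
import org.apache.dubbo.registry.RegistryFactory;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;

public class DubboBreakMain {
    public static void main(String[] args) {
        // Obtain the registry through Dubbo's SPI extension mechanism
        RegistryFactory registryFactory =
                ExtensionLoader.getExtensionLoader(RegistryFactory.class).getAdaptiveExtension();
        Registry registry = registryFactory.getRegistry(URL.valueOf("zookeeper://10.10.52.38:2181"));
        // Publish an override rule: calls to WzkHelloService are masked and return null
        registry.register(URL.valueOf("override://0.0.0.0/icu.wzk.service.WzkHelloService?&mock=force:return+null"));

        // Start consuming
        AnnotationConfigApplicationContext context =
                new AnnotationConfigApplicationContext(ConsumerConfiguration.class);
        context.start();
        ConsumerComponent service = context.getBean(ConsumerComponent.class);
        // Call the service every 3 seconds and print what comes back
        while (true) {
            try {
                String hello = service.sayHello("world!");
                System.out.println("result: " + hello);
                Thread.sleep(3000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}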
Test Run
After startup, you can see that the program fails fast: each call returns null directly.
Summary
Dynamic service degradation is a key strategy for keeping core business available when the system is under heavy load or behaving abnormally. Trigger conditions cause non-core functions to be masked, data to be simplified, or default values to be returned directly, either automatically or manually, which prevents a system avalanche. Combined with rate limiting, circuit breakers, and other mechanisms, service degradation is an important means of building highly available distributed systems.