The SAP Cloud SDK for Java provides abstractions for frequently used resilience patterns such as timeout, retry, rate limiter, and circuit breaker. Applying these patterns makes an application more resilient against the failures it might encounter. The SAP Cloud SDK builds upon the Resilience4j library to provide resilience for your cloud applications. Resilience4j comes with many modules to protect your application from failures. The most important ones are circuit breakers, bulkheads, and timeouts.
Circuit Breakers
The circuit breaker is a design pattern in which the application automatically stops making remote service calls once those calls have failed too many times. The CircuitBreaker is implemented as a finite state machine with three normal states:
- CLOSED
- OPEN
- HALF_OPEN
When the failure rate of recent remote service calls exceeds a configured threshold, the circuit breaker switches to the OPEN state. While it is OPEN, the application makes no remote service calls for the duration of a configured wait period. After the wait period expires, the circuit breaker switches to the HALF_OPEN state, in which the application makes a limited number of trial calls. If these calls succeed, the circuit breaker switches back to the CLOSED state and normal operation resumes. If they fail, it returns to the OPEN state.
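The state transitions described above can be sketched in plain Java. This is a minimal illustration only, not the Resilience4j or SAP Cloud SDK implementation: the class `SimpleCircuitBreaker` and its parameters `failureThreshold` and `openMillis` are hypothetical names. For simplicity it counts consecutive failures, whereas Resilience4j evaluates a failure rate over a sliding window, and it allows a single trial call in HALF_OPEN.

```java
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker; not the Resilience4j implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures that trip the breaker
    private final long openMillis;      // how long to stay OPEN before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // The current time is passed in so the transitions are easy to follow and to test.
    synchronized State state(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN; // wait period over: allow a trial call
        }
        return state;
    }

    synchronized <T> T call(Supplier<T> remoteCall, long nowMillis) {
        if (state(nowMillis) == State.OPEN) {
            throw new IllegalStateException("circuit breaker is OPEN, call not attempted");
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            state = State.CLOSED; // a successful call closes the breaker
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN; // trip the breaker, or re-trip it after a failed trial call
                openedAt = nowMillis;
            }
            throw e;
        }
    }
}
```

With `new SimpleCircuitBreaker(2, 1000)`, two failed calls move the breaker to OPEN, calls during the next 1000 ms are rejected without reaching the remote service, and the first successful call after that returns it to CLOSED.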
Bulkheads
The bulkhead pattern limits the number of concurrent requests to a remote service. If the number of concurrent incoming requests exceeds the configured threshold, the bulkhead is said to be saturated. In this case, further requests are rejected until in-flight requests have completed.
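As a rough illustration of the idea, a bulkhead can be approximated with a counting semaphore from the Java standard library. The class `SimpleBulkhead` and the parameter `maxConcurrentCalls` are hypothetical names chosen for this sketch. It rejects excess calls immediately, which is only one possible policy and not necessarily how Resilience4j is configured in a given application.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical, simplified bulkhead based on a counting semaphore.
class SimpleBulkhead {
    private final Semaphore permits;

    SimpleBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> remoteCall) {
        // tryAcquire() does not block: if no permit is free, the bulkhead is saturated.
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead is saturated");
        }
        try {
            return remoteCall.get();
        } finally {
            permits.release(); // free the slot once the call has completed
        }
    }
}
```

A caller that receives the `IllegalStateException` knows the request never reached the remote service and can retry later or fall back to another result.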
Timeouts
If the response time of a remote service call exceeds the configured timeout duration, the remote service call is considered to have failed. Resilience4j allows setting custom timeout durations for every remote service call.
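The timeout behaviour can be sketched with `CompletableFuture` from the Java standard library. This is an illustration under assumptions, not the Resilience4j time limiter: `SimpleTimeLimiter` and `callWithTimeout` are hypothetical names, and the remote call runs on the common fork-join pool.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper that treats a slow call as a failed call.
class SimpleTimeLimiter {
    static <T> T callWithTimeout(Supplier<T> remoteCall, Duration timeout) throws TimeoutException {
        // Run the call asynchronously so the caller can stop waiting for it.
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on the result. Note that this does not interrupt the running task.
            future.cancel(true);
            throw e;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the call", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("remote call failed", e.getCause());
        }
    }
}
```

Because the duration is an argument, each remote service call can be given its own limit, for example `callWithTimeout(() -> fetchOrders(), Duration.ofSeconds(5))`, where `fetchOrders` stands in for any remote call.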
Additionally, the SAP Cloud SDK lets you provide fallback functions. For example, if the bulkhead is saturated or the circuit breaker is in the OPEN state, the SAP Cloud SDK checks whether a fallback function has been provided and executes it automatically. So even if a remote service is unavailable, you can still return a meaningful result, such as cached data.
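The fallback idea itself is independent of any library and can be sketched in a few lines of plain Java. The names `Fallbacks` and `callWithFallback` are hypothetical and do not represent the SAP Cloud SDK API. The sketch only shows the control flow: try the remote call, and if it fails, derive a substitute result from the failure.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper illustrating the fallback control flow.
class Fallbacks {
    static <T> T callWithFallback(Supplier<T> remoteCall, Function<Throwable, T> fallback) {
        try {
            return remoteCall.get();
        } catch (RuntimeException e) {
            // The call failed or was rejected: return a substitute result instead of propagating.
            return fallback.apply(e);
        }
    }
}
```

A caller might pass `e -> cachedValue` as the fallback, where `cachedValue` is whatever was stored from the last successful response.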