Circuit Breakers

Circuit breakers provide stability and help prevent cascading failures in distributed systems. Use them together with judicious timeouts at the interfaces between remote systems, so that the failure of a single component does not bring down all components.

As an example, consider a web application that interacts with a remote third-party web service. Suppose the third party has oversold its capacity, and its database melts down under load. Assume the database fails in such a way that it takes a very long time to hand an error back to the third-party web service, so calls to that service fail only after a long delay. Back in our web application, users notice that their form submissions take much longer and seem to hang. They do what they know how to do and hit the refresh button, adding more requests on top of the ones already running. Eventually the web application fails due to resource exhaustion. This affects all users, even those who are not using functionality that depends on the third-party web service.

Introducing circuit breakers on the web service call would cause requests to begin to fail fast, letting users know that something is wrong and that they need not refresh their request. This also confines the failure to only those users whose functionality depends on the third party; other users are no longer affected because there is no resource exhaustion. Circuit breakers also allow savvy developers to mark the portions of the site that use the functionality as unavailable, or to show cached content as appropriate, while the breaker is open.
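As a rough illustration of showing cached content while the breaker is open, the following Java sketch wraps a guarded call and falls back to a cached value when the call is rejected. The names CachedFallback, BreakerOpenException, and withFallback are hypothetical and are not part of Akka; the exception simply stands in for whatever a real breaker throws when it fails fast.

```java
import java.util.function.Supplier;

/** Hypothetical helper: serve a fallback value when a breaker rejects a call. */
class CachedFallback {

    /** Assumed stand-in for the exception a real breaker throws while open. */
    static class BreakerOpenException extends RuntimeException {
        BreakerOpenException() {
            super("circuit breaker is open");
        }
    }

    /**
     * Runs the guarded call. If it was rejected by an open breaker, returns
     * the fallback (for example, cached content) instead. Any other failure
     * still propagates to the caller.
     */
    static <T> T withFallback(Supplier<T> guardedCall, Supplier<T> fallback) {
        try {
            return guardedCall.get();
        } catch (BreakerOpenException rejected) {
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // Simulate a call that is rejected because the breaker is open.
        String page = withFallback(
            () -> { throw new BreakerOpenException(); },
            () -> "cached content");
        System.out.println(page);
    }
}
```

Only the fail-fast rejection is converted into a fallback here; whether genuine remote failures should also be served from cache is a separate design decision.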

The Akka library provides an implementation of a circuit breaker called akka.pattern.CircuitBreaker which has the behavior described below.

What do they do?

  • During normal operation, a circuit breaker is in the Closed state:

    • Exceptions or calls exceeding the configured call timeout increment a failure counter

    • Successes reset the failure count to zero

    • When the failure counter reaches a given max failures count, the breaker is tripped into Open state

  • While in Open state:

    • All calls fail-fast with a CircuitBreakerOpenException

    • After the configured reset timeout, the circuit breaker enters a Half-Open state

  • In Half-Open state:

    • The first call attempted is allowed through without failing fast

    • All other calls fail-fast with an exception just as in Open state

    • If the first call succeeds, the breaker is reset back to Closed state and the reset timeout is reset

    • If the first call fails, the breaker is tripped again into the Open state
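The transitions listed above can be sketched as a small state machine in plain Java. This is a minimal illustration, not Akka's implementation: the class SimpleCircuitBreaker, its nested BreakerOpenException, and the injectable clock are all assumptions made for this example. The sketch counts only exceptions as failures and leaves out the call timeout.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch of the Closed / Open / Half-Open state machine. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    /** Assumed stand-in for Akka's CircuitBreakerOpenException. */
    static class BreakerOpenException extends RuntimeException {
        BreakerOpenException() {
            super("circuit breaker is open");
        }
    }

    private final int maxFailures;
    private final long resetTimeoutMillis;
    private final LongSupplier clock; // injectable so the reset timeout is testable

    private State state = State.CLOSED;
    private int failureCount = 0;
    private long openedAtMillis = 0L;
    private boolean trialInFlight = false;

    SimpleCircuitBreaker(int maxFailures, long resetTimeoutMillis, LongSupplier clock) {
        this.maxFailures = maxFailures;
        this.resetTimeoutMillis = resetTimeoutMillis;
        this.clock = clock;
    }

    /** Current state; moves Open to Half-Open once the reset timeout has elapsed. */
    synchronized State state() {
        if (state == State.OPEN
                && clock.getAsLong() - openedAtMillis >= resetTimeoutMillis) {
            state = State.HALF_OPEN;
            trialInFlight = false;
        }
        return state;
    }

    /** Runs the body through the breaker, failing fast when it is not allowed. */
    <T> T call(Supplier<T> body) {
        synchronized (this) {
            State current = state();
            if (current == State.OPEN) {
                throw new BreakerOpenException();
            }
            if (current == State.HALF_OPEN) {
                // Only the first call is let through; all others fail fast.
                if (trialInFlight) {
                    throw new BreakerOpenException();
                }
                trialInFlight = true;
            }
        }
        try {
            T result = body.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
    }

    private synchronized void onSuccess() {
        if (state == State.OPEN) {
            return; // late result from a call started before the breaker tripped
        }
        failureCount = 0; // a success resets the failure count
        state = State.CLOSED;
        trialInFlight = false;
    }

    private synchronized void onFailure() {
        if (state == State.OPEN) {
            return; // already tripped; ignore late failures
        }
        if (state == State.HALF_OPEN) {
            trip(); // the trial call failed, so open again
            return;
        }
        failureCount++;
        if (failureCount >= maxFailures) {
            trip();
        }
    }

    // Called only from synchronized methods.
    private void trip() {
        state = State.OPEN;
        openedAtMillis = clock.getAsLong(); // restart the reset timeout
        trialInFlight = false;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
            new SimpleCircuitBreaker(3, 5_000L, System::currentTimeMillis);
        System.out.println(breaker.call(() -> "response from remote service"));
        System.out.println(breaker.state());
    }
}
```

Passing the clock in as a parameter is a choice made for this sketch so that the reset timeout can be exercised without sleeping; a production breaker such as Akka's uses a scheduler and also enforces the call timeout.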

The following diagram illustrates the Finite State Machine described above:

States of a Circuit Breaker

Learn more

For more information, including code examples in Akka, see the Akka CircuitBreaker reference documentation.