Coordinated Shutdown

Under normal conditions, when an ActorSystem is terminated or the JVM process is shut down, certain actors and services are stopped in a specific order.

This is handled by an extension named CoordinatedShutdown, which runs the registered tasks during the shutdown process. The order of the shutdown phases is defined in the configuration section akka.coordinated-shutdown.phases. The default phases are defined as:

# CoordinatedShutdown is enabled by default and will run the tasks that
# are added to these phases by individual Akka modules and user logic.
#
# The phases are ordered as a DAG by defining the dependencies between the phases
# to make sure shutdown tasks are run in the right order.
#
# In general user tasks belong in the first few phases, but there may be use
# cases where you would want to hook in new phases or register tasks later in
# the DAG.
#
# Each phase is defined as a named config section with the
# following optional properties:
# - timeout=15s: Override the default-phase-timeout for this phase.
# - recover=off: If the phase fails the shutdown is aborted
#                and depending phases will not be executed.
# - enabled=off: Skip all tasks registered in this phase. DO NOT use
#                this to disable phases unless you are absolutely sure what the
#                consequences are. Many of the built in tasks depend on other tasks
#                having been executed in earlier phases and may break if those are disabled.
# - depends-on=[]: Run the phase after the given phases
phases {

  # The first pre-defined phase that applications can add tasks to.
  # Note that more phases can be added in the application's
  # configuration by overriding this phase with an additional
  # depends-on.
  before-service-unbind {
  }

  # Stop accepting new incoming connections.
  # This is where you can register tasks that make a server stop accepting new connections. Already
  # established connections should be allowed to continue and complete if possible.
  service-unbind {
    depends-on = [before-service-unbind]
  }

  # Wait for requests that are in progress to be completed.
  # This is where you register tasks that will wait for already established connections to complete, potentially
  # also first telling them that it is time to close down.
  service-requests-done {
    depends-on = [service-unbind]
  }

  # Final shutdown of service endpoints.
  # This is where you would add tasks that forcefully kill connections that are still around.
  service-stop {
    depends-on = [service-requests-done]
  }

  # Phase for custom application tasks that are to be run
  # after service shutdown and before cluster shutdown.
  before-cluster-shutdown {
    depends-on = [service-stop]
  }

  # Graceful shutdown of the Cluster Sharding regions.
  # This phase is not meant for users to add tasks to.
  cluster-sharding-shutdown-region {
    timeout = 10 s
    depends-on = [before-cluster-shutdown]
  }

  # Emit the leave command for the node that is shutting down.
  # This phase is not meant for users to add tasks to.
  cluster-leave {
    depends-on = [cluster-sharding-shutdown-region]
  }

  # Shutdown cluster singletons
  # This is done as late as possible to allow the shard region shutdown triggered in
  # the "cluster-sharding-shutdown-region" phase to complete before the shard coordinator is shut down.
  # This phase is not meant for users to add tasks to.
  cluster-exiting {
    timeout = 10 s
    depends-on = [cluster-leave]
  }

  # Wait until exiting has been completed
  # This phase is not meant for users to add tasks to.
  cluster-exiting-done {
    depends-on = [cluster-exiting]
  }

  # Shutdown the cluster extension
  # This phase is not meant for users to add tasks to.
  cluster-shutdown {
    depends-on = [cluster-exiting-done]
  }

  # Phase for custom application tasks that are to be run
  # after cluster shutdown and before ActorSystem termination.
  before-actor-system-terminate {
    depends-on = [cluster-shutdown]
  }

  # Last phase. See terminate-actor-system and exit-jvm above.
  # Don't add phases that depend on this phase because the
  # dispatcher and scheduler of the ActorSystem have been shutdown.
  # This phase is not meant for users to add tasks to.
  actor-system-terminate {
    timeout = 10 s
    depends-on = [before-actor-system-terminate]
  }
}

More phases can be added in the application’s configuration if needed by overriding a phase with an additional depends-on. In particular, the phases before-service-unbind, before-cluster-shutdown and before-actor-system-terminate are intended for application-specific phases or tasks.

The default phases are defined in a single linear order, but phases can be arranged as a directed acyclic graph (DAG) by defining the dependencies between them. The execution order is then determined by a topological sort of the DAG.
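
To make the ordering rule concrete, the following is a minimal sketch of a topological sort over a phase-to-dependencies map. It uses only the Java standard library and is not Akka's implementation; the class name PhaseOrderSketch, the method topologicalSort, the choice of Kahn's algorithm, and the custom phase my-app-drain are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models how phases could be ordered, not Akka's internal code.
final class PhaseOrderSketch {

  // Returns the phases so that every phase appears after all phases it depends on.
  // Throws IllegalArgumentException for an undefined dependency or a cycle.
  static List<String> topologicalSort(Map<String, List<String>> dependsOn) {
    Map<String, Integer> unmet = new LinkedHashMap<>();          // open dependencies per phase
    Map<String, List<String>> dependents = new LinkedHashMap<>(); // reverse edges
    for (String phase : dependsOn.keySet()) {
      unmet.put(phase, 0);
      dependents.put(phase, new ArrayList<>());
    }
    for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
      for (String dependency : entry.getValue()) {
        if (!dependsOn.containsKey(dependency)) {
          throw new IllegalArgumentException("Unknown phase: " + dependency);
        }
        unmet.merge(entry.getKey(), 1, Integer::sum);
        dependents.get(dependency).add(entry.getKey());
      }
    }

    // Kahn's algorithm: repeatedly emit a phase whose dependencies are all emitted.
    Deque<String> ready = new ArrayDeque<>();
    for (Map.Entry<String, Integer> entry : unmet.entrySet()) {
      if (entry.getValue() == 0) ready.add(entry.getKey());
    }
    List<String> ordered = new ArrayList<>();
    while (!ready.isEmpty()) {
      String phase = ready.remove();
      ordered.add(phase);
      for (String dependent : dependents.get(phase)) {
        if (unmet.merge(dependent, -1, Integer::sum) == 0) ready.add(dependent);
      }
    }
    if (ordered.size() != dependsOn.size()) {
      throw new IllegalArgumentException("Phase dependencies contain a cycle");
    }
    return ordered;
  }

  public static void main(String[] args) {
    Map<String, List<String>> phases = new LinkedHashMap<>();
    phases.put("before-service-unbind", List.of());
    // Hypothetical application phase, hooked in by giving service-unbind
    // an additional depends-on entry.
    phases.put("my-app-drain", List.of("before-service-unbind"));
    phases.put("service-unbind", List.of("before-service-unbind", "my-app-drain"));
    phases.put("service-requests-done", List.of("service-unbind"));

    // prints [before-service-unbind, my-app-drain, service-unbind, service-requests-done]
    System.out.println(topologicalSort(phases));
  }
}
```

The sketch shows why a custom phase needs both its own depends-on and an extra depends-on entry in an existing phase: without the second edge, nothing forces the existing phase to wait for the custom one.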

Tasks can be added to a phase with:

Scala
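CoordinatedShutdown(system).addTask(CoordinatedShutdown.PhaseBeforeServiceUnbind, "someTaskName") { () =>
  import akka.pattern.ask
  import system.dispatcher
  // The ask must be answered within this timeout, otherwise the returned Future fails.
  implicit val timeout: Timeout = Timeout(5.seconds)
  // Complete the task with Done once someActor has replied.
  (someActor ? "stop").map(_ => Done)
}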
Java
CoordinatedShutdown.get(system)
    .addTask(
        CoordinatedShutdown.PhaseBeforeServiceUnbind(),
        "someTaskName",
        () -> {
          return akka.pattern.Patterns.ask(someActor, "stop", Duration.ofSeconds(5))
              .thenApply(reply -> Done.getInstance());
        });

If you need to be able to cancel a previously added task, register it with addCancellableTask:

Scala
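val c = CoordinatedShutdown(system).addCancellableTask(CoordinatedShutdown.PhaseBeforeServiceUnbind, "cleanup") {
  () =>
    // Future { ... } needs an implicit ExecutionContext; the system dispatcher is used here.
    import system.dispatcher
    Future {
      cleanup()
      Done
    }
}

// much later...
c.cancel()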
Java
Cancellable cancellable =
    CoordinatedShutdown.get(system)
        .addCancellableTask(
            CoordinatedShutdown.PhaseBeforeServiceUnbind(), "someTaskCleanup", () -> cleanup());
// much later...
cancellable.cancel();

The returned Future[Done] (Scala) or CompletionStage<Done> (Java) should be completed when the task is completed. The task name parameter is only used for debugging and logging.

Tasks added to the same phase are executed in parallel without any ordering assumptions. The next phase will not start until all tasks of the previous phase have been completed.

If tasks are not completed within a configured timeout (see reference.conf) the next phase will be started anyway. It is possible to configure recover=off for a phase to abort the rest of the shutdown process if a task fails or is not completed within the timeout.
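
The phase semantics described here (tasks within a phase run in parallel, the next phase waits for them, a timeout lets the shutdown continue unless recover=off) can be modeled with plain CompletableFuture. This is a simplified sketch using only the Java standard library, not Akka's implementation; PhaseRunSketch, Phase and runAll are hypothetical names introduced for illustration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Illustrative model only; it does not use or reproduce Akka's implementation.
final class PhaseRunSketch {

  // Hypothetical stand-in for a configured phase and its registered tasks.
  static final class Phase {
    final String name;
    final Duration timeout;
    final boolean recover;
    final List<Supplier<CompletableFuture<Void>>> tasks;

    Phase(String name, Duration timeout, boolean recover,
          List<Supplier<CompletableFuture<Void>>> tasks) {
      this.name = name;
      this.timeout = timeout;
      this.recover = recover;
      this.tasks = tasks;
    }
  }

  // Runs the phases in the given order and returns the names of the phases that were started.
  static List<String> runAll(List<Phase> orderedPhases) {
    List<String> started = new ArrayList<>();
    for (Phase phase : orderedPhases) {
      started.add(phase.name);
      // Start every task of the phase before waiting, so the tasks run concurrently.
      CompletableFuture<?>[] running =
          phase.tasks.stream().map(Supplier::get).toArray(CompletableFuture[]::new);
      try {
        // The next phase starts only after all tasks of this phase are done.
        CompletableFuture.allOf(running).get(phase.timeout.toMillis(), TimeUnit.MILLISECONDS);
      } catch (TimeoutException | ExecutionException e) {
        // recover=off: a failed or timed-out phase aborts the rest of the shutdown.
        if (!phase.recover) break;
        // recover=on: continue with the next phase anyway.
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return started;
  }

  public static void main(String[] args) {
    Supplier<CompletableFuture<Void>> quick = () -> CompletableFuture.completedFuture(null);
    Supplier<CompletableFuture<Void>> stuck = () -> new CompletableFuture<>(); // never completes
    Duration timeout = Duration.ofMillis(100);

    // recover=on: the stuck task times out, but the next phase still starts.
    // prints [service-unbind, service-requests-done]
    System.out.println(runAll(List.of(
        new Phase("service-unbind", timeout, true, List.of(quick, stuck)),
        new Phase("service-requests-done", timeout, true, List.of(quick)))));

    // recover=off: the timeout aborts the remaining phases.
    // prints [service-unbind]
    System.out.println(runAll(List.of(
        new Phase("service-unbind", timeout, false, List.of(quick, stuck)),
        new Phase("service-requests-done", timeout, true, List.of(quick)))));
  }
}
```

The model blocks the calling thread while waiting, which keeps the sketch short; it says nothing about how the real extension schedules or logs its tasks.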

In the earlier example that sends "stop" to someActor, it may be more convenient to simply stop the actor when it has finished shutting down, rather than have it send back a done message, and to have the shutdown task not complete until the actor is terminated. A convenience method is provided that adds a task that sends a message to the actor and then watches its termination:

Scala
CoordinatedShutdown(system).addActorTerminationTask(
  CoordinatedShutdown.PhaseBeforeServiceUnbind,
  "someTaskName",
  someActor,
  Some("stop"))
Java
CoordinatedShutdown.get(system)
    .addActorTerminationTask(
        CoordinatedShutdown.PhaseBeforeServiceUnbind(),
        "someTaskName",
        someActor,
        Optional.of("stop"));

Tasks should typically be registered as early as possible after system startup. When the coordinated shutdown runs, the tasks that have been registered will be performed, but tasks that are added too late will not be run.

To start the coordinated shutdown process you can invoke run (Scala) or runAll (Java) on the CoordinatedShutdown extension:

Scala
val done: Future[Done] = CoordinatedShutdown(system).run(CoordinatedShutdown.UnknownReason)
Java
CompletionStage<Done> done =
    CoordinatedShutdown.get(system).runAll(CoordinatedShutdown.unknownReason());

It’s safe to call the run (Scala) or runAll (Java) method multiple times. The shutdown process will only run once.

Running the coordinated shutdown also means that the ActorSystem will be terminated in the last phase. By default, the JVM is not forcefully stopped (it will stop once all non-daemon threads have terminated). To enable a hard System.exit as a final action you can configure:

akka.coordinated-shutdown.exit-jvm = on

The coordinated shutdown process can also be started by calling ActorSystem.terminate().

When using Akka Cluster, the CoordinatedShutdown will automatically run when the cluster node sees itself as Exiting. That is, a leave request issued from another node will trigger the shutdown process on the leaving node. Tasks for graceful leaving of the cluster, including graceful shutdown of Cluster Singletons and Cluster Sharding, are added automatically when Akka Cluster is used. Running the shutdown process will therefore also trigger the graceful leaving if it’s not already in progress.

By default, the CoordinatedShutdown will be run when the JVM process exits, for example when it receives a SIGTERM signal via kill (SIGINT, i.e. Ctrl-C, does not work). This behavior can be disabled with:

akka.coordinated-shutdown.run-by-jvm-shutdown-hook=off

If you have application-specific JVM shutdown hooks, it’s recommended that you register them via the CoordinatedShutdown so that they run before Akka's internal shutdown hooks, e.g. those shutting down Akka Remoting (Artery).

Scala
CoordinatedShutdown(system).addJvmShutdownHook {
  println("custom JVM shutdown hook...")
}
Java
CoordinatedShutdown.get(system)
    .addJvmShutdownHook(() -> System.out.println("custom JVM shutdown hook..."));

For some tests it may be undesirable to terminate the ActorSystem via CoordinatedShutdown. You can disable that by adding the following to the configuration of the ActorSystem that is used in the test:

# Don't terminate ActorSystem via CoordinatedShutdown in tests
akka.coordinated-shutdown.terminate-actor-system = off
akka.coordinated-shutdown.run-by-actor-system-terminate = off
akka.coordinated-shutdown.run-by-jvm-shutdown-hook = off
akka.cluster.run-coordinated-shutdown-when-down = off