How to inspect the health of a service

Being able to inspect from the outside whether a service is functioning properly is essential for operating it.

Health checks are generally distinguished into two categories:

Readiness checks

Is the service ready to receive external traffic?

Liveness checks

Should the service be left running?

Readiness checks can be used to decide whether a load balancer should route traffic to the service, whereas liveness checks can be used in environments that can restart a hung process.

This distinction matches Kubernetes health checks. See "Kubernetes Liveness and Readiness Probes: How to Avoid Shooting Yourself in the Foot" for a good overview of how to use readiness and liveness probes.

Using Akka Management for health checks

The Akka Management library includes support for exposing readiness and liveness checks via HTTP.

The dependencies for Akka Management include the core module and the cluster HTTP module for cluster inspection. As Akka Management uses Spray JSON internally, make sure to add that dependency with exactly the same version as the other Akka HTTP libraries.

build.sbt
val AkkaHttpVersion = "10.6.3"
val AkkaManagementVersion = "1.5.2"
libraryDependencies ++= Seq(
  "com.lightbend.akka.management" %% "akka-management" % AkkaManagementVersion,
  "com.typesafe.akka" %% "akka-http-spray-json" % AkkaHttpVersion,
  "com.lightbend.akka.management" %% "akka-management-cluster-http" % AkkaManagementVersion,
)

When started, Akka Management creates an HTTP endpoint that gives insight into the service. That endpoint is separate from the service’s own HTTP endpoints and will in most cases use a different network interface.

src/main/resources/application.conf
akka.management {
  http {
    hostname = "127.0.0.1"
    port = 8558
  }
}

Starting Akka Management binds the management endpoint to the configured interface and port:

import akka.management.scaladsl.AkkaManagement
AkkaManagement(system).start()

We consider our service ready for requests when it has joined the Akka Cluster and the database can be reached. Akka Management’s "cluster HTTP" module automatically enables a check for cluster membership.

Akka Persistence Cassandra includes a check to validate connectivity. As the service can’t operate at all if it can’t persist events, that check becomes part of our readiness check.

src/main/resources/application.conf
akka.management {
  health-checks {
    readiness-checks {
      akka-persistence-cassandra = "akka.persistence.cassandra.healthcheck.CassandraHealthCheck"
    }
  }
}

Following Colin Breck’s advice, we do not include Cassandra connectivity checks in our liveness probe.

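Try the readiness check with curl. The examples here assume the management endpoint listens on port 9101; use the port set in akka.management.http.port for your setup: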

curl http://localhost:9101/ready

If Cassandra is not started, the check will report:

Not Healthy: Check [akka.persistence.cassandra.healthcheck.CassandraHealthCheck] not ok

After Cassandra has been started, a subsequent readiness check will result in:

OK
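
The same probe can also be issued from code, for example in a smoke test that waits for the service to become ready. The following is a minimal sketch using only the JDK's `java.net.http.HttpClient` (Java 11 or later). The `ReadinessProbe` object and its `probe` method are hypothetical names chosen for illustration and are not part of Akka Management.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.time.Duration

// Hypothetical helper, not part of Akka Management.
object ReadinessProbe {
  private val client: HttpClient =
    HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(2)).build()

  // Sends a GET request to the given URL and returns (HTTP status code, response body).
  // A connection failure or timeout surfaces as an exception.
  def probe(url: String): (Int, String) = {
    val request = HttpRequest
      .newBuilder(URI.create(url))
      .timeout(Duration.ofSeconds(2))
      .GET()
      .build()
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    (response.statusCode(), response.body())
  }
}
```

Calling `ReadinessProbe.probe("http://localhost:9101/ready")` returns the status code together with the same body that curl prints. A caller would typically treat a 200 status as ready and anything else, including an exception, as not ready.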

We haven’t added anything application-specific to the liveness check, but we can still try it with curl:

curl http://localhost:9101/alive

This should result in:

OK
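
Should an application-specific liveness check become necessary, Akka Management accepts a class that implements `() => Future[Boolean]` and has either a no-argument constructor or a constructor taking an `ActorSystem`. The sketch below is a hypothetical example and not part of the service described here: `Heartbeat`, `HeartbeatLivenessCheck` and the 30 second threshold are assumptions made for illustration. In line with the advice above, it only looks at in-process state and does not contact the database.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.concurrent.Future

// Hypothetical heartbeat: some part of the application is assumed to call beat() periodically.
object Heartbeat {
  private val lastBeatMillis = new AtomicLong(System.currentTimeMillis())

  // Records that the application made progress just now.
  def beat(): Unit = lastBeatMillis.set(System.currentTimeMillis())

  // Milliseconds elapsed since the last recorded beat.
  def ageMillis(): Long = System.currentTimeMillis() - lastBeatMillis.get()
}

// Hypothetical liveness check: reports healthy while the last heartbeat is recent enough.
class HeartbeatLivenessCheck extends (() => Future[Boolean]) {
  private val maxAgeMillis = 30000L // assumed threshold, tune for the application

  override def apply(): Future[Boolean] =
    Future.successful(Heartbeat.ageMillis() <= maxAgeMillis)
}
```

To activate such a check, its fully qualified class name would be listed under `akka.management.health-checks.liveness-checks`, in the same way the Cassandra check is listed under `readiness-checks`. The heartbeat lives in a singleton object because Akka Management instantiates the check class itself, so the check cannot be handed application state through its constructor beyond the `ActorSystem`.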

Inspecting Akka Cluster state

The cluster HTTP module of Akka Management also exposes other cluster status information that we might want to inspect.

With simple HTTP requests we can see which nodes make up the current Akka Cluster.

curl http://localhost:9101/cluster/members

With only one node started the response looks like this:

{
    "leader":"akka://<system>@<host>:2551",
    "members":[
      {
        "node":"akka://<system>@<host>:2551",
        "nodeUid":"1325710108960625550",
        "roles":["dc-default"],
        "status":"Up"
      }
    ],
    "oldest":"akka://<system>@<host>:2551",
    "oldestPerRole":{"dc-default":"akka://<system>@<host>:2551"},
    "selfNode":"akka://<system>@<host>:2551",
    "unreachable":[]
}

The Akka Management reference documentation shows other parts of this API.