How to inspect the health of a service
Being able to inspect from the outside whether a service is functioning properly is essential for operating it.
Health checks are generally distinguished into two categories:
- Readiness checks: is the service ready to receive external traffic?
- Liveness checks: should the service be left running?
Readiness checks can be used to decide whether a load balancer should route traffic to the service, whereas liveness checks are useful in environments that can restart a hung process.
This distinction matches Kubernetes health checks. See "Kubernetes Liveness and Readiness Probes: How to Avoid Shooting Yourself in the Foot" by Colin Breck for a good overview of how to use readiness and liveness probes.
Using Akka Management for health checks
The Akka Management library includes support for exposing readiness and liveness checks via HTTP.
The dependencies for Akka Management include the core module and the cluster HTTP module for cluster inspection. As Akka Management uses Spray JSON internally, make sure to add that dependency with exactly the same version as the other Akka HTTP libraries.
val AkkaHttpVersion = "10.7.0"
val AkkaManagementVersion = "1.6.0"
libraryDependencies ++= Seq(
  "com.lightbend.akka.management" %% "akka-management" % AkkaManagementVersion,
  "com.typesafe.akka" %% "akka-http-spray-json" % AkkaHttpVersion,
  "com.lightbend.akka.management" %% "akka-management-cluster-http" % AkkaManagementVersion,
)
On startup, Akka Management creates an HTTP endpoint that gives insight into the service. That endpoint is separate from the service’s HTTP endpoints and will in most cases use a different network interface.
akka.management {
  http {
    hostname = "127.0.0.1"
    port = 8558
  }
}
The management endpoint starts on the configured interface and port.
AkkaManagement(system).start()
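As a minimal sketch of where this call might live, the following main method creates an actor system and then starts the management endpoint, logging the bound address or shutting down on failure. The object name `Main`, the system name `ExampleService`, and the empty guardian behavior are placeholders that do not come from this guide, and the sketch assumes `akka-actor-typed` and the Akka Management modules are on the classpath.

```scala
import akka.actor.typed.ActorSystem
import akka.actor.typed.scaladsl.Behaviors
import akka.management.scaladsl.AkkaManagement

import scala.concurrent.ExecutionContext
import scala.util.{Failure, Success}

// Hypothetical entry point; a real service would use its own guardian behavior.
object Main {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem[Nothing](Behaviors.empty, "ExampleService")
    implicit val ec: ExecutionContext = system.executionContext

    // start() completes with the URI the management endpoint is bound to.
    AkkaManagement(system).start().onComplete {
      case Success(uri) =>
        system.log.info(s"Akka Management endpoint bound to $uri")
      case Failure(ex) =>
        system.log.error("Failed to start Akka Management", ex)
        system.terminate()
    }
  }
}
```

Terminating the system when the endpoint cannot be bound is one possible design choice; it avoids running a node that cannot be health-checked.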
We consider our service ready for requests when it has joined the Akka Cluster and the database can be reached. Akka Management’s "cluster HTTP" module automatically enables a check for cluster membership.
Akka Persistence Cassandra includes a check to validate connectivity. As the service can’t operate at all if it can’t persist events, that check becomes part of our readiness check.
akka.management {
  health-checks {
    readiness-checks {
      akka-persistence-cassandra = "akka.persistence.cassandra.healthcheck.CassandraHealthCheck"
    }
  }
}
Following Colin Breck’s advice, we do not include Cassandra connectivity checks in our liveness probe.
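To illustrate what a class registered under `readiness-checks` could look like, here is a minimal sketch of an application-specific check. It relies on the understanding that Akka Management accepts any class that is a `() => Future[Boolean]` with either a no-argument constructor or a single `ActorSystem` argument; verify this against the Akka Management documentation for your version. The names `WarmUpState`, `WarmUpHealthCheck`, and the `com.example` package are hypothetical.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.Future

// Hypothetical application-level flag, set to true once start-up work is done.
object WarmUpState {
  val completed = new AtomicBoolean(false)
}

// Hypothetical readiness check: reports ready only after warm-up has completed.
// Assumed registration (not from this guide):
//   akka.management.health-checks.readiness-checks {
//     warm-up = "com.example.WarmUpHealthCheck"
//   }
class WarmUpHealthCheck extends (() => Future[Boolean]) {
  override def apply(): Future[Boolean] =
    Future.successful(WarmUpState.completed.get())
}
```

The check only reads local state and returns immediately, which keeps it cheap to call on every probe.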
Try the readiness check with curl:
curl http://localhost:9101/ready
If Cassandra is not started, the check will report:
Not Healthy: Check [akka.persistence.cassandra.healthcheck.CassandraHealthCheck] not ok
After Cassandra has started, subsequent readiness checks will result in:
OK
We haven’t added anything application-specific to the liveness check, but we can try it with curl:
curl http://localhost:9101/alive
This should result in:
OK
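The same probes can also be issued from code, for example in a smoke test. The sketch below uses the JDK's `java.net.http.HttpClient` (Java 11 or later) rather than any Akka API. It assumes that a healthy check answers with HTTP status 200, as the `OK` responses above suggest, and treats every other status as unhealthy. The `HealthProbe` object and its members are hypothetical names.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.time.Duration

// Hypothetical helper for calling the /ready and /alive endpoints.
object HealthProbe {
  sealed trait Result
  case object Healthy extends Result
  final case class Unhealthy(status: Int, body: String) extends Result

  // Pure classification, kept separate so it can be tested without a running service.
  // Assumption: status 200 means healthy, anything else means unhealthy.
  def classify(status: Int, body: String): Result =
    if (status == 200) Healthy else Unhealthy(status, body.trim)

  // Sends a blocking GET request; throws an IOException if the endpoint is unreachable.
  def probe(url: String, client: HttpClient = HttpClient.newHttpClient()): Result = {
    val request = HttpRequest
      .newBuilder(URI.create(url))
      .timeout(Duration.ofSeconds(2))
      .GET()
      .build()
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    classify(response.statusCode(), response.body())
  }
}
```

With the service running, `HealthProbe.probe("http://localhost:9101/ready")` performs the same request as the readiness curl command above.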
Inspecting Akka Cluster state
The cluster HTTP module of Akka Management also exposes other cluster status information that we might want to inspect.
With simple HTTP requests we can see which nodes make up the current Akka Cluster.
curl http://localhost:9101/cluster/members
With only one node started the response looks like this:
{
  "leader":"akka://[email protected]:2551",
  "members":[
    {
      "node":"akka://[email protected]:2551",
      "nodeUid":"1325710108960625550",
      "roles":["dc-default"],
      "status":"Up"
    }
  ],
  "oldest":"akka://[email protected]:2551",
  "oldestPerRole":{"dc-default":"akka://[email protected]:2551"},
  "selfNode":"akka://[email protected]:2551",
  "unreachable":[]
}
The Akka Management reference documentation shows other parts of this API.