Rolling Updates
Rolling updates allow you to update an application by gradually replacing old nodes with new ones. This ensures that the application remains available throughout the update process, with minimal disruption to clients.
Graceful shutdown
Akka Cluster can handle hard failures using a downing provider such as Lightbend’s Split Brain Resolver. However, this should not be relied upon for regular rolling redeploys. Features such as ClusterSingletons and ClusterSharding can safely restart actors on new nodes far more quickly when it is certain that a node has shut down rather than crashed.
Graceful leaving happens with the default settings because it is part of Coordinated Shutdown. Just ensure that the node is sent a SIGTERM and not a SIGKILL. Environments such as Kubernetes do this. If the JVM is wrapped with a script, it is important that the script forwards the signal.
Upon receiving a SIGTERM, Coordinated Shutdown will:
- Perform a Cluster(system).leave on itself.
- Change the status of the member to Exiting while allowing any shards to be shut down gracefully and ClusterSingletons to be migrated if this was the oldest node. Finally, the node is removed from the Akka Cluster membership.
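The difference between SIGTERM and SIGKILL can be illustrated with a plain JVM shutdown hook, which runs on SIGTERM and on normal exit but never on SIGKILL. The following is a minimal stdlib-only sketch, not Akka code: the class name and the step names are hypothetical and only mirror the steps listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the order of a graceful stop with a plain JVM shutdown hook. */
final class GracefulStopSketch {

    /** Runs the (hypothetical) steps in order and returns the names of the completed ones. */
    static List<String> runSteps() {
        List<String> completed = new ArrayList<>();
        completed.add("leave-cluster");      // the member asks to leave, status becomes Exiting
        completed.add("stop-shards");        // shards are shut down gracefully
        completed.add("migrate-singletons"); // only relevant if this was the oldest node
        completed.add("removed");            // the member is removed from the cluster
        return completed;
    }

    public static void main(String[] args) {
        // Shutdown hooks run on SIGTERM and on normal exit. SIGKILL stops the JVM
        // immediately, so none of the steps would get a chance to run.
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> System.out.println("graceful stop: " + runSteps())));
        System.out.println("main finished, the hook runs next");
    }
}
```

If a wrapper script starts the JVM without forwarding the signal, the hook in this sketch would not run when the container is stopped, which is the situation the paragraph above warns about.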
Number of nodes to redeploy at once
Akka bootstrap requires a stable-period where service discovery returns a stable set of contact points. When doing rolling updates it is best to wait for a node (or group of nodes) to finish joining the cluster before adding and removing other nodes.
Cluster Singletons
ClusterSingletons run on the oldest node in the cluster. To avoid singletons moving during every node deployment, it is advised to start a rolling redeploy at the newest node. Then ClusterSingletons only move once. Cluster Sharding uses a singleton internally, so this is important even if you are not using singletons directly.
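The recommended order can be sketched as a simple sort: redeploy the newest node first and the oldest node, which hosts the singletons, last. This is a minimal stdlib-only illustration. The Node type and its fields are hypothetical and are not part of any Akka API.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class RedeployOrderSketch {

    /** Hypothetical node description: a name plus the time the node joined the cluster. */
    static final class Node {
        private final String name;
        private final Instant joinedAt;

        Node(String name, Instant joinedAt) {
            this.name = name;
            this.joinedAt = joinedAt;
        }

        String name() { return name; }
        Instant joinedAt() { return joinedAt; }
    }

    /** Returns the nodes newest first, so the oldest node is replaced last. */
    static List<Node> newestFirst(List<Node> nodes) {
        return nodes.stream()
            .sorted(Comparator.comparing(Node::joinedAt).reversed())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("node-a", Instant.parse("2024-01-01T10:00:00Z")),
            new Node("node-b", Instant.parse("2024-01-01T10:05:00Z")),
            new Node("node-c", Instant.parse("2024-01-01T10:10:00Z")));
        // Prints node-c, node-b, node-a: the singleton host node-a goes last.
        newestFirst(nodes).forEach(n -> System.out.println(n.name()));
    }
}
```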
Kubernetes Rolling Updates
Starting from Kubernetes v1.22, ReplicaSets are not scaled down with the youngest node first (see details here). That is because after some time all nodes that were brought up in the same time bucket are treated as equally old and the node to scale down first is chosen randomly.
As mentioned previously, the oldest node in an Akka cluster has a special role as it hosts singletons. If the oldest node in a cluster changes frequently, singletons need to be moved around as well which can have undesired consequences.
This module provides the Pod Deletion Cost extension which automatically annotates older pods so that they are selected last when removing nodes, providing for better overall stability for the cluster operations.
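The idea behind the annotation can be sketched as follows. Kubernetes prefers to remove pods with a lower controller.kubernetes.io/pod-deletion-cost first, and pods without the annotation count as cost 0, so giving the oldest pods the highest costs keeps them around longest. The concrete cost values, the limit parameter, and the class name below are illustrative assumptions, not necessarily what the extension uses.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PodDeletionCostSketch {

    /**
     * Assigns a deletion cost to the oldest pods, highest cost for the oldest.
     *
     * @param podsOldestFirst pod names sorted by cluster age, oldest first
     * @param limit how many of the oldest pods to annotate (illustrative parameter)
     */
    static Map<String, Integer> costs(List<String> podsOldestFirst, int limit) {
        final int maxCost = 10000; // illustrative value
        final int step = 100;      // illustrative value
        Map<String, Integer> result = new LinkedHashMap<>();
        int count = Math.min(limit, podsOldestFirst.size());
        for (int i = 0; i < count; i++) {
            result.put(podsOldestFirst.get(i), maxCost - i * step);
        }
        return result;
    }

    public static void main(String[] args) {
        // pod-d is not annotated, so it keeps the implicit cost 0 and is removed first.
        System.out.println(costs(List.of("pod-a", "pod-b", "pod-c", "pod-d"), 3));
    }
}
```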
Project Info
Project Info: Akka Rolling Update Kubernetes | |
---|---|
Artifact | com.lightbend.akka.management : akka-rolling-update-kubernetes : 1.5.3 |
JDK versions | Eclipse Temurin JDK 11, Eclipse Temurin JDK 17, Eclipse Temurin JDK 21 |
Scala versions | 2.13.14, 3.3.3 |
License | |
Readiness level | Since 1.3.0, 2023-04-01 |
Home page | https://akka.io/ |
API documentation | |
Forums | |
Release notes | GitHub releases |
Issues | GitHub issues |
Sources | https://github.com/akka/akka-management |
Dependency
The Akka dependencies are available from Akka’s library repository. To access them there, you need to configure the URL for this repository.
- sbt
resolvers += "Akka library repository".at("https://repo.akka.io/maven")
- Gradle
repositories {
  mavenCentral()
  maven {
    url "https://repo.akka.io/maven"
  }
}
- Maven
<project>
  ...
  <repositories>
    <repository>
      <id>akka-repository</id>
      <name>Akka library repository</name>
      <url>https://repo.akka.io/maven</url>
    </repository>
  </repositories>
</project>
Add akka-rolling-update-kubernetes to your dependency management tool:
- sbt
val AkkaManagementVersion = "1.5.3"
libraryDependencies += "com.lightbend.akka.management" %% "akka-rolling-update-kubernetes" % AkkaManagementVersion
- Gradle
def versions = [
  AkkaManagementVersion: "1.5.3",
  ScalaBinary: "2.13"
]
dependencies {
  implementation "com.lightbend.akka.management:akka-rolling-update-kubernetes_${versions.ScalaBinary}:${versions.AkkaManagementVersion}"
}
- Maven
<properties>
  <akka.management.version>1.5.3</akka.management.version>
  <scala.binary.version>2.13</scala.binary.version>
</properties>
<dependencies>
  <dependency>
    <groupId>com.lightbend.akka.management</groupId>
    <artifactId>akka-rolling-update-kubernetes_${scala.binary.version}</artifactId>
    <version>${akka.management.version}</version>
  </dependency>
</dependencies>
Using
The Akka Pod Deletion Cost extension must be started. This can be done either through config or programmatically.
Through config
Listing the PodDeletionCost extension among the autoloaded akka.extensions in application.conf will also cause it to autostart:
akka.extensions = ["akka.rollingupdate.kubernetes.PodDeletionCost"]
If management or bootstrap configuration is incorrect, the autostart will log an error and terminate the actor system.
Programmatically
- Scala
  // Starting the pod deletion cost annotator
  PodDeletionCost(system).start()
- Java
  // Starting the pod deletion cost annotator
  PodDeletionCost.get(system).start();
Configuration
The following configuration is required. More details for each setting, and additional configuration options, can be found in reference.conf:
akka.rollingupdate.kubernetes.pod-name: this can be provided by setting the KUBERNETES_POD_NAME environment variable to metadata.name on the Kubernetes container spec.
env:
- name: KUBERNETES_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
Additionally, the pod annotator needs to know which namespace the pod belongs to. By default, this is detected by reading the namespace from the service account secret in /var/run/secrets/kubernetes.io/serviceaccount/namespace, but it can be overridden by setting akka.rollingupdate.kubernetes.namespace or by providing the KUBERNETES_NAMESPACE environment variable.
env:
- name: KUBERNETES_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
Role based access control
This extension uses the Kubernetes API to set the pod-deletion-cost annotation on its own pod. To do that, it requires permission to patch the pod configuration. Each pod only needs access to the namespace it is in. If this is a security concern in your environment, you may instead use the Alternative with Custom Resource Definition.
An example RBAC that can be used:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-patcher
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["patch"] # requires "patch" to annotate the pod
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: annotate-pods
subjects:
- kind: User
  name: system:serviceaccount:<YOUR NAMESPACE>:default
roleRef:
  kind: Role
  name: pod-patcher
  apiGroup: rbac.authorization.k8s.io
This defines a Role that is allowed to patch pod objects and a RoleBinding that gives the default service user this role in <YOUR NAMESPACE>.
This RBAC example covers only the permissions needed for the PodDeletionCost extension specifically. However, you will usually also be using the Kubernetes API for discovery and bootstrap of your cluster, so you will need to combine this with any other roles already configured, either by keeping them separate or by merging them into a single role.
Alternative with Custom Resource Definition
If allowing “patch” in RBAC as described above is a security concern in your environment, you can instead use an intermediate Custom Resource Definition (CRD). Instead of updating the controller.kubernetes.io/pod-deletion-cost annotation directly, the extension updates a PodCost custom resource, and an operator then reconciles that resource and updates the pod-deletion-cost annotation of the pod resource.
You would have to write the Kubernetes operator that watches the PodCost resource and updates the controller.kubernetes.io/pod-deletion-cost annotation of the corresponding pod resource. This operator is not provided by Akka.
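The core of such an operator is a reconcile step that compares the costs listed in the PodCost resource with the annotations currently on the pods. The following is a minimal stdlib-only sketch of that comparison, with no Kubernetes client calls. The class, the PodCostEntry type, and the method names are hypothetical, and only the podName and cost fields of the resource are modelled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PodCostReconcilerSketch {

    static final String ANNOTATION = "controller.kubernetes.io/pod-deletion-cost";

    /** One entry of the PodCost resource's pods array (only the fields used here). */
    static final class PodCostEntry {
        final String podName;
        final int cost;

        PodCostEntry(String podName, int cost) {
            this.podName = podName;
            this.cost = cost;
        }
    }

    /**
     * Computes which pods need their annotation patched.
     *
     * @param desired entries read from the PodCost resource
     * @param current current annotation value per pod name (absent if not annotated)
     * @return pod name mapped to the annotation value that should be written
     */
    static Map<String, String> patchesNeeded(List<PodCostEntry> desired, Map<String, String> current) {
        Map<String, String> patches = new LinkedHashMap<>();
        for (PodCostEntry entry : desired) {
            // Kubernetes annotation values are strings, so the cost is compared as text.
            String wanted = Integer.toString(entry.cost);
            if (!wanted.equals(current.get(entry.podName))) {
                patches.put(entry.podName, wanted);
            }
        }
        return patches;
    }

    public static void main(String[] args) {
        List<PodCostEntry> desired = List.of(
            new PodCostEntry("pod-a", 10000),
            new PodCostEntry("pod-b", 9900));
        // pod-a is already up to date, so only pod-b needs a patch.
        System.out.println(ANNOTATION + " -> " + patchesNeeded(desired, Map.of("pod-a", "10000")));
    }
}
```

A real operator would additionally watch the resource for changes and apply the patches through the Kubernetes API. Both are left out of this sketch.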
Enable updates of custom resource with configuration:
akka.rollingupdate.kubernetes.custom-resource.enabled = true
The PodCost CRD:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: podcosts.akka.io
spec:
  group: akka.io
  versions:
    - name: v1
      storage: true
      served: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                version:
                  type: string
                pods:
                  type: array
                  items:
                    type: object
                    properties:
                      # the name of the pod that should be updated with the pod-deletion-cost annotation
                      podName:
                        type: string
                      # the value of the controller.kubernetes.io/pod-deletion-cost annotation
                      cost:
                        type: integer
                      # address, uid and time are used for cleanup of removed members
                      address:
                        type: string
                      # address, uid and time are used for cleanup of removed members
                      uid:
                        type: integer
                      # address, uid and time are used for cleanup of removed members
                      time:
                        type: integer
  scope: Namespaced
  names:
    # kind is normally the CamelCased singular type. Your resource manifests use this.
    kind: PodCost
    listKind: PodCostList
    # singular name to be used as an alias on the CLI and for display
    singular: podcost
    # plural name to be used in the URL: /apis/<group>/<version>/<plural>
    plural: podcosts
The RBAC for the application to update the PodCost CR, which replaces the “patch” permission on the “pods” resource:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: podcost-access
rules:
- apiGroups: ["akka.io"]
  resources: ["podcosts"]
  verbs: ["get", "create", "update", "delete", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: podcost-access
subjects:
- kind: User
  name: system:serviceaccount:<YOUR NAMESPACE>:default
roleRef:
  kind: Role
  name: podcost-access
  apiGroup: rbac.authorization.k8s.io
app-version from Deployment
When using Cluster Sharding, it is recommended to define an increasing akka.cluster.app-version configuration property for each rollout.
This works well unless you use kubectl rollout undo, which deploys the previous ReplicaSet configuration and therefore the previous value of that property.
To fix this, you can use AppVersionRevision to read the current deployment.kubernetes.io/revision annotation (part of the ReplicaSet) from the Kubernetes Deployment via the Kubernetes API. This revision always increases, including during a rollback.
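The difference between the configured value and the revision can be illustrated with a small simulation. This is a stdlib-only sketch with hypothetical class and method names. It models only the fact that an undo redeploys the previous configuration under a new, higher revision number.

```java
import java.util.ArrayList;
import java.util.List;

final class RevisionSketch {

    /** One rollout: the Deployment revision and the app-version baked into its config. */
    static final class Rollout {
        final long revision;
        final String configuredAppVersion;

        Rollout(long revision, String configuredAppVersion) {
            this.revision = revision;
            this.configuredAppVersion = configuredAppVersion;
        }
    }

    private final List<Rollout> history = new ArrayList<>();
    private long nextRevision = 1;

    /** Deploys a new configuration, which always gets the next revision number. */
    Rollout rollOut(String configuredAppVersion) {
        Rollout rollout = new Rollout(nextRevision++, configuredAppVersion);
        history.add(rollout);
        return rollout;
    }

    /** Mimics a rollout undo: the previous configuration returns under a higher revision. */
    Rollout undo() {
        if (history.size() < 2) {
            throw new IllegalStateException("no previous rollout to return to");
        }
        Rollout previous = history.get(history.size() - 2);
        return rollOut(previous.configuredAppVersion);
    }

    public static void main(String[] args) {
        RevisionSketch deployment = new RevisionSketch();
        deployment.rollOut("1.0");
        deployment.rollOut("1.1");
        Rollout afterUndo = deployment.undo();
        // The configured version went backwards (1.0) while the revision kept increasing (3).
        System.out.println(afterUndo.configuredAppVersion + " at revision " + afterUndo.revision);
    }
}
```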
Using
The AppVersionRevision extension must be started. This can be done either through config or programmatically.
Through config
Listing the AppVersionRevision extension among the autoloaded akka.extensions in application.conf will also cause it to autostart:
akka.extensions = ["akka.rollingupdate.kubernetes.AppVersionRevision"]
If the extension configuration is incorrect, the autostart will log an error and terminate the actor system.
Programmatically
- Scala
  // Starting the AppVersionRevision extension
  // preferred to be called before ClusterBootstrap
  AppVersionRevision(system).start()
- Java
  // Starting the AppVersionRevision extension
  // preferred to be called before ClusterBootstrap
  AppVersionRevision.get(system).start();
Configuration
The following configuration is required. More details for each setting, and additional configuration options, can be found in reference.conf:
akka.rollingupdate.kubernetes.pod-name: this can be provided by setting the KUBERNETES_POD_NAME environment variable to metadata.name on the Kubernetes container spec.
env:
- name: KUBERNETES_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
Additionally, the extension needs to know which namespace the pod belongs to. By default, this is detected by reading the namespace from the service account secret in /var/run/secrets/kubernetes.io/serviceaccount/namespace, but it can be overridden by setting akka.rollingupdate.kubernetes.namespace or by providing the KUBERNETES_NAMESPACE environment variable.
env:
- name: KUBERNETES_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
Role based access control
Make sure to grant access to the corresponding RBAC apiGroups and resources, like this:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["replicasets"]
  verbs: ["get", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
subjects:
- kind: ServiceAccount
  name: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io