Regions
Projects in Akka can span regions, with data automatically replicated between all of them. This increases availability, because the regions can be separate cloud or geographic regions, or separate logical regions within the same cloud or geographic region. This gives you a high level of control over failure domains, or fault boundaries, in your applications. This is sometimes referred to as blast radius control.
Regions are specified in the project configuration. All services in the project are deployed to all regions. One of the regions is designated as the primary region. The primary region is the source from which resources (services, routes, secrets, etc.) are replicated to the other regions. By default the primary region is the first one added to the project at deployment time.
The primary region also indicates where primary data copies reside for stateful components like Event Sourced Entities or Workflow when using the static mode.
Regions appear at two different scopes in Akka. The first is the organization scope, which conveys which regions are available to your organization. The second is the project scope, which conveys which regions a specific project is bound to.
To see what regions have been configured for your project, you can run:
akka project regions list
Adding a region to a project
A region can be added to a project if the organization that owns the project has access to that region. To see which regions your organization has access to, run the akka regions list command:
akka regions list --organization my-organization
To add one of these regions to your project, run:
akka project regions add gcp-us-east1
When you deploy a service it will run in all regions of the project. When you add a region to a project the existing services will automatically start in the new region.
Selecting primary for stateful components
Stateful components like Event Sourced Entities or Workflow can be replicated to other regions. For each stateful component instance there is a primary region, which handles all write requests. Read requests can be served from any region. See Event Sourced Entity replication and Workflow replication for more information about read and write requests.
There are two operational choices for deciding where the primary is located:
- Static mode: one region is defined as the primary for the project, and all stateful component instances will use that region as primary
- Dynamic mode: the primary is selected by each individual component instance based on where the first event is persisted
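As a rough illustration of the two modes, the following Python sketch models how the primary of a single component instance could be determined. It is not Akka code: the function name, its parameters, and the region name other-region are assumptions made for this example.

```python
from typing import Optional


def primary_for_instance(
    mode: str,
    project_primary: str,
    first_event_region: Optional[str] = None,
) -> Optional[str]:
    """Return the region acting as primary for one component instance.

    Illustrative model only; all names here are hypothetical.
    """
    if mode == "static":
        # Static mode: every instance uses the project's primary region.
        return project_primary
    if mode == "dynamic":
        # Dynamic mode: the primary is the region where the instance's
        # first event was persisted. None means "not created yet".
        return first_event_region
    raise ValueError(f"unsupported primary selection mode: {mode}")


# gcp-us-east1 appears in this page; other-region is a made-up name.
static_primary = primary_for_instance("static", "gcp-us-east1", "other-region")
dynamic_primary = primary_for_instance("dynamic", "gcp-us-east1", "other-region")
```

In static mode the region of the first write is ignored, while in dynamic mode the project primary plays no role for that instance.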
Before changing the primary selection mode, make sure that you understand and follow the steps described in the How to.
The static mode is used by default. To use dynamic mode you need to deploy the service with a service descriptor:
name: my-service
spec:
  image: my-container-uri/container-name:tag-name
  replication:
    mode: replicated-read
    replicatedRead:
      primarySelectionMode: dynamic
When using dynamic mode, the application must ensure that a specific stateful component instance is not created concurrently in more than one region. Although data will still be replicated if this occurs, there are no guarantees regarding the order of these initial changes, which could lead to conflicting states across regions. Once the initial updates are replicated, a single primary instance is selected.
It is possible to switch between static and dynamic mode, but this should only be done with careful consideration of the consequences. For example, when changing the primary, not all updates may have been replicated and the new primary may not be fully up to date. For this reason there is a third mode: a read-only mode for all regions, in which all write requests are rejected. It can be used as an intermediate stage to ensure that all updates have been replicated before the primary is changed.
To use this read-only mode for all regions you set primarySelectionMode to none in the service descriptor:
name: my-service
spec:
  image: my-container-uri/container-name:tag-name
  replication:
    mode: replicated-read
    replicatedRead:
      primarySelectionMode: none
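To make the effect of the modes on requests concrete, here is a small Python sketch of the routing rules described above: reads can be served locally, writes go to the instance's primary, and the none mode rejects writes. The function and its parameter names are hypothetical and only model the described behavior.

```python
def route_request(kind: str, mode: str, local_region: str, primary_region: str) -> str:
    """Return the region that handles a request, or raise if it is rejected.

    Illustrative model of the rules in the text, not an Akka API.
    """
    if kind == "read":
        # Read requests can be served from any region.
        return local_region
    if kind != "write":
        raise ValueError(f"unknown request kind: {kind}")
    if mode == "none":
        # Read-only mode for all regions: every write is rejected.
        raise PermissionError("write rejected: primarySelectionMode is none")
    # In static and dynamic mode the instance's primary handles the write.
    return primary_region


read_target = route_request("read", "none", "other-region", "gcp-us-east1")
write_target = route_request("write", "static", "other-region", "gcp-us-east1")
```

Reads keep working in the none mode, which is why it can serve as an intermediate stage while replication catches up.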
To use the static primary selection mode again you set primarySelectionMode to static in the service descriptor:
name: my-service
spec:
  image: my-container-uri/container-name:tag-name
  replication:
    mode: replicated-read
    replicatedRead:
      primarySelectionMode: static
Setting the primary region of a project
Changing the primary region of a project is how you control failover or migration in Akka.
The primary region of a project is also the region that will be used as primary for stateful components in the static selection mode. Changing the primary should only be done with careful consideration of the consequences, and it is recommended to first change to the read-only mode in all regions. See Selecting primary for stateful components.
Key Value Entities are currently not replicated across regions. The data of all Key Value Entities exists only in the primary region. All requests to Key Value Entities in other regions are forwarded to the primary region. This means that if Key Value Entities are used in a multi-region project the primary region should not be changed, since the data will not exist in the new region. Full replication of Key Value Entities is coming soon.
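The consequence for Key Value Entities can be pictured with a toy model in Python. The per-region dictionaries, the entity id, its value, and the region name other-region are all invented for this sketch; it only mirrors the statement that the data exists in the primary region alone.

```python
from typing import Dict, Optional

# Hypothetical Key Value Entity state per region. Only the primary region
# (gcp-us-east1 here) holds data, because it is not replicated.
kv_state: Dict[str, Dict[str, str]] = {
    "gcp-us-east1": {"cart-1": "3 items"},
    "other-region": {},
}


def get_kv_entity(entity_id: str, primary: str) -> Optional[str]:
    """Requests arriving in any region are forwarded to the primary's store."""
    return kv_state[primary].get(entity_id)


# With the original primary the entity is found.
before_change = get_kv_entity("cart-1", primary="gcp-us-east1")
# After changing the primary, the new region has no data for it.
after_change = get_kv_entity("cart-1", primary="other-region")
```

In this model the lookup after the change returns nothing, which is the situation the note warns about.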
Before changing the primary region, make sure that you understand and follow the steps described in the How to.
To change the primary region of a project run:
akka project regions set-primary gcp-us-east1
It may be necessary to clear the region cache when running the akka command on other machines before this change will be picked up. This can be done by running akka config clear-cache.
Managing resources in a multi-region project
Akka projects are built to span regions. To accomplish this, Akka considers resources in two ways.
Global resources
In an Akka project, services, routes, secrets, and observability configuration are all global resources in that they will deploy to all regions that the project is bound to.
The underlying replication mechanism works as follows: resources are first deployed to the primary region, and a background process then asynchronously copies them to the remaining regions. This background synchronization process is eventually consistent.
The list and get commands for multi-region resources display the sync status for global resources. These commands will show the resource in the primary region by default. You can specify which region you want to get the resource from by passing the --region flag. If you want to view the resource in all regions, you can pass the --all-regions flag.
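The eventually consistent synchronization of global resources can be sketched in Python as follows. This is a simplified model under stated assumptions, not how Akka implements it: the class, its methods, the status strings synced and pending, the resource spec, and the region name other-region are all hypothetical.

```python
from typing import Dict, List, Optional


class GlobalResource:
    """Toy model: deploy to the primary first, then copy to other regions."""

    def __init__(self, regions: List[str], primary: str) -> None:
        self.primary = primary
        self.spec_by_region: Dict[str, Optional[str]] = {r: None for r in regions}

    def deploy(self, spec: str) -> None:
        # A deployment only touches the primary region at first.
        self.spec_by_region[self.primary] = spec

    def sync_once(self) -> None:
        # Stand-in for the background process that copies the primary's
        # version of the resource to all remaining regions.
        primary_spec = self.spec_by_region[self.primary]
        for region in self.spec_by_region:
            self.spec_by_region[region] = primary_spec

    def sync_status(self) -> Dict[str, str]:
        # A region counts as synced when it matches the primary.
        primary_spec = self.spec_by_region[self.primary]
        return {
            region: "synced" if spec == primary_spec else "pending"
            for region, spec in self.spec_by_region.items()
        }


route = GlobalResource(["gcp-us-east1", "other-region"], primary="gcp-us-east1")
route.deploy("host: example.com")
status_before_sync = route.sync_status()
route.sync_once()
status_after_sync = route.sync_status()
```

Between the deployment and the next synchronization run, the regions can briefly disagree, which is what the sync status in the list and get output reflects.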
Regional resources
There are certain circumstances where it may not be appropriate to have the same resource synced to all regions. Some common reasons are as follows:
- A route may need to be served from a different hostname in each region.
- A service may require different credentials for a third party service for each region, requiring a different secret to be deployed to each region.
- A different observability configuration may be needed in different regions, such that metrics and logs are aggregated locally in the region.
To deploy a resource as a regional resource, pass the --region flag to specify which region you want to create the resource in. The --region flag must also be passed when updating or deleting the resource.
Switching between global and regional resources
If you have a global resource that you want to change to being a regional resource, this can be done by updating the resource, passing a --region flag, and passing the --force-regional flag to change it from a global to a regional resource. You must do this on the primary region first, otherwise the resource synchronization process may overwrite your changes.
If you have a regional resource that you want to change to being a global resource, this can be done by updating the resource without specifying a --region flag, but passing the --force-global flag instead. The command will perform the update in the primary region, and that configuration will be replicated to, and overwrite, the configuration in the rest of the regions.
How to
There can be several reasons for changing multi-region resources and the primary of stateful components. In this section we describe a few scenarios and provide a checklist of the recommended procedure.
Observe replication status
You can see throughput, lag, and errors in the replication section in the Control Tower. The replication lag is the time from when the events were created until they were received in the other region. Some errors may be normal, since the connections are sometimes restarted.
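The definition of replication lag is simple arithmetic. The Python sketch below computes it for one event; the function name and the example timestamps are made up for illustration, and the Control Tower may aggregate lag differently.

```python
from datetime import datetime, timezone


def replication_lag_seconds(created_at: datetime, received_at: datetime) -> float:
    """Time from when an event was created until the other region received it."""
    return (received_at - created_at).total_seconds()


# Hypothetical timestamps: created at 12:00:00.0, received at 12:00:01.5 UTC.
created = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
received = datetime(2024, 1, 1, 12, 0, 1, 500000, tzinfo=timezone.utc)
lag = replication_lag_seconds(created, received)  # 1.5 seconds
```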
Add a region
- Follow the instructions in Adding a region to a project.
- You have to deploy the services again because the container images don’t exist in the container registry of the new region, unless you use a global container registry.
- You need to expose the services in the new region.
- Stateful components are automatically replicated to the new region. This may take some time, and you can see progress in the replication section in the Control Tower. The event consumption lag will at first be high and then close to zero when the replication has been completed.
Switch from static to dynamic primary selection mode for stateful components
The default primary selection mode for stateful components is the static mode, as explained in Selecting primary for stateful components, and you might want to change that to dynamic after the first deployment. That section also describes how you change the primary selection mode with a service descriptor.
Component instances that have already been created will continue to have their primary in the original static primary region. New component instances will have their primary in the region they are first written in.
- First, change to the none primary selection mode. This is a read-only mode for all regions and all write requests will be rejected. The reason for changing to this intermediate mode is to make sure that all events have been replicated without creating new events.
- Wait until the deployment of the none primary selection mode has been successfully propagated to all regions. Observe in the Akka Console that the rolling update has been completed in all regions. You can also make sure that replicated events reach zero in the replication section in the Control Tower.
- Change to dynamic primary selection mode.
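The waiting step in the checklist above amounts to a condition over all regions. The following Python sketch expresses that condition, assuming you record by hand what the Akka Console and the Control Tower show. The RegionStatus type and its field names are hypothetical; nothing here queries Akka.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegionStatus:
    """Observations for one region, entered manually (hypothetical fields)."""

    region: str
    rollout_complete: bool          # rolling update of the none mode finished
    replicated_events_per_s: float  # throughput seen in the replication section


def safe_to_leave_read_only(statuses: List[RegionStatus]) -> bool:
    """True when every region has the none mode deployed and replication is idle."""
    return all(
        s.rollout_complete and s.replicated_events_per_s == 0
        for s in statuses
    )


ready = safe_to_leave_read_only([
    RegionStatus("gcp-us-east1", True, 0.0),
    RegionStatus("other-region", True, 0.0),
])
still_draining = safe_to_leave_read_only([
    RegionStatus("gcp-us-east1", True, 0.0),
    RegionStatus("other-region", True, 12.0),
])
```

Only when the condition holds for every region should the final mode be deployed.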
Switch from dynamic to static primary selection mode for stateful components
Static mode takes precedence over dynamic in the sense that a component instance will change its primary to the static region when there is a new write request to the component instance, and it persists a new event.
Selecting primary for stateful components describes how you change the primary selection mode with a service descriptor.
- First, change to the none primary selection mode. This is a read-only mode for all regions and all write requests will be rejected. The reason for changing to this intermediate mode is to make sure that all events have been replicated without creating new events.
- Wait until the deployment of the none primary selection mode has been successfully propagated to all regions. Observe in the Akka Console that the rolling update has been completed in all regions. You can also make sure that replicated events reach zero in the replication section in the Control Tower.
- Change to static primary selection mode.
Change the static primary region for stateful components
You might want to change the static primary for stateful components if you migrate from one region to another, or need to bring down the primary region for maintenance for a while.
Selecting primary for stateful components describes how you change the primary selection mode with a service descriptor.
- First, change to the none primary selection mode. This is a read-only mode for all regions and all write requests will be rejected. The reason for changing to this intermediate mode is to make sure that all events have been replicated without creating new events.
- Wait until the deployment of the none primary selection mode has been successfully propagated to all regions. Observe in the Akka Console that the rolling update has been completed in all regions. You can also make sure that replicated events reach zero in the replication section in the Control Tower.
- Follow the instructions in Setting the primary region of a project.
- Change to static primary selection mode.
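The three procedures above share the same shape: go read-only, wait, optionally change the project primary, then deploy the target mode. The Python sketch below generates that ordered checklist as text. The function is hypothetical and executes nothing; the only command it prints is the akka project regions set-primary command shown earlier on this page.

```python
from typing import List, Optional


def switch_plan(target_mode: str, new_primary: Optional[str] = None) -> List[str]:
    """Return the ordered steps for a primary selection change (text only)."""
    if target_mode not in ("static", "dynamic"):
        raise ValueError(f"unsupported target mode: {target_mode}")
    steps = [
        "deploy service descriptor with primarySelectionMode: none",
        "wait until the rollout has completed in all regions "
        "and replicated events reach zero",
    ]
    if new_primary is not None:
        # Only the "change the static primary region" procedure in the
        # text includes this step.
        steps.append(f"akka project regions set-primary {new_primary}")
    steps.append(f"deploy service descriptor with primarySelectionMode: {target_mode}")
    return steps


to_dynamic = switch_plan("dynamic")
move_primary = switch_plan("static", new_primary="gcp-us-east1")
```

Keeping the read-only step first in every plan reflects the reason given in the checklists: no new events should be created while replication catches up.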
Change primary region for disaster recovery
If a region is failing you might want to fail over to another region that is working.
- If the failing region is the primary region, follow the instructions in Setting the primary region of a project and change the primary to a non-failing region.
- If you are using dynamic primary selection, follow Switch from dynamic to static primary selection mode for stateful components. Depending on how responsive the failing region is, it might not be possible to deploy to the failing region, but you should still deploy to the non-failing regions. Otherwise, write requests will still be routed to the failing region for component instances that have their primary there.
- Be aware that events that were written in the failing region and had not been replicated to other regions before the hard failover will be replicated when the regions are connected again. There are no guarantees regarding the order of these "old" events and any new events written by the new primary, which could lead to conflicting states across regions.
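Why event order matters after a hard failover can be shown with a toy example in Python. The event kinds set and add, the integer state, and the event values are invented for this sketch and do not describe how Akka components apply events. The point is only that the same events applied in a different order can produce different states.

```python
from typing import List, Tuple

Event = Tuple[str, int]


def apply_events(events: List[Event]) -> int:
    """Fold events into a state; 'set' overwrites, 'add' increments."""
    state = 0
    for kind, value in events:
        if kind == "set":
            state = value
        elif kind == "add":
            state += value
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return state


replicated_before_failover: List[Event] = [("set", 1)]
stranded_in_failing_region: List[Event] = [("set", 10)]  # the "old" events
written_by_new_primary: List[Event] = [("add", 5)]

# One region sees the old events before the new ones, another sees them after.
state_old_first = apply_events(
    replicated_before_failover + stranded_in_failing_region + written_by_new_primary
)
state_new_first = apply_events(
    replicated_before_failover + written_by_new_primary + stranded_in_failing_region
)
```

Here the two orders end in 15 and 10 respectively, so an application that fails over this way needs a strategy for detecting or tolerating such divergence.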