Journal plugin
The journal plugin enables storing and loading events for event sourced persistent actors.
Tables
The journal plugin requires an event journal table to be created in DynamoDB. The default table name is event_journal and this can be configured (see the reference configuration for all settings). The table should be created with the following attributes and key schema:
Attribute name | Attribute type | Key type |
---|---|---|
pid | S (String) | HASH |
seq_nr | N (Number) | RANGE |
Read capacity units should be based on expected entity recoveries. Write capacity units should be based on expected rates for persisting events.
An example aws CLI command for creating the event journal table:
aws dynamodb create-table \
  --table-name event_journal \
  --attribute-definitions \
      AttributeName=pid,AttributeType=S \
      AttributeName=seq_nr,AttributeType=N \
  --key-schema \
      AttributeName=pid,KeyType=HASH \
      AttributeName=seq_nr,KeyType=RANGE \
  --provisioned-throughput \
      ReadCapacityUnits=5,WriteCapacityUnits=5
Indexes
If queries or projections are being used, then a global secondary index needs to be added to the event journal table, to index events by slice. The default name (derived from the configured table name) for the secondary index is event_journal_slice_idx and may be explicitly set (see the reference configuration). The following attribute definitions should be added to the event journal table, with this key schema for the event journal slice index:
Attribute name | Attribute type | Key type |
---|---|---|
entity_type_slice | S (String) | HASH |
ts | N (Number) | RANGE |
Write capacity units for the index should be aligned with the event journal. Read capacity units should be based on expected queries.
An example aws CLI command for creating the event journal table and slice index:
aws dynamodb create-table \
  --table-name event_journal \
  --attribute-definitions \
      AttributeName=pid,AttributeType=S \
      AttributeName=seq_nr,AttributeType=N \
      AttributeName=entity_type_slice,AttributeType=S \
      AttributeName=ts,AttributeType=N \
  --key-schema \
      AttributeName=pid,KeyType=HASH \
      AttributeName=seq_nr,KeyType=RANGE \
  --provisioned-throughput \
      ReadCapacityUnits=5,WriteCapacityUnits=5 \
  --global-secondary-indexes \
    '[
       {
         "IndexName": "event_journal_slice_idx",
         "KeySchema": [
           {"AttributeName": "entity_type_slice", "KeyType": "HASH"},
           {"AttributeName": "ts", "KeyType": "RANGE"}
         ],
         "Projection": {
           "ProjectionType": "ALL"
         },
         "ProvisionedThroughput": {
           "ReadCapacityUnits": 5,
           "WriteCapacityUnits": 5
         }
       }
     ]'
Creating tables locally
For creating tables with DynamoDB local for testing, see the CreateTables utility.
Configuration
To enable the journal plugin to be used by default, add the following line to your Akka application.conf:
akka.persistence.journal.plugin = "akka.persistence.dynamodb.journal"
It can also be enabled with the journalPluginId for a specific EventSourcedBehavior, and multiple plugin configurations are supported.
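As a minimal sketch of selecting the plugin for a single EventSourcedBehavior (the Counter entity, its commands, events, and state below are hypothetical and only illustrate where the plugin id is set):

import akka.persistence.typed.PersistenceId
import akka.persistence.typed.scaladsl.{ Effect, EventSourcedBehavior }

// Hypothetical entity, for illustration only.
object Counter {
  sealed trait Command
  case object Increment extends Command
  final case class Incremented(delta: Int)
  final case class State(value: Int)

  def apply(id: String): EventSourcedBehavior[Command, Incremented, State] =
    EventSourcedBehavior[Command, Incremented, State](
      persistenceId = PersistenceId.ofUniqueId(id),
      emptyState = State(0),
      commandHandler = (_, _) => Effect.persist(Incremented(1)),
      eventHandler = (state, event) => State(state.value + event.delta))
      // Use the DynamoDB journal for this behavior only, instead of the default journal plugin.
      .withJournalPluginId("akka.persistence.dynamodb.journal")
}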
Reference configuration
The following can be overridden in your application.conf for the journal-specific settings:
akka.persistence.dynamodb {
  journal {
    class = "akka.persistence.dynamodb.journal.DynamoDBJournal"

    # name of the table to use for events
    table = "event_journal"

    # Name of global secondary index to support queries and/or projections.
    # "" is the default and denotes an index named "${table}_slice_idx"
    # (viz. when table (see above) is "event_journal", the GSI will be
    # "event_journal_slice_idx"). If for some reason an alternative GSI name
    # is required, set that GSI name explicitly here; if set explicitly, this
    # name will be used unmodified.
    by-slice-idx = ""

    # Set this to off to disable publishing of events as Akka messages to running
    # eventsBySlices queries.
    # Tradeoff is more CPU and network resources that are used. The events
    # must still be retrieved from the database, but at a lower polling frequency,
    # because delivery of published messages is not guaranteed.
    # When this feature is enabled it will measure the throughput and automatically
    # disable/enable if the throughput exceeds the configured threshold. See
    # publish-events-dynamic configuration.
    publish-events = on

    # When publish-events is enabled it will measure the throughput and automatically
    # disable/enable if the throughput exceeds the configured threshold.
    # This configuration cannot be defined per journal, but is global for the ActorSystem.
    publish-events-dynamic {
      # If the exponentially weighted moving average of measured throughput exceeds this
      # threshold, publishing of events is disabled. It is enabled again when lower than
      # the threshold.
      throughput-threshold = 400

      # The interval of the throughput measurements.
      throughput-collect-interval = 10 seconds
    }

    # Group the slices for an entity type into this number of topics. Most efficient is to use
    # the same number as the number of projection instances. If configured to less than the number
    # of projection instances the overhead is that events will be sent more than once and discarded
    # on the destination side. If configured to more than the number of projection instances
    # the events will only be sent once but there is a risk of exceeding the limits of number
    # of topics that PubSub can handle (e.g. OversizedPayloadException).
    # Must be between 1 and 1024 and a whole number divisor of 1024 (number of slices).
    # This configuration can be changed in a rolling update, but there might be some events
    # that are not delivered via the pub-sub path and instead delivered later by the queries.
    # This configuration cannot be defined per journal, but is global for the ActorSystem.
    publish-events-number-of-topics = 128

    # replay filter not needed for this plugin
    replay-filter.mode = off
  }
}
Event serialization
The events are serialized with Akka Serialization and the binary representation is stored in the event_payload attribute, together with information about which serializer was used in the event_ser_id and event_ser_manifest attributes.
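A common approach, not specific to this plugin, is to bind events to one of Akka's Jackson serializers via a marker trait that all persisted events implement (the trait name below is just an example):

akka.actor {
  serialization-bindings {
    # Hypothetical marker trait implemented by all persisted events
    "com.example.CborSerializable" = jackson-cbor
  }
}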
Retryable errors
When persisting events, any DynamoDB errors that are considered retryable, such as when provisioned throughput capacity is exceeded, will cause events to be rejected rather than marked as a journal failure. A supervision strategy for EventRejectedException failures can then be added to EventSourcedBehaviors, so that entities can be resumed on these retryable errors rather than stopped or restarted.
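As a sketch, assuming the hypothetical Counter behavior from the configuration example above, such a supervision strategy could look like this:

import akka.actor.typed.SupervisorStrategy
import akka.actor.typed.scaladsl.Behaviors
import akka.persistence.typed.EventRejectedException

// Resume the entity when a retryable write error causes an event to be rejected,
// rather than stopping it; the failed command can then be retried by the caller.
val supervised =
  Behaviors
    .supervise(Counter("counter-1"))
    .onFailure[EventRejectedException](SupervisorStrategy.resume)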
Deletes
The journal supports deletes through hard deletes, which means that journal entries are actually deleted from the database. There is no materialized view with a copy of the event, so make sure to not delete events too early if they are used from projections or queries. A projection can also start or continue from a snapshot, and then events can be deleted before the snapshot.
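For example, when events are deleted on snapshot via the standard retention criteria of Akka Persistence Typed (the numbers here are arbitrary), those deletes become hard deletes in this journal unless a time to live is configured for deletes (see below):

import akka.persistence.typed.scaladsl.RetentionCriteria

// Snapshot every 100 events, keep 2 snapshots, and delete the events that are
// covered by completed snapshots. Counter is the hypothetical behavior from above.
val behavior =
  Counter("counter-1").withRetention(
    RetentionCriteria.snapshotEvery(numberOfEvents = 100, keepNSnapshots = 2).withDeleteEventsOnSnapshot)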
For each persistence id, a tombstone record is kept in the event journal when all events for that persistence id have been deleted. The reason for the tombstone record is to keep track of the latest sequence number, so that subsequent events don't reuse sequence numbers that have been deleted.
See the EventSourcedCleanup tool for more information about how to delete events, snapshots, and tombstone records.
Time to Live (TTL)
Rather than deleting items immediately, an expiration timestamp can be set on events or snapshots. DynamoDB’s Time to Live (TTL) feature can then be enabled, to automatically delete items after they have expired.
The TTL attribute to use for the journal or snapshot tables is named expiry.
Time-to-live settings are configured per entity type. The entity type can also be matched by prefix, by using a * at the end of the key.
If events are being deleted on snapshot, the journal can be configured to instead set an expiry time for the deleted events, given a time-to-live duration to use. For example, deleted events can be configured to expire in 7 days, rather than being deleted immediately, for a particular entity type:
akka.persistence.dynamodb.time-to-live {
  event-sourced-entities {
    "some-entity-type" {
      use-time-to-live-for-deletes = 7 days
    }
  }
}
While it is recommended to keep all events in an event sourced system, so that new projections can be re-built, it is also possible to set a time-to-live expiry on events or snapshots when they are first created and stored. For example, events can be configured to expire in 3 days and snapshots in 5 days, for all entity type names that start with a particular prefix:
akka.persistence.dynamodb.time-to-live {
  event-sourced-entities {
    "entity-type-*" {
      event-time-to-live = 3 days
      snapshot-time-to-live = 5 days
    }
  }
}
The EventSourcedCleanup tool can also be used to set an expiration timestamp on events or snapshots.
An expiry marker is kept in the event journal when all events for a persistence id have been marked for expiration, in the same way that a tombstone record is used for hard deletes. This expiry marker keeps track of the latest sequence number so that subsequent events don’t reuse the same sequence numbers for events that have expired.
In case persistence ids will be reused with possibly expired events or snapshots, there is a check-expiry feature, enabled by default, where expired events or snapshots are treated as already deleted when replaying from the journal. This enforces expiration before DynamoDB Time to Live may have actually deleted the data, and protects against partially deleted data.
Time to Live reference configuration
The following can be overridden in your application.conf for the time-to-live specific settings:
akka.persistence.dynamodb {
  # Time to Live (TTL) settings
  time-to-live {
    event-sourced-defaults {
      # Whether to check the expiry of events or snapshots and treat as already deleted when replaying.
      # This enforces expiration before DynamoDB Time to Live may have actually deleted the data.
      check-expiry = on

      # Set a time-to-live duration in place of deletes when events or snapshots are deleted by an entity
      # (such as when events are deleted on snapshot). Set to a duration to expire items after this time
      # following the triggered deletion. Disabled when set to `off` or `none`.
      use-time-to-live-for-deletes = off

      # Set a time-to-live duration on all events when they are originally created and stored.
      # Disabled when set to `off` or `none`.
      event-time-to-live = off

      # Set a time-to-live duration on all snapshots when they are originally created and stored.
      # Disabled when set to `off` or `none`.
      snapshot-time-to-live = off
    }

    # Time-to-live settings per entity type for event sourced entities.
    # See `event-sourced-defaults` for possible settings and default values.
    # Prefix matching is supported by using * at the end of an entity type key.
    event-sourced-entities {
      # Example configuration:
      # "some-entity-type" {
      #   use-time-to-live-for-deletes = 7 days
      # }
      # "entity-type-*" {
      #   event-time-to-live = 3 days
      #   snapshot-time-to-live = 5 days
      # }
    }
  }
}