Consumer

A consumer subscribes to Kafka topics and passes the messages into an Akka Stream.

The underlying implementation uses the KafkaConsumer; see the Kafka API for a description of consumer groups, offsets, and other details.

Choosing a consumer

Alpakka Kafka offers a large variety of consumers that connect to Kafka and stream data. The tables below may help you find the consumer best suited to your use case.

Consumers

These factory methods are part of the Consumer API.

Offsets handling	Partition aware	Subscription	Shared consumer	Factory method	Stream element type
No (auto commit can be enabled)	No	Topic or Partition	No	`plainSource`	`ConsumerRecord`
No (auto commit can be enabled)	No	Partition	Yes	`plainExternalSource`	`ConsumerRecord`
Explicit committing	No	Topic or Partition	No	`committableSource`	`CommittableMessage`
Explicit committing	No	Partition	Yes	`committableExternalSource`	`CommittableMessage`
Explicit committing with metadata	No	Topic or Partition	No	`commitWithMetadataSource`	`CommittableMessage`
Explicit committing (with metadata)	No	Topic or Partition	No	`sourceWithOffsetContext`	`ConsumerRecord`
Offset committed per element	No	Topic or Partition	No	`atMostOnceSource`	`ConsumerRecord`
No (auto commit can be enabled)	Yes	Topic or Partition	No	`plainPartitionedSource`	`(TopicPartition, Source[ConsumerRecord, ..])`
External to Kafka	Yes	Topic or Partition	No	`plainPartitionedManualOffsetSource`	`(TopicPartition, Source[ConsumerRecord, ..])`
Explicit committing	Yes	Topic or Partition	No	`committablePartitionedSource`	`(TopicPartition, Source[CommittableMessage, ..])`
External to Kafka & Explicit Committing	Yes	Topic or Partition	No	`committablePartitionedManualOffsetSource`	`(TopicPartition, Source[CommittableMessage, ..])`
Explicit committing with metadata	Yes	Topic or Partition	No	`commitWithMetadataPartitionedSource`	`(TopicPartition, Source[CommittableMessage, ..])`

Transactional consumers

These factory methods are part of the Transactional API. For details see Transactions.

Offsets handling	Partition aware	Shared consumer	Factory method	Stream element type
Transactional	No	No	`Transactional.source`	`TransactionalMessage`
Transactional	No	No	`Transactional.sourceWithOffsetContext`	`ConsumerRecord`

Settings

When creating a consumer source you need to pass in ConsumerSettings that define things like:

de-serializers for the keys and values
bootstrap servers of the Kafka cluster (see Service discovery to defer the server configuration)
group id for the consumer (note that offsets are always committed for a given consumer group)
Kafka consumer tuning parameters

Alpakka Kafka’s defaults for all settings are defined in reference.conf which is included in the library JAR.

Important consumer settings

Setting	Description
stop-timeout	The stage will delay stopping the internal actor to allow processing of messages already in the stream (required for successful committing). This can be set to 0 for streams using `Consumer.DrainingControl`.
kafka-clients	Section for properties passed unchanged to the Kafka client (see Kafka’s Consumer Configs)
connection-checker	Configuration to let the stream fail if the connection to the Kafka broker fails.
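
As an illustration of the three settings in the table, an `application.conf` could override them as sketched below. The values are example choices rather than recommendations, and `max.poll.records` merely stands in for any Kafka client property you might want to pass through.

```hocon
# application.conf: a sketch with illustrative values
akka.kafka.consumer {
  # Can be 0 for streams that are shut down via DrainingControl
  stop-timeout = 0s

  kafka-clients {
    # Passed unchanged to the Kafka client
    max.poll.records = 100
  }

  connection-checker {
    # Let the stream fail if the connection to the broker fails
    enable = true
  }
}
```

Settings that are not overridden keep the defaults from `reference.conf`.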

reference.conf (HOCON)

# Properties for akka.kafka.ConsumerSettings can be
# defined in this section or a configuration section with
# the same layout.
akka.kafka.consumer {
  # Config path of Akka Discovery method
  # "akka.discovery" to use the Akka Discovery method configured for the ActorSystem
  discovery-method = akka.discovery

  # Set a service name for use with Akka Discovery
  # https://doc.akka.io/docs/alpakka-kafka/current/discovery.html
  service-name = ""

  # Timeout for getting a reply from the discovery-method lookup
  resolve-timeout = 3 seconds

  # Tuning property of scheduled polls.
  # Controls the interval from one scheduled poll to the next.
  poll-interval = 50ms

  # Tuning property of the `KafkaConsumer.poll` parameter.
  # Note that a non-zero value means that the thread that
  # is executing the stage will be blocked. See also the `wakeup-timeout` setting below.
  poll-timeout = 50ms

  # The stage will delay stopping the internal actor to allow processing of
  # messages already in the stream (required for successful committing).
  # This can be set to 0 for streams using `DrainingControl`.
  stop-timeout = 30s

  # Duration to wait for `KafkaConsumer.close` to finish.
  close-timeout = 20s

  # If offset commit requests are not completed within this timeout
  # the returned Future is completed with `CommitTimeoutException`.
  # The `Transactional.source` waits this amount of time for the producer to mark messages as not
  # being in flight anymore as well as waiting for messages to drain, when rebalance is triggered.
  commit-timeout = 15s

  # If commits take longer than this time a warning is logged
  commit-time-warning = 1s

  # Not relevant for Kafka after version 2.1.0.
  # If set to a finite duration, the consumer will re-send the last committed offsets periodically
  # for all assigned partitions. See https://issues.apache.org/jira/browse/KAFKA-4682.
  commit-refresh-interval = infinite

  # Fully qualified config path which holds the dispatcher configuration
  # to be used by the KafkaConsumerActor. Some blocking may occur.
  use-dispatcher = "akka.kafka.default-dispatcher"

  # Properties defined by org.apache.kafka.clients.consumer.ConsumerConfig
  # can be defined in this configuration section.
  kafka-clients {
    # Disable auto-commit by default
    enable.auto.commit = false
  }

  # Time to wait for pending requests when a partition is closed
  wait-close-partition = 500ms

  # Limits the query to Kafka for a topic's position
  position-timeout = 5s

  # When using `AssignmentOffsetsForTimes` subscriptions: timeout for the
  # call to Kafka's API
  offset-for-times-timeout = 5s

  # Timeout for akka.kafka.Metadata requests
  # This value is used instead of Kafka's default from `default.api.timeout.ms`
  # which is 1 minute.
  metadata-request-timeout = 5s

  # Interval for checking that transaction was completed before closing the consumer.
  # Used in the transactional flow for exactly-once-semantics processing.
  eos-draining-check-interval = 30ms

  # Issue warnings when a call to a partition assignment handler method takes
  # longer than this.
  partition-handler-warning = 5s

  # Settings for checking the connection to the Kafka broker. Connection checking uses `listTopics` requests with the timeout
  # configured by `consumer.metadata-request-timeout`
  connection-checker {

    # Flag to turn on connection checker
    enable = false

    # Amount of attempts to be performed after a first connection failure occurs
    # Required, non-negative integer
    max-retries = 3

    # Interval for the connection check. Used as the base for exponential retry.
    check-interval = 15s

    # Check interval multiplier for backoff interval
    # Required, positive number
    backoff-factor = 2.0
  }

  # Protect against server-side bugs that cause Kafka to temporarily "lose" the latest offset for a consumer, which
  # then causes the Kafka consumer to follow its normal 'auto.offset.reset' behavior. For 'earliest', these settings
  # allow the client to detect and attempt to recover from this issue. For 'none' and 'latest', these settings will
  # only add overhead. See
  # for more information
  offset-reset-protection {
    # turns on reset protection
    enable = false
    # if consumer gets a record with an offset that is more than this number of offsets back from the previously
    # requested offset, it is considered a reset
    offset-threshold = 9223372036854775807
    # if the record is more than this duration earlier than the last received record, it is considered a reset
    time-threshold = 100000 days
  }
}
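
To make the `offset-reset-protection` section more concrete, the fragment below sketches how it might be enabled for a consumer that uses `auto.offset.reset = earliest`, the only mode in which the comments above say it helps. Both threshold values are hypothetical and would need to be chosen for the actual workload.

```hocon
# application.conf: a sketch, thresholds are hypothetical
akka.kafka.consumer {
  kafka-clients {
    # Reset protection only helps with 'earliest'
    auto.offset.reset = "earliest"
  }

  offset-reset-protection {
    enable = true
    # A record more than 1000 offsets behind the previously requested offset counts as a reset
    offset-threshold = 1000
    # A record more than 1 hour older than the last received record counts as a reset
    time-threshold = 1 hour
  }
}
```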

The Kafka documentation Consumer Configs lists the settings, their defaults and importance. More detailed explanations are given in the KafkaConsumer API and constants are defined in ConsumerConfig API.

Programmatic construction

Stream-specific settings like the de-serializers and consumer group ID should be set programmatically. Settings that apply to many consumers may be set in application.conf or use config inheritance.
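
Config inheritance can be sketched with a HOCON substitution: a new section starts from `akka.kafka.consumer` and overrides only what differs. The section name `our-kafka-consumer` and the broker address are placeholders chosen for this example.

```hocon
# application.conf: a sketch, names and address are placeholders
our-kafka-consumer: ${akka.kafka.consumer} {
  kafka-clients {
    # Shared by every consumer built from this section
    bootstrap.servers = "kafka-host:9092"
  }
}
```

Such a section would then typically be read from the actor system's configuration and passed to `ConsumerSettings` together with the de-serializers, with the group id set programmatically for each stream.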

The committer sinks and flows are configured through `CommitterSettings`:

Setting	Description	Default Value
maxBatch	maximum number of messages to commit at once	1000
maxInterval	maximum interval between commits	10 seconds
parallelism	maximum number of commit batches in flight	100
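
The three settings in the table above can presumably also be supplied through configuration. The fragment below is a sketch that assumes they live under an `akka.kafka.committer` section with kebab-case keys. That path and those key names are an assumption here and should be checked against the library's `reference.conf`. The values simply repeat the defaults from the table.

```hocon
# application.conf: a sketch, section path and key names are assumed
akka.kafka.committer {
  # Maximum number of messages to commit at once
  max-batch = 1000
  # Maximum interval between commits
  max-interval = 10s
  # Maximum number of commit batches in flight
  parallelism = 100
}
```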

The `Committer` factory methods come in the following variants:

Factory method	Stream element type	Emits
`sink`	`Committable`	N/A
`sinkWithOffsetContext`	Any (`CommittableOffset` in context)	N/A
`flow`	`Committable`	`Done`
`batchFlow`	`Committable`	`CommittableOffsetBatch`
`flowWithOffsetContext`	Any (`CommittableOffset` in context)	`NotUsed` (`CommittableOffsetBatch` in context)
