I/O

Introduction

The akka.io package has been developed in collaboration between the Akka and spray.io teams. Its design combines experiences from the spray-io module with improvements that were jointly developed for more general consumption as an actor-based service.

The guiding design goals for this I/O implementation were to reach extreme scalability, to make no compromises in providing an API that correctly matches the underlying transport mechanism, and to be fully event-driven, non-blocking and asynchronous. The API is meant to be a solid foundation for the implementation of network protocols and building higher abstractions; it is not meant to be a full-service high-level NIO wrapper for end users.

Note

The old I/O implementation has been deprecated and its documentation has been moved: Old IO

Terminology, Concepts

The I/O API is completely actor based, meaning that all operations are implemented with message passing instead of direct method calls. I/O is broken into several drivers, and every driver (TCP, UDP) has a special actor, called a manager, that serves as the entry point for the API. The manager for a particular driver is accessible through the IO entry point. For example, the following code looks up the TCP manager and returns its ActorRef:

import akka.io.{ IO, Tcp }
import context.system // implicitly used by IO(Tcp)

val manager = IO(Tcp)

The manager receives I/O command messages and instantiates worker actors in response. The worker actors present themselves to the API user in the reply to the command that was sent. For example, after a Connect command is sent to the TCP manager, the manager creates an actor representing the TCP connection. All operations related to the given TCP connection can be invoked by sending messages to the connection actor, which announces itself by sending a Connected message.

DeathWatch and Resource Management

I/O worker actors receive commands and also send out events. They usually need a user-side counterpart actor listening for these events (such events could be inbound connections, incoming bytes or acknowledgements for writes). These worker actors watch their listener counterparts. If the listener stops then the worker will automatically release any resources that it holds. This design makes the API more robust against resource leaks.

Thanks to the completely actor based approach of the I/O API the opposite direction works as well: a user actor responsible for handling a connection can watch the connection actor to be notified if it unexpectedly terminates.
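The opposite direction described above can be sketched as follows. This is a minimal illustration, assuming the handler is given the ActorRef of the connection actor; the class name ConnectionWatcher is hypothetical and not part of Akka.

```scala
import akka.actor.{ Actor, ActorRef, Terminated }

// Hypothetical handler: `connection` is the worker actor that announced
// itself with a Connected message.
class ConnectionWatcher(connection: ActorRef) extends Actor {

  // ask to be notified with Terminated when the connection actor stops
  context.watch(connection)

  def receive = {
    case Terminated(`connection`) =>
      // the connection actor is gone, so this handler has nothing left to do
      context.stop(self)
  }
}
```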

Write models (Ack, Nack)

I/O devices have a maximum throughput which limits the frequency and size of writes. When an application tries to push more data than a device can handle, the driver has to buffer bytes until the device is able to write them. With buffering it is possible to handle short bursts of intensive writes --- but no buffer is infinite. "Flow control" is needed to avoid overwhelming device buffers.

Akka supports two types of flow control:

  • Ack-based, where the driver notifies the writer when writes have succeeded.
  • Nack-based, where the driver notifies the writer when writes have failed.

Each of these models is available in both the TCP and the UDP implementations of Akka I/O.

Individual writes can be acknowledged by providing an ack object in the write message (Write in the case of TCP and Send for UDP). When the write is complete the worker will send the ack object to the writing actor. This can be used to implement ack-based flow control; sending new data only when old data has been acknowledged.
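The idea of sending new data only after the previous write has been acknowledged can be modeled without any Akka types. The following is a sketch under stated assumptions: AckedWriter and its methods are hypothetical names, and the send function stands in for sending a write message that carries an ack object to the worker.

```scala
import scala.collection.immutable.Queue

// Hypothetical model of a writer that keeps at most one write in flight.
// `send` stands in for sending a write (with an ack object) to the worker.
final class AckedWriter(send: String => Unit) {
  private var pending = Queue.empty[String] // data waiting for its turn
  private var inFlight = false              // true while an ack is outstanding

  // Called when the application wants to write `data`.
  def write(data: String): Unit =
    if (inFlight) pending = pending.enqueue(data)
    else { inFlight = true; send(data) }

  // Called when the ack object for the in-flight write comes back.
  def acked(): Unit =
    if (pending.isEmpty) inFlight = false
    else {
      val (next, rest) = pending.dequeue
      pending = rest
      send(next)
    }
}
```

In this model the writer buffers data itself, so the amount of unsent data is visible to the application, which can then decide to slow down its producers.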

If a write (or any other command) fails, the driver notifies the actor that sent the command with a special message (CommandFailed in the case of UDP and TCP). This message also notifies the writer of a failed write, serving as a nack for that write. Please note that in a nack-based flow-control setting the writer has to be prepared for the fact that the failed write might not be the most recent write it sent. For example, the failure notification for a write W1 might arrive after additional write commands W2 and W3 have been sent. If the writer wants to resend any nacked messages it may need to keep a buffer of pending messages.
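The buffer of pending messages mentioned above could look roughly like this. It is a plain Scala model, not Akka API: the name NackBuffer, the sequence numbers, and the assumption that writes are also acknowledged (so that the buffer can be trimmed) are all illustrative choices. Whether writes after the failed one need resending depends on the driver; this sketch assumes they do.

```scala
// Hypothetical model: writes are numbered, kept until acknowledged, and
// everything from the failed write onwards is handed back for resending.
final class NackBuffer {
  private var pending = Vector.empty[(Long, String)] // (sequence number, data)
  private var nextSeq = 0L

  // Record a write that is about to be sent; returns its sequence number.
  def sent(data: String): Long = {
    val seq = nextSeq
    nextSeq += 1
    pending = pending :+ ((seq, data))
    seq
  }

  // Write `seq` was acknowledged: it and all older writes can be dropped.
  def acked(seq: Long): Unit =
    pending = pending.dropWhile(_._1 <= seq)

  // Write `seq` was rejected: return it and all later writes, in order.
  def failed(seq: Long): Vector[String] =
    pending.dropWhile(_._1 < seq).map(_._2)
}
```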

Warning

An acknowledged write does not mean acknowledged delivery or storage; receiving an ack for a write simply signals that the I/O driver has successfully processed the write. The Ack/Nack protocol described here is a means of flow control, not error handling. In other words, data may still be lost, even if every write is acknowledged.

ByteString

To maintain isolation, actors should communicate with immutable objects only. ByteString is an immutable container for bytes. It is used by Akka's I/O system as an efficient, immutable alternative to the traditional byte containers used for I/O on the JVM, such as Array[Byte] and ByteBuffer.

ByteString is a rope-like data structure that is immutable and provides fast concatenation and slicing operations, which makes it well suited for I/O. When two ByteStrings are concatenated together they are both stored within the resulting ByteString instead of copying both to a new Array. Operations such as drop and take return ByteStrings that still reference the original Array, but just change the offset and length that is visible. Great care has also been taken to make sure that the internal Array cannot be modified. Whenever a potentially unsafe Array is used to create a new ByteString a defensive copy is created. If you require a ByteString that only occupies as much memory as necessary for its content, use the compact method to get a CompactByteString instance. If the ByteString represented only a slice of the original array, this will result in copying all bytes in that slice.
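A short sketch of the operations mentioned above, assuming akka.util.ByteString is on the classpath; the object name ByteStringDemo and the sample strings are made up for illustration.

```scala
import akka.util.ByteString

object ByteStringDemo {
  val hello = ByteString("Hello, ")
  val world = ByteString("world")

  // concatenation keeps both fragments instead of copying them into a new array
  val greeting = hello ++ world

  // drop and take only change which part of the underlying bytes is visible
  val slice = greeting.drop(7).take(5)

  // compact yields a CompactByteString holding just the visible bytes
  val compacted = slice.compact
}
```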

ByteString inherits all methods from IndexedSeq, and it also has some new ones. For more information, look up the akka.util.ByteString class and its companion object in the ScalaDoc.

ByteString also comes with its own optimized builder and iterator classes ByteStringBuilder and ByteIterator which provide extra features in addition to those of normal builders and iterators.

Compatibility with java.io

A ByteStringBuilder can be wrapped in a java.io.OutputStream via the asOutputStream method. Likewise, ByteIterator can be wrapped in a java.io.InputStream via asInputStream. Using these, akka.io applications can integrate legacy code based on java.io streams.
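As a rough illustration of this integration, the sketch below writes through a java.io.DataOutputStream into a ByteStringBuilder and reads the result back through a java.io.DataInputStream. The object name StreamInterop and the (id, name) record are assumptions chosen for the example.

```scala
import java.io.{ DataInputStream, DataOutputStream }
import akka.util.ByteString

object StreamInterop {
  // write through a java.io stream into a ByteStringBuilder
  def encode(id: Int, name: String): ByteString = {
    val builder = ByteString.newBuilder
    val out = new DataOutputStream(builder.asOutputStream)
    out.writeInt(id)
    out.writeUTF(name)
    out.flush()
    builder.result()
  }

  // read back through a java.io stream over a ByteIterator
  def decode(bytes: ByteString): (Int, String) = {
    val in = new DataInputStream(bytes.iterator.asInputStream)
    val id = in.readInt()
    val name = in.readUTF()
    (id, name)
  }
}
```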

Encoding and decoding binary data

Note

Previously Akka offered a specialized Iteratee implementation in the akka.actor.IO object which is now deprecated in favor of the pipeline mechanism described here. The documentation for Iteratees can be found here.

Akka adopted and adapted the implementation of data processing pipelines found in the spray-io module. The idea is that encoding and decoding often go hand in hand and keeping the code pertaining to one protocol layer together is deemed more important than writing down the complete read side—say—in the iteratee style in one go; pipelines encourage packaging the stages in a form which lends itself better to reuse in a protocol stack. Another reason for choosing this abstraction is that it is at times necessary to change the behavior of encoding and decoding within a stage based on a message stream’s state, and pipeline stages allow communication between the read and write halves quite naturally.

The actual byte-fiddling can be done within pipeline stages, for example using the rich API of ByteIterator and ByteStringBuilder as shown below. All these activities are synchronous transformations which benefit greatly from CPU affinity to make good use of those data caches. Therefore the design of the pipeline infrastructure is completely synchronous: every stage’s handler code can only directly return the events and/or commands resulting from an input; there are no callbacks. Exceptions thrown within a pipeline stage will abort processing of the whole pipeline under the assumption that recoverable error conditions will be signaled in-band to the next stage instead of raising an exception.

An overall “logical” pipeline can span multiple execution contexts, for example starting with the low-level protocol layers directly within an actor handling the reads and writes to a TCP connection and then being passed to a number of higher-level actors which do the costly application level processing. This is supported by feeding the generated events into a sink which sends them to another actor, and that other actor will then upon reception feed them into its own pipeline.

Introducing the Sample Protocol

In the following the process of implementing a protocol stack using pipelines is demonstrated on the following simple example:

frameLen: Int
persons: Int
persons times {
  first: String
  last: String
}
points: Int
points times Double

mapping to the following data type:

case class Person(first: String, last: String)
case class HappinessCurve(points: IndexedSeq[Double])
case class Message(persons: Seq[Person], stats: HappinessCurve)

We will split the handling of this protocol into two parts: the frame-length encoding handles the buffering necessary on the read side and the actual encoding of the frame contents is done in a separate stage.
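To make the wire format concrete, here is a sketch that encodes one frame with plain java.nio, independent of the pipeline API. It assumes big-endian byte order, strings encoded as a four-byte length followed by UTF-8 bytes, and a frameLen that counts the whole frame including its own four bytes. The object name WireFormat and the fixed buffer size are illustrative only.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object WireFormat {
  // length-prefixed UTF-8 string, as assumed for `first` and `last`
  private def putString(buf: ByteBuffer, s: String): Unit = {
    val bytes = s.getBytes(UTF_8)
    buf.putInt(bytes.length)
    buf.put(bytes)
  }

  // Encode one frame; frameLen counts the whole frame including itself.
  def encode(persons: Seq[(String, String)], points: Seq[Double]): Array[Byte] = {
    val buf = ByteBuffer.allocate(1024) // big-endian by default; ample for a demo
    buf.putInt(0)                       // placeholder for frameLen
    buf.putInt(persons.size)
    persons.foreach { case (first, last) =>
      putString(buf, first)
      putString(buf, last)
    }
    buf.putInt(points.size)
    points.foreach(d => buf.putDouble(d))
    buf.putInt(0, buf.position())       // patch frameLen now that the size is known
    java.util.Arrays.copyOf(buf.array(), buf.position())
  }
}
```

Under these assumptions, two persons (Alice Gibbons, Bob Sparsely) and three points yield a 75-byte frame: 4 bytes for frameLen, 43 for the persons block and 28 for the points block.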

Building a Pipeline Stage

As a common example, which is also included in the akka-actor package, let us look at a framing protocol which works by prepending a length field to each message.

/**
 * Pipeline stage for length-field encoded framing. It will prepend a
 * four-byte length header to the message; the header contains the length of
 * the resulting frame including header in big-endian representation.
 *
 * The `maxSize` argument is used to protect the communication channel sanity:
 * larger frames will not be sent (silently dropped) or received (in which case
 * stream decoding would be broken, hence throwing an IllegalArgumentException).
 */
class LengthFieldFrame(maxSize: Int,
                       byteOrder: ByteOrder = ByteOrder.BIG_ENDIAN,
                       headerSize: Int = 4,
                       lengthIncludesHeader: Boolean = true)
  extends SymmetricPipelineStage[PipelineContext, ByteString, ByteString] {

  // range checks omitted ...

  override def apply(ctx: PipelineContext) =
    new SymmetricPipePair[ByteString, ByteString] {
      var buffer = None: Option[ByteString]
      implicit val byteOrder = LengthFieldFrame.this.byteOrder

      /**
       * Extract as many complete frames as possible from the given ByteString
       * and return the remainder together with the extracted frames in reverse
       * order.
       */
      @tailrec
      def extractFrames(bs: ByteString, acc: List[ByteString]) //
      : (Option[ByteString], Seq[ByteString]) = {
        if (bs.isEmpty) {
          (None, acc)
        } else if (bs.length < headerSize) {
          (Some(bs.compact), acc)
        } else {
          val length = bs.iterator.getLongPart(headerSize).toInt
          if (length < 0 || length > maxSize)
            throw new IllegalArgumentException(
              s"received too large frame of size $length (max = $maxSize)")
          val total = if (lengthIncludesHeader) length else length + headerSize
          if (bs.length >= total) {
            extractFrames(bs drop total, bs.slice(headerSize, total) :: acc)
          } else {
            (Some(bs.compact), acc)
          }
        }
      }

      /*
       * This is how commands (writes) are transformed: calculate length
       * including header, write that to a ByteStringBuilder and append the
       * payload data. The result is a single command (i.e. `Right(...)`).
       */
      override def commandPipeline =
        { bs: ByteString =>
          val length =
            if (lengthIncludesHeader) bs.length + headerSize else bs.length
          if (length > maxSize) Seq()
          else {
            val bb = ByteString.newBuilder
            bb.putLongPart(length, headerSize)
            bb ++= bs
            ctx.singleCommand(bb.result)
          }
        }

      /*
       * This is how events (reads) are transformed: append the received
       * ByteString to the buffer (if any) and extract the frames from the
       * result. In the end store the new buffer contents and return the
       * list of events (i.e. `Left(...)`).
       */
      override def eventPipeline =
        { bs: ByteString =>
          val data = if (buffer.isEmpty) bs else buffer.get ++ bs
          val (nb, frames) = extractFrames(data, Nil)
          buffer = nb
          /*
           * please note the specialized (optimized) facility for emitting
           * just a single event
           */
          frames match {
            case Nil        => Nil
            case one :: Nil => ctx.singleEvent(one)
            case many       => many reverseMap (Left(_))
          }
        }
    }
}

In the end a pipeline stage is nothing more than a set of three functions: one transforming commands arriving from above, one transforming events arriving from below and the third transforming incoming management commands (not shown here, see below for more information). The result of the transformation can in either case be a sequence of commands flowing downwards or events flowing upwards (or a combination thereof).

In the case above the data type for commands and events are equal as both functions operate only on ByteString, and the transformation does not change that type because it only adds or removes four octets at the front.

The pair of command and event transformation functions is represented by an object of type PipePair, or in this case a SymmetricPipePair. This object could benefit from knowledge about the context it is running in, for example an Actor, and this context is introduced by making a PipelineStage be a factory for producing a PipePair. The factory method is called apply (in good Scala tradition) and receives the context object as its argument. The implementation of this factory method could now make use of the context in whatever way it sees fit, you will see an example further down.

Manipulating ByteStrings

The second stage of our sample protocol stack illustrates in more depth what the pipeline stage built above showed only briefly: constructing and deconstructing byte strings. Let us first take a look at the encoder:

/**
 * This trait is used to formulate a requirement for the pipeline context.
 * In this example it is used to configure the byte order to be used.
 */
trait HasByteOrder extends PipelineContext {
  def byteOrder: java.nio.ByteOrder
}

class MessageStage extends SymmetricPipelineStage[HasByteOrder, Message, ByteString] {

  override def apply(ctx: HasByteOrder) = new SymmetricPipePair[Message, ByteString] {

    implicit val byteOrder = ctx.byteOrder

    /**
     * Append a length-prefixed UTF-8 encoded string to the ByteStringBuilder.
     */
    def putString(builder: ByteStringBuilder, str: String): Unit = {
      val bs = ByteString(str, "UTF-8")
      builder putInt bs.length
      builder ++= bs
    }

    override val commandPipeline = { msg: Message =>
      val bs = ByteString.newBuilder

      // first store the persons
      bs putInt msg.persons.size
      msg.persons foreach { p =>
        putString(bs, p.first)
        putString(bs, p.last)
      }

      // then store the doubles
      bs putInt msg.stats.points.length
      bs putDoubles (msg.stats.points.toArray)

      // and return the result as a command
      ctx.singleCommand(bs.result)
    }

    // decoding omitted ...
  }
}

Note how the byte order to be used by this stage is fixed in exactly one place, making it impossible to get wrong between commands and events; the way the byte order is passed into the stage demonstrates one possible use for the stage’s context parameter.

The basic tool for constructing a ByteString is a ByteStringBuilder which can be obtained by calling ByteString.newBuilder since byte strings implement the IndexedSeq[Byte] interface of the standard Scala collections. This builder knows a few extra tricks, though, for appending byte representations of the primitive data types like Int and Double or arrays thereof. Encoding a String requires a bit more work because not only the sequence of bytes needs to be encoded but also the length, otherwise the decoding stage would not know where the String terminates. When all values making up the Message have been appended to the builder, we simply pass the resulting ByteString on to the next stage as a command using the optimized singleCommand facility.

Warning

The singleCommand and singleEvent methods provide a way to generate responses which transfer exactly one result from one pipeline stage to the next without suffering the overhead of object allocations. This means that the returned collection object will not work for anything else (you will get ClassCastExceptions!) and this facility can only be used EXACTLY ONCE during the processing of one input (command or event).

Now let us look at the decoder side:

def getString(iter: ByteIterator): String = {
  val length = iter.getInt
  val bytes = new Array[Byte](length)
  iter getBytes bytes
  ByteString(bytes).utf8String
}

override val eventPipeline = { bs: ByteString =>
  val iter = bs.iterator

  val personLength = iter.getInt
  val persons =
    (1 to personLength) map (_ => Person(getString(iter), getString(iter)))

  val curveLength = iter.getInt
  val curve = new Array[Double](curveLength)
  iter getDoubles curve

  // verify that this was all; could be left out to allow future extensions
  assert(iter.isEmpty)

  ctx.singleEvent(Message(persons, HappinessCurve(curve)))
}

The decoding side does the same things that the encoder does in the same order, it just uses a ByteIterator to retrieve primitive data types or arrays of those from the underlying ByteString. And in the end it hands the assembled Message as an event to the next stage using the optimized singleEvent facility (see warning above).
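The symmetry between encoder and decoder can be checked outside of a pipeline. The following sketch round-trips one length-prefixed string and an array of doubles using only ByteStringBuilder and ByteIterator; the object name Codec and the simplified record (a single name plus points) are assumptions made to keep the example short.

```scala
import java.nio.ByteOrder
import akka.util.{ ByteIterator, ByteString, ByteStringBuilder }

object Codec {
  // fixed in one place so that both directions agree
  implicit val byteOrder: ByteOrder = ByteOrder.BIG_ENDIAN

  def putString(builder: ByteStringBuilder, str: String): Unit = {
    val bs = ByteString(str, "UTF-8")
    builder.putInt(bs.length)
    builder ++= bs
  }

  def getString(iter: ByteIterator): String = {
    val bytes = new Array[Byte](iter.getInt)
    iter.getBytes(bytes)
    ByteString(bytes).utf8String
  }

  def encode(name: String, points: Array[Double]): ByteString = {
    val builder = ByteString.newBuilder
    putString(builder, name)
    builder.putInt(points.length)
    builder.putDoubles(points)
    builder.result()
  }

  // reads the fields in exactly the order in which encode wrote them
  def decode(bs: ByteString): (String, Vector[Double]) = {
    val iter = bs.iterator
    val name = getString(iter)
    val points = new Array[Double](iter.getInt)
    iter.getDoubles(points)
    (name, points.toVector)
  }
}
```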

Building a Pipeline

Given the two pipeline stages introduced in the sections above we can now put them to some use. First we define some message to be encoded:

val msg =
  Message(
    Seq(
      Person("Alice", "Gibbons"),
      Person("Bob", "Sparsely")),
    HappinessCurve(Array(1.0, 3.0, 5.0)))

Then we need to create a pipeline context which satisfies our declared needs:

val ctx = new HasByteOrder {
  def byteOrder = java.nio.ByteOrder.BIG_ENDIAN
}

Building the pipeline and encoding this message then is quite simple:

val stages =
  new MessageStage >>
    new LengthFieldFrame(10000)

// using the extractor for the returned case class here
val PipelinePorts(cmd, evt, mgmt) =
  PipelineFactory.buildFunctionTriple(ctx, stages)

val encoded: (Iterable[Message], Iterable[ByteString]) = cmd(msg)

The tuple returned from buildFunctionTriple contains one function for injecting commands, one for events and a third for injecting management commands (see below). In this case we demonstrate how a single message msg is encoded by passing it into the cmd function. The return value is a pair of sequences, one for the resulting events and the other for the resulting commands. For the sample pipeline this will contain exactly one command—one ByteString. Decoding works in the same way, only with the evt function (which can again also result in commands being generated, although that is not demonstrated in this sample).

Besides the more functional style there is also an explicitly side-effecting one:

val stages =
  new MessageStage >>
    new LengthFieldFrame(10000)

val injector = PipelineFactory.buildWithSinkFunctions(ctx, stages)(
  commandHandler ! _, // will receive messages of type Try[ByteString]
  eventHandler ! _ // will receive messages of type Try[Message]
  )

injector.injectCommand(msg)

The functions passed into the buildWithSinkFunctions factory method describe what shall happen to the commands and events as they fall out of the pipeline. In this case we just send those to some actors, since that is usually quite a good strategy for distributing the work represented by the messages.

The types of commands or events fed into the provided sink functions are wrapped within Try so that failures can also be encoded and acted upon. This means that injecting into a pipeline using a PipelineInjector will catch exceptions resulting from processing the input, in which case the exception (there can only be one per injection) is passed into the respective sink.
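A sink function that acts on both outcomes might be sketched like this. It uses only scala.util.Try; the object name Sinks and the deliver and report callbacks are hypothetical stand-ins for whatever the application does with results and failures (for example sending them to actors).

```scala
import scala.util.{ Failure, Success, Try }

object Sinks {
  // Hypothetical sink: forward successful results, report failures separately.
  def sink[T](deliver: T => Unit, report: Throwable => Unit): Try[T] => Unit = {
    case Success(value) => deliver(value)
    case Failure(cause) => report(cause)
  }
}
```

Compared with calling get on the Try, which rethrows the exception in the caller, this keeps the decision about how to react to a failed injection in one place.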

Using the Pipeline’s Context

Up to this point there was always a parameter ctx which was used when constructing a pipeline, but it was not explained in full. The context is a piece of information which is made available to all stages of a pipeline. The context may also carry behavior, provide infrastructure or helper methods etc. It should be noted that the context is bound to the pipeline and as such must not be accessed concurrently from different threads unless care is taken to properly synchronize such access. Since the context will in many cases be provided by an actor it is not recommended to share this context with code executing outside of the actor’s message handling.

Warning

A PipelineContext instance MUST NOT be used by two different pipelines since it contains mutable fields which are used during message processing.

Using Management Commands

Since pipeline stages do not have any reference to the pipeline or even to their neighbors they cannot directly effect the injection of commands or events outside of their normal processing. But sometimes things need to happen driven by a timer, for example. In this case the timer would need to cause sending tick messages to the whole pipeline, and those stages which wanted to receive them would act upon those. In order to keep the type signatures for events and commands useful, such external triggers are sent out-of-band, via a different channel—the management port. One example which makes use of this facility is the TickGenerator which comes included with akka-actor:

/**
 * This trait expresses that the pipeline’s context needs to live within an
 * actor and provide its ActorContext.
 */
trait HasActorContext extends PipelineContext {
  /**
   * Retrieve the [[ActorContext]] for this pipeline’s context.
   */
  def getContext: ActorContext
}

object TickGenerator {
  /**
   * This message type is used by the TickGenerator to trigger
   * the rescheduling of the next Tick. The actor hosting the pipeline
   * which includes a TickGenerator must arrange for messages of this
   * type to be injected into the management port of the pipeline.
   */
  trait Trigger

  /**
   * This message type is emitted by the TickGenerator to the whole
   * pipeline, informing all stages about the time at which this Tick
   * was emitted (relative to some arbitrary epoch).
   */
  case class Tick(@BeanProperty timestamp: FiniteDuration) extends Trigger
}

/**
 * This pipeline stage does not alter the events or commands
 */
class TickGenerator[Cmd <: AnyRef, Evt <: AnyRef](interval: FiniteDuration)
  extends PipelineStage[HasActorContext, Cmd, Cmd, Evt, Evt] {
  import TickGenerator._

  override def apply(ctx: HasActorContext) =
    new PipePair[Cmd, Cmd, Evt, Evt] {

      // use unique object to avoid double-activation on actor restart
      private val trigger: Trigger =
        new Trigger {
          override def toString = s"Tick[${ctx.getContext.self.path}]"
        }

      private def schedule() =
        ctx.getContext.system.scheduler.scheduleOnce(
          interval, ctx.getContext.self, trigger)(ctx.getContext.dispatcher)

      // automatically activate this generator
      schedule()

      override val commandPipeline = (cmd: Cmd) => ctx.singleCommand(cmd)

      override val eventPipeline = (evt: Evt) => ctx.singleEvent(evt)

      override val managementPort: Mgmt = {
        case `trigger` =>
          ctx.getContext.self ! Tick(Deadline.now.time)
          schedule()
          Nil
      }
    }
}

This pipeline stage is to be used within an actor, and it will make use of this context in order to schedule the delivery of TickGenerator.Trigger messages; the actor is then supposed to feed these messages into the management port of the pipeline. An example could look like this:

class Processor(cmds: ActorRef, evts: ActorRef) extends Actor {

  val ctx = new HasActorContext with HasByteOrder {
    def getContext = Processor.this.context
    def byteOrder = java.nio.ByteOrder.BIG_ENDIAN
  }

  val pipeline = PipelineFactory.buildWithSinkFunctions(ctx,
    new TickGenerator(1000.millis) >>
      new MessageStage >>
      new LengthFieldFrame(10000) //
      )(
      // failure in the pipeline will fail this actor
      cmd => cmds ! cmd.get,
      evt => evts ! evt.get)

  def receive = {
    case m: Message               => pipeline.injectCommand(m)
    case b: ByteString            => pipeline.injectEvent(b)
    case t: TickGenerator.Trigger => pipeline.managementCommand(t)
  }
}

This actor extends our well-known pipeline with the tick generator and attaches the outputs to functions which send commands and events to actors for further processing. The pipeline stages will then all receive one Tick per second which can be used like so:

var lastTick = Duration.Zero

override val managementPort: Mgmt = {
  case TickGenerator.Tick(timestamp) =>
    // omitted ...
    println(s"time since last tick: ${timestamp - lastTick}")
    lastTick = timestamp
    Nil
}

Note

Management commands are delivered to all stages of a pipeline “effectively parallel”, like on a broadcast medium. No code will actually run concurrently since a pipeline is strictly single-threaded, but the order in which these commands are processed is not specified.

The intended purpose of management commands is for each stage to define its special command types and then listen only to those (where the aforementioned Tick message is a useful counter-example), exactly like sending packets on a wifi network where every station receives all traffic but reacts only to those messages which are destined for it.

If you need all stages to react upon something in their defined order, then this must be modeled either as a command or event, i.e. it will be part of the “business” type of the pipeline.

Using TCP

The code snippets throughout this section assume the following imports:

import akka.actor.{ Actor, ActorRef, Props }
import akka.io.{ IO, Tcp }
import akka.util.ByteString
import java.net.InetSocketAddress

All of the Akka I/O APIs are accessed through manager objects. When using an I/O API, the first step is to acquire a reference to the appropriate manager. The code below shows how to acquire a reference to the Tcp manager.

import akka.io.{ IO, Tcp }
import context.system // implicitly used by IO(Tcp)

val manager = IO(Tcp)

The manager is an actor that handles the underlying low level I/O resources (selectors, channels) and instantiates workers for specific tasks, such as listening to incoming connections.

Connecting

object Client {
  def apply(remote: InetSocketAddress, replies: ActorRef) =
    Props(classOf[Client], remote, replies)
}

class Client(remote: InetSocketAddress, listener: ActorRef) extends Actor {

  import Tcp._
  import context.system

  IO(Tcp) ! Connect(remote)

  def receive = {
    case CommandFailed(_: Connect) =>
      listener ! "failed"
      context stop self

    case c @ Connected(remote, local) =>
      listener ! c
      val connection = sender
      connection ! Register(self)
      context become {
        case data: ByteString        => connection ! Write(data)
        case CommandFailed(w: Write) => // O/S buffer was full
        case Received(data)          => listener ! data
        case "close"                 => connection ! Close
        case _: ConnectionClosed     => context stop self
      }
  }
}

The first step of connecting to a remote address is sending a Connect message to the TCP manager; in addition to the simplest form shown above there is also the possibility to specify a local InetSocketAddress to bind to and a list of socket options to apply.

Note

The SO_NODELAY (TCP_NODELAY on Windows) socket option defaults to true in Akka, independently of the OS default settings. This setting disables Nagle's algorithm, considerably improving latency for most applications. This setting could be overridden by passing SO.TcpNoDelay(false) in the list of socket options of the Connect message.
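Overriding the option could look like the sketch below, which only constructs the Connect message. The object name ConnectOptions and the address localhost:8080 are placeholders for illustration.

```scala
import java.net.InetSocketAddress
import akka.io.Tcp

object ConnectOptions {
  // placeholder address for illustration only
  val remote = new InetSocketAddress("localhost", 8080)

  // pass TcpNoDelay(false) to re-enable Nagle's algorithm for this connection
  val connect = Tcp.Connect(remote, options = List(Tcp.SO.TcpNoDelay(false)))
}
```

The resulting message would then be sent to the TCP manager in place of the plain Connect(remote) used in the Client example.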

The TCP manager will then reply either with a CommandFailed or it will spawn an internal actor representing the new connection. This new actor will then send a Connected message to the original sender of the Connect message.

In order to activate the new connection a Register message must be sent to the connection actor, telling it which actor shall receive data from the socket. Before this step is done the connection cannot be used, and there is an internal timeout after which the connection actor will shut itself down if no Register message is received.

The connection actor watches the registered handler and closes the connection when that one terminates, thereby cleaning up all internal resources associated with that connection.

The actor in the example above uses become to switch from unconnected to connected operation, demonstrating the commands and events which are observed in that state. For a discussion on CommandFailed see Throttling Reads and Writes below. ConnectionClosed is a trait, which marks the different connection close events. The last line handles all connection close events in the same way. It is possible to listen for more fine-grained connection close events, see Closing Connections below.

Accepting connections

object Server {
  def apply(manager: ActorRef) = Props(classOf[Server], manager)
}

class Server(manager: ActorRef) extends Actor {

  import Tcp._
  import context.system

  IO(Tcp) ! Bind(self, new InetSocketAddress("localhost", 0))

  def receive = {
    case b @ Bound(localAddress) => manager ! b

    case CommandFailed(_: Bind)  => context stop self

    case c @ Connected(remote, local) =>
      manager ! c
      val handler = context.actorOf(Props[SimplisticHandler])
      val connection = sender
      connection ! Register(handler)
  }

}

To create a TCP server and listen for inbound connections, a Bind command has to be sent to the TCP manager. This will instruct the TCP manager to listen for TCP connections on a particular InetSocketAddress; the port may be specified as 0 in order to bind to a random port.

The actor sending the Bind message will receive a Bound message signalling that the server is ready to accept incoming connections; this message also contains the InetSocketAddress to which the socket was actually bound (i.e. resolved IP address and correct port number).

From this point forward the process of handling connections is the same as for outgoing connections. The example demonstrates that handling the reads from a certain connection can be delegated to another actor by naming it as the handler when sending the Register message. Writes can be sent from any actor in the system to the connection actor (i.e. the actor which sent the Connected message). The simplistic handler is defined as:

class SimplisticHandler extends Actor {
  import Tcp._
  def receive = {
    case Received(data) => sender ! Write(data)
    case PeerClosed     => context stop self
  }
}

For a more complete sample which also takes into account the possibility of failures when sending please see Throttling Reads and Writes below.

The only difference to outgoing connections is that the internal actor managing the listen port—the sender of the Bound message—watches the actor which was named as the recipient for Connected messages in the Bind message. When that actor terminates the listen port will be closed and all resources associated with it will be released; existing connections will not be terminated at this point.

Closing connections

A connection can be closed by sending one of the commands Close, ConfirmedClose or Abort to the connection actor.

Close will close the connection by sending a FIN message, but without waiting for confirmation from the remote endpoint. Pending writes will be flushed. If the close is successful, the listener will be notified with Closed.

ConfirmedClose will close the sending direction of the connection by sending a FIN message, but data will continue to be received until the remote endpoint closes the connection, too. Pending writes will be flushed. If the close is successful, the listener will be notified with ConfirmedClosed.

Abort will immediately terminate the connection by sending a RST message to the remote endpoint. Pending writes will not be flushed. If the close is successful, the listener will be notified with Aborted.

PeerClosed will be sent to the listener if the connection has been closed by the remote endpoint. By default, the connection will then automatically be closed from this endpoint as well. To support half-closed connections, set the keepOpenOnPeerClosed member of the Register message to true, in which case the connection stays open until it receives one of the above close commands.

ErrorClosed will be sent to the listener whenever an error happened that forced the connection to be closed.

All close notifications are sub-types of ConnectionClosed so listeners who do not need fine-grained close events may handle all close events in the same way.

Throttling Reads and Writes

The basic model of the TCP connection actor is that it has no internal buffering (i.e. it can only process one write at a time, meaning it can buffer one write until it has been passed on to the O/S kernel in full). Congestion needs to be handled at the user level, for which there are three modes of operation:

  • ACK-based: every Write command carries an arbitrary object, and if this object is not Tcp.NoAck then it will be returned to the sender of the Write upon successfully writing all contained data to the socket. If no other write is initiated before having received this acknowledgement then no failures can happen due to buffer overrun.
  • NACK-based: every write which arrives while a previous write is not yet completed will be replied to with a CommandFailed message containing the failed write. Just relying on this mechanism requires the implemented protocol to tolerate skipping writes (e.g. if each write is a valid message on its own and it is not required that all are delivered). This mode is enabled by setting the useResumeWriting flag to false within the Register message during connection activation.
  • NACK-based with write suspending: this mode is very similar to the NACK-based one, but once a single write has failed no further writes will succeed until a ResumeWriting message is received. This message will be answered with a WritingResumed message once the last accepted write has completed. If the actor driving the connection implements buffering and resends the NACK’ed messages after having awaited the WritingResumed signal then every message is delivered exactly once to the network socket.

These models (with the exception of the second, which is rather specialised) are demonstrated in complete examples below. The full and contiguous source is available on GitHub.

Note

It should be obvious that all these flow control schemes only work between one writer and one connection actor; as soon as multiple actors send write commands to a single connection no consistent result can be achieved.
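
The one-write-at-a-time rule behind these modes can be illustrated with a toy model that does not use Akka at all. The class OneSlotConnection and its reply types are hypothetical names invented for this sketch; they only mimic the behavior described above (one pending write, a NACK for any write that arrives while one is pending, and optional write suspending) and are not the akka.io implementation.

```scala
// Toy model, NOT the akka.io API: a connection that holds at most one pending write.
sealed trait Reply
final case class Acked(token: Any) extends Reply  // models the returned ACK object
final case class Nacked(token: Any) extends Reply // models CommandFailed(write)
case object WritingResumed extends Reply

final class OneSlotConnection(useResumeWriting: Boolean) {
  private var pending: Option[Any] = None // token of the write in flight
  private var suspended = false           // true after a NACK in suspending mode
  private var resumeRequested = false     // a resume request awaits the pending write

  /** Offer a write; it is rejected while another write is pending or writing is suspended. */
  def write(token: Any): List[Reply] =
    if (suspended || pending.nonEmpty) {
      if (useResumeWriting) suspended = true
      List(Nacked(token))
    } else {
      pending = Some(token)
      Nil
    }

  /** Models ResumeWriting: answered once the last accepted write has completed. */
  def resumeWriting(): List[Reply] =
    if (pending.isEmpty) { suspended = false; List(WritingResumed) }
    else { resumeRequested = true; Nil }

  /** Models the kernel having taken the pending write in full. */
  def flush(): List[Reply] = pending match {
    case None => Nil
    case Some(token) =>
      pending = None
      val resumed =
        if (resumeRequested) { resumeRequested = false; suspended = false; List(WritingResumed) }
        else Nil
      Acked(token) :: resumed
  }
}
```

In the ACK-based style a writer would call write only after seeing Acked for the previous token, so Nacked never occurs. In the suspending style the writer keeps the rejected tokens and offers them again after WritingResumed.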

ACK-Based Back-Pressure

For the following example to function properly, it is important to configure the connection to remain half-open when the remote side closes its writing end: this allows the example EchoHandler to write all outstanding data back to the client before fully closing the connection. This is enabled using a flag upon connection activation (observe the Register message):

case Connected(remote, local) ⇒
  log.info("received connection from {}", remote)
  val handler = context.actorOf(Props(handlerClass, sender, remote))
  sender ! Register(handler, keepOpenOnPeerClosed = true)

With this preparation let us dive into the handler itself:

class SimpleEchoHandler(connection: ActorRef, remote: InetSocketAddress)
  extends Actor with ActorLogging {

  import Tcp._

  // sign death pact: this actor terminates when connection breaks
  context watch connection

  case object Ack

  def receive = {
    case Received(data) ⇒
      buffer(data)
      connection ! Write(data, Ack)

      context.become({
        case Received(data) ⇒ buffer(data)
        case Ack            ⇒ acknowledge()
        case PeerClosed     ⇒ closing = true
      }, discardOld = false)

    case PeerClosed ⇒ context stop self
  }

  // storage omitted ...
}

The principle is simple: when having written a chunk always wait for the Ack to come back before sending the next chunk. While waiting we switch behavior such that new incoming data are buffered. The helper functions used are a bit lengthy but not complicated:

private def buffer(data: ByteString): Unit = {
  storage :+= data
  stored += data.size

  if (stored > maxStored) {
    log.warning(s"drop connection to [$remote] (buffer overrun)")
    context stop self

  } else if (stored > highWatermark) {
    log.debug(s"suspending reading")
    connection ! SuspendReading
    suspended = true
  }
}

private def acknowledge(): Unit = {
  require(storage.nonEmpty, "storage was empty")

  val size = storage(0).size
  stored -= size
  transferred += size

  storage = storage drop 1

  if (suspended && stored < lowWatermark) {
    log.debug("resuming reading")
    connection ! ResumeReading
    suspended = false
  }

  if (storage.isEmpty) {
    if (closing) context stop self
    else context.unbecome()
  } else connection ! Write(storage(0), Ack)
}

The most interesting part is probably the last: an Ack removes the oldest data chunk from the buffer, and if that was the last chunk then we either close the connection (if the peer closed its half already) or return to the idle behavior; otherwise we just send the next buffered chunk and stay waiting for the next Ack.
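
Since the storage declarations are omitted from the listing, the watermark bookkeeping can be easier to follow in isolation. The following Akka-free sketch mirrors the buffer and acknowledge helpers but returns a signal instead of sending a message to the connection actor. The class AckBuffer, the Signal type and all limit values used with it are assumptions made for illustration, not the values or types used by the actual sample.

```scala
// Hypothetical, Akka-free model of the buffer bookkeeping shown above.
sealed trait Signal
case object SuspendReading extends Signal // the real handler sends this to the connection
case object ResumeReading extends Signal  // likewise
case object Overrun extends Signal        // the real handler stops itself here

final class AckBuffer(maxStored: Long, highWatermark: Long, lowWatermark: Long) {
  private var storage = Vector.empty[Array[Byte]]
  private var stored = 0L
  private var suspended = false

  /** Append a chunk; report whether reading should be suspended or the buffer overran. */
  def buffer(data: Array[Byte]): Option[Signal] = {
    storage :+= data
    stored += data.length
    if (stored > maxStored) Some(Overrun)
    else if (stored > highWatermark) { suspended = true; Some(SuspendReading) }
    else None
  }

  /** Drop the oldest chunk once its write was acknowledged; maybe resume reading. */
  def acknowledge(): Option[Signal] = {
    require(storage.nonEmpty, "storage was empty")
    stored -= storage.head.length
    storage = storage.drop(1)
    if (suspended && stored < lowWatermark) { suspended = false; Some(ResumeReading) }
    else None
  }

  def isEmpty: Boolean = storage.isEmpty
}
```

Keeping lowWatermark well below highWatermark avoids toggling between suspended and resumed reading on every chunk.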

Back-pressure can also be propagated across the reading side, back to the writer on the other end of the connection, by sending the SuspendReading command to the connection actor. No more data will then be read from the socket, although only after a delay, because it takes some time until the connection actor processes the command; the buffer should therefore have appropriate head-room. As a result the O/S kernel buffer on our end fills up, the TCP window mechanism stops the remote side from writing, its write buffer fills up, and finally the writer on the other side cannot push any more data into the socket. This is how end-to-end back-pressure is realized across a TCP connection.

NACK-Based Back-Pressure with Write Suspending

class EchoHandler(connection: ActorRef, remote: InetSocketAddress)
  extends Actor with ActorLogging {

  import Tcp._

  // sign death pact: this actor terminates when connection breaks
  context watch connection

  // start out in optimistic write-through mode
  def receive = writing

  def writing: Receive = {
    case Received(data) ⇒
      connection ! Write(data, currentOffset)
      buffer(data)

    case ack: Int ⇒
      acknowledge(ack)

    case CommandFailed(Write(_, ack: Int)) ⇒
      connection ! ResumeWriting
      context become buffering(ack)

    case PeerClosed ⇒
      if (storage.isEmpty) context stop self
      else context become closing
  }

  // buffering ...

  // closing ...

  override def postStop(): Unit = {
    log.info(s"transferred $transferred bytes from/to [$remote]")
  }

  // storage omitted ...
}

The principle here is to keep writing until a CommandFailed is received, using acknowledgements only to prune the resend buffer. When such a failure is received, transition into a different state that handles it and resends all queued data:

def buffering(nack: Int): Receive = {
  var toAck = 10
  var peerClosed = false

  {
    case Received(data)         ⇒ buffer(data)
    case WritingResumed         ⇒ writeFirst()
    case PeerClosed             ⇒ peerClosed = true
    case ack: Int if ack < nack ⇒ acknowledge(ack)
    case ack: Int ⇒
      acknowledge(ack)
      if (storage.nonEmpty) {
        if (toAck > 0) {
          // stay in ACK-based mode for a while
          writeFirst()
          toAck -= 1
        } else {
          // then return to NACK-based again
          writeAll()
          context become (if (peerClosed) closing else writing)
        }
      } else if (peerClosed) context stop self
      else context become writing
  }
}

It should be noted that all writes which are currently buffered have also been sent to the connection actor upon entering this state, which means that the ResumeWriting message is enqueued after those writes, leading to the reception of all outstanding CommandFailed messages (which are ignored in this state) before receiving the WritingResumed signal. That latter message is sent by the connection actor only once the internally queued write has been fully completed, meaning that a subsequent write will not fail. This is exploited by the EchoHandler to switch to an ACK-based approach for the first ten writes after a failure before resuming the optimistic write-through behavior.

def closing: Receive = {
  case CommandFailed(_: Write) ⇒
    connection ! ResumeWriting
    context.become({

      case WritingResumed ⇒
        writeAll()
        context.unbecome()

      case ack: Int ⇒ acknowledge(ack)

    }, discardOld = false)

  case ack: Int ⇒
    acknowledge(ack)
    if (storage.isEmpty) context stop self
}

Closing the connection while still sending all data is a bit more involved than in the ACK-based approach: the idea is to always send all outstanding messages and acknowledge all successful writes, and if a failure happens then switch behavior to await the WritingResumed event and start over.

The helper functions are very similar to the ACK-based case:

private def buffer(data: ByteString): Unit = {
  storage :+= data
  stored += data.size

  if (stored > maxStored) {
    log.warning(s"drop connection to [$remote] (buffer overrun)")
    context stop self

  } else if (stored > highWatermark) {
    log.debug(s"suspending reading at $currentOffset")
    connection ! SuspendReading
    suspended = true
  }
}

private def acknowledge(ack: Int): Unit = {
  require(ack == storageOffset, s"received ack $ack at $storageOffset")
  require(storage.nonEmpty, s"storage was empty at ack $ack")

  val size = storage(0).size
  stored -= size
  transferred += size

  storageOffset += 1
  storage = storage drop 1

  if (suspended && stored < lowWatermark) {
    log.debug("resuming reading")
    connection ! ResumeReading
    suspended = false
  }
}
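
The listings refer to currentOffset, writeFirst and writeAll without showing their definitions. The Akka-free sketch below gives one plausible reading of that offset bookkeeping, under the assumption that each buffered chunk is identified by storageOffset plus its position in the buffer. The class ResendBuffer and its send callback are hypothetical stand-ins for the handler and for sending a Write to the connection actor; the actual sample may define these helpers differently.

```scala
// Hypothetical model of the resend-buffer offsets; `send` stands in for
// `connection ! Write(data, offset)` in the real handler.
final class ResendBuffer[A](send: (A, Int) => Unit) {
  private var storage = Vector.empty[A]
  private var storageOffset = 0 // offset of the oldest chunk still buffered

  /** Offset that the next buffered chunk will receive. */
  def currentOffset: Int = storageOffset + storage.size

  def buffer(data: A): Unit = storage :+= data

  /** Acknowledgements must arrive in order, starting at the oldest chunk. */
  def acknowledge(ack: Int): Unit = {
    require(ack == storageOffset, s"received ack $ack at $storageOffset")
    require(storage.nonEmpty, s"storage was empty at ack $ack")
    storageOffset += 1
    storage = storage.drop(1)
  }

  /** Resend only the oldest unacknowledged chunk (ACK-based phase). */
  def writeFirst(): Unit = send(storage(0), storageOffset)

  /** Resend everything still buffered (back to optimistic write-through). */
  def writeAll(): Unit =
    for ((data, i) <- storage.zipWithIndex) send(data, storageOffset + i)
}
```

Because every write carries its offset, a failed write identifies exactly where resending has to start, and the in-order requirement in acknowledge catches bookkeeping errors early.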

Usage Example: TcpPipelineHandler and SSL

This example shows the different parts described above working together:

class AkkaSslServer extends Actor with ActorLogging {

  import Tcp.Connected

  def receive: Receive = {
    case Connected(remote, _) ⇒
      val init =
        new TcpPipelineHandler.Init(
          new StringByteStringAdapter >>
            new DelimiterFraming(maxSize = 1024, delimiter = ByteString('\n'), includeDelimiter = true) >>
            new TcpReadWriteAdapter[HasLogging] >>
            new SslTlsSupport(sslEngine(remote, client = false))) {
          override def makeContext(actorContext: ActorContext): HasLogging =
            new HasLogging {
              override def getLogger = log
            }
        }
      import init._

      val connection = sender
      val handler = system.actorOf(
        TcpPipelineHandler(init, sender, self), "server" + counter.incrementAndGet())

      connection ! Tcp.Register(handler)

      context become {
        case Event(data) ⇒
          val input = data.dropRight(1)
          log.debug("akka-io Server received {} from {}", input, sender)
          val response = serverResponse(input)
          sender ! Command(response)
          log.debug("akka-io Server sent: {}", response.dropRight(1))
      }
  }
}

The actor above is meant to be registered as the inbound connection handler for a listen socket. When a new connection comes in it will create a javax.net.ssl.SSLEngine (details not shown here since they vary wildly for different setups, please refer to the JDK documentation) and wrap that in an SslTlsSupport pipeline stage (which is included in akka-actor). The resulting pipeline will be driven by a TcpPipelineHandler actor which is also included in akka-actor. In order to capture the generic command and event types consumed and emitted by that actor we need to create a wrapper (the nested Init class) which also provides the makeContext method for creating the pipeline context needed by the supplied pipeline. With those things bundled up all that remains is creating a TcpPipelineHandler and registering that one as the recipient of inbound traffic from the TCP connection.

Since we instructed that handler actor to send any events which are emitted by the SSL pipeline to ourselves, we can then just switch behavior to receive the decrypted payload message, compute a response and reply by sending it back wrapped in a Command. It should be noted that communication with the handler wraps commands and events in the inner types of the init object in order to keep things well separated.

Warning

The TcpPipelineHandler currently does not handle back-pressure from the TCP socket, i.e. it will just lose data when the kernel buffer overflows. This will be fixed before Akka 2.2 final.

Using UDP

UDP support comes in two flavors: connectionless and connection-based. With connectionless UDP, workers can send datagrams to any remote address. Connection-based UDP workers are linked to a single remote address.

The connectionless UDP manager is accessed through Udp. Udp refers to the "fire-and-forget" style of sending UDP datagrams.

import akka.io.IO
import akka.io.Udp
val connectionLessUdp = IO(Udp)

The connection-based UDP manager is accessed through UdpConnected.

import akka.io.UdpConnected
val connectionBasedUdp = IO(UdpConnected)

UDP servers can only be implemented with the connectionless API, but clients can use both.

Connectionless UDP

Simple Send

To simply send a UDP datagram without listening for an answer, one needs to send the SimpleSender command to the Udp manager:

IO(Udp) ! SimpleSender
// or with socket options:
import akka.io.Udp._
IO(Udp) ! SimpleSender(List(SO.Broadcast(true)))

The manager will create a worker for sending, and the worker will reply with a SimpleSendReady message:

case SimpleSendReady =>
  simpleSender = sender

After saving the sender of the SimpleSendReady message it is possible to send out UDP datagrams with a simple message send:

simpleSender ! Send(data, serverAddress)

Bind (and Send)

To listen for UDP datagrams arriving on a given port, the Bind command has to be sent to the connectionless UDP manager:

IO(Udp) ! Bind(handler, localAddress)

After the bind succeeds, the sender of the Bind command will be notified with a Bound message. The sender of this message is the worker for the UDP channel bound to the local address.

case Bound =>
  udpWorker = sender // Save the worker ref for later use

The actor passed in the handler parameter will receive inbound UDP datagrams sent to the bound address:

case Received(dataByteString, remoteAddress) => // Do something with the data

The Received message contains the payload of the datagram and the address of the sender.

It is also possible to send UDP datagrams using the ActorRef of the worker saved in udpWorker:

udpWorker ! Send(data, serverAddress)

Note

The difference between using a bound UDP worker to send instead of a simple-send worker is that in the former case the sender field of the UDP datagram will be the bound local address, while in the latter it will be an undetermined ephemeral port.

Connection-based UDP

The service provided by the connection-based UDP API is similar to the bind-and-send service we saw earlier, but the main difference is that a connection is only able to send to the remoteAddress it was connected to, and will receive datagrams only from that address.

Connecting is similar to what we have seen in the previous section:

IO(UdpConnected) ! Connect(handler, remoteAddress)

Or, with more options:

IO(UdpConnected) ! Connect(handler, Some(localAddress), remoteAddress, List(SO.Broadcast(true)))

After the connect succeeds, the sender of the Connect command will be notified with a Connected message. The sender of this message is the worker for the UDP connection.

case Connected =>
  udpConnectionActor = sender // Save the worker ref for later use

The actor passed in the handler parameter will receive inbound UDP datagrams sent to the bound address:

case Received(dataByteString) => // Do something with the data

The Received message contains the payload of the datagram but unlike in the connectionless case, no sender address is provided, as a UDP connection only receives messages from the endpoint it has been connected to.

UDP datagrams can be sent by sending a Send message to the worker actor.

udpConnectionActor ! Send(data)

Again, like the Received message, the Send message does not contain a remote address. This is because the address will always be the endpoint we originally connected to.

Note

There is a small performance benefit in using the connection-based UDP API over the connectionless one. If there is a SecurityManager enabled on the system, every connectionless message send has to go through a security check, while in the case of connection-based UDP the security check is cached after connect, so writes do not suffer an additional performance penalty.

Architecture in-depth

For further details on the design and internal architecture see I/O Layer Design.
