Text and charsets

The text flows translate a stream of text data between character sets. They support conversion between ByteString and String, as well as conversion of the character set of binary text data held in ByteStrings.

The main use case for these flows is transcoding text read from a source in a character set that other flows or sinks cannot handle. For example, CSV data may arrive in UTF-16 encoding, but the Alpakka CSV parser only supports UTF-8.

Project Info: Alpakka Text
Artifact: com.lightbend.akka, akka-stream-alpakka-text, 2.0.2
JDK versions: Adopt OpenJDK 8, Adopt OpenJDK 11
Scala versions: 2.12.11, 2.11.12, 2.13.3
JPMS module name: akka.stream.alpakka.text
Readiness level: Since 0.20, 2018-07-04
Home page: https://doc.akka.io/docs/alpakka/current
Release notes: In the documentation
Issues: GitHub issues
Sources: https://github.com/akka/alpakka

Artifacts

sbt
val AkkaVersion = "2.5.31"
libraryDependencies ++= Seq(
  "com.lightbend.akka" %% "akka-stream-alpakka-text" % "2.0.2",
  "com.typesafe.akka" %% "akka-stream" % AkkaVersion
)
Maven
<properties>
  <akka.version>2.5.31</akka.version>
  <scala.binary.version>2.12</scala.binary.version>
</properties>
<dependency>
  <groupId>com.lightbend.akka</groupId>
  <artifactId>akka-stream-alpakka-text_${scala.binary.version}</artifactId>
  <version>2.0.2</version>
</dependency>
<dependency>
  <groupId>com.typesafe.akka</groupId>
  <artifactId>akka-stream_${scala.binary.version}</artifactId>
  <version>${akka.version}</version>
</dependency>
Gradle
versions += [
  AkkaVersion: "2.5.31",
  ScalaBinary: "2.12"
]
dependencies {
  compile group: 'com.lightbend.akka', name: "akka-stream-alpakka-text_${versions.ScalaBinary}", version: '2.0.2'
  compile group: 'com.typesafe.akka', name: "akka-stream_${versions.ScalaBinary}", version: versions.AkkaVersion
}

The table below shows direct dependencies of this module and the second tab shows all libraries it depends on transitively.

Direct dependencies
Organization    Artifact    Version
com.typesafe.akka    akka-stream_2.12    2.5.31
org.scala-lang    scala-library    2.12.11
Dependency tree
com.typesafe.akka    akka-stream_2.12    2.5.31
    com.typesafe.akka    akka-actor_2.12    2.5.31
        com.typesafe    config    1.3.3
        org.scala-lang.modules    scala-java8-compat_2.12    0.8.0
            org.scala-lang    scala-library    2.12.11
        org.scala-lang    scala-library    2.12.11
    com.typesafe.akka    akka-protobuf_2.12    2.5.31
        org.scala-lang    scala-library    2.12.11
    com.typesafe    ssl-config-core_2.12    0.3.8
        com.typesafe    config    1.3.3
        org.scala-lang.modules    scala-parser-combinators_2.12    1.1.2
            org.scala-lang    scala-library    2.12.11
        org.scala-lang    scala-library    2.12.11
    org.reactivestreams    reactive-streams    1.0.2
    org.scala-lang    scala-library    2.12.11
org.scala-lang    scala-library    2.12.11

Text transcoding

The text transcoding flow converts incoming binary text data (ByteString) to binary text data of another character encoding.

The flow fails with an UnmappableCharacterException if a character cannot be represented in the target character set.

Scala
import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

val byteStringSource: Source[ByteString, _] = // ...

byteStringSource
  .via(TextFlow.transcoding(StandardCharsets.UTF_16, StandardCharsets.UTF_8))
  .runWith(FileIO.toPath(targetFile))
Java
Source<ByteString, ?> byteStringSource = // ...

byteStringSource
    .via(TextFlow.transcoding(StandardCharsets.UTF_16, StandardCharsets.UTF_8))
    .runWith(FileIO.toPath(targetFile), materializer);
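To make the effect on the bytes concrete, the following sketch transcodes a small in-memory buffer using only the JDK, without Akka Streams. The class and method names (TranscodeSketch, transcode) are hypothetical and not part of Alpakka. Note one difference from the flow described above: String.getBytes silently substitutes a replacement byte for unmappable characters instead of failing.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Alpakka.
class TranscodeSketch {

    // Decode the bytes with the source charset, then re-encode with the target charset.
    // Unlike the Alpakka flow, this substitutes a replacement for unmappable characters.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is "hello" with an e-acute as the second character.
        byte[] utf16 = "h\u00e9llo".getBytes(StandardCharsets.UTF_16);
        byte[] utf8 = transcode(utf16, StandardCharsets.UTF_16, StandardCharsets.UTF_8);

        // UTF-16: 2-byte byte-order mark + 5 characters * 2 bytes = 12 bytes.
        // UTF-8: 4 ASCII characters * 1 byte + 1 accented character * 2 bytes = 6 bytes.
        System.out.println(utf16.length + " bytes in UTF-16, " + utf8.length + " bytes in UTF-8");
    }
}
```

The same text occupies a different number of bytes in each encoding, which is why a consumer that expects UTF-8 cannot read UTF-16 input directly.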

Text encoding

The text encoding flow converts incoming Strings to binary text data (ByteString) with the given character encoding.

The flow fails with an UnmappableCharacterException if a character cannot be represented in the target character set.

Scala
import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

val stringSource: Source[String, _] = // ...

stringSource
  .via(TextFlow.encoding(StandardCharsets.US_ASCII))
  .intersperse(ByteString("\n"))
  .runWith(FileIO.toPath(targetFile))
Java
import akka.stream.alpakka.text.javadsl.TextFlow;
import akka.stream.IOResult;
import akka.stream.javadsl.FileIO;
import akka.stream.javadsl.Sink;
import akka.stream.javadsl.Source;
import akka.util.ByteString;

import java.nio.charset.StandardCharsets;

Source<String, ?> stringSource = // ...

stringSource
    .via(TextFlow.encoding(StandardCharsets.US_ASCII))
    .intersperse(ByteString.fromString("\n"))
    .runWith(FileIO.toPath(targetFile), materializer);
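The failure mode mentioned above can be reproduced with the JDK alone. This is a minimal sketch, assuming only java.nio.charset; the class and method names (StrictEncodeSketch, tryEncode) are hypothetical, and the sketch does not claim to mirror how Alpakka implements the flow. A freshly created CharsetEncoder reports unmappable characters by throwing, which is the exception type named in the text.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Alpakka.
class StrictEncodeSketch {

    // Encodes text strictly and describes the outcome as a string.
    static String tryEncode(String text, Charset charset) {
        try {
            // A new encoder's default action for unmappable characters is to report them.
            ByteBuffer bytes = charset.newEncoder().encode(CharBuffer.wrap(text));
            return "ok, " + bytes.remaining() + " bytes";
        } catch (CharacterCodingException e) {
            // UnmappableCharacterException is a subclass of CharacterCodingException.
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Plain ASCII text fits into US-ASCII.
        System.out.println(tryEncode("abc", StandardCharsets.US_ASCII));
        // "caf\u00e9" ends in an e-acute, which US-ASCII cannot represent.
        System.out.println(tryEncode("caf\u00e9", StandardCharsets.US_ASCII));
        // The same text encodes without error in UTF-8.
        System.out.println(tryEncode("caf\u00e9", StandardCharsets.UTF_8));
    }
}
```

Choosing a target charset that covers the expected input, such as UTF-8, avoids this failure entirely.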

Text decoding

The text decoding flow converts incoming ByteStrings to Strings using the given character encoding.

Scala
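import java.nio.charset.StandardCharsets
import akka.stream.alpakka.text.scaladsl.TextFlow
import akka.stream.scaladsl.{Sink, Source}
import akka.util.ByteString
import scala.collection.immutable
import scala.concurrent.Future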

val byteStringSource: Source[ByteString, _] = // ...

val result: Future[immutable.Seq[String]] =
  byteStringSource
    .via(TextFlow.decoding(StandardCharsets.UTF_16))
    .runWith(Sink.seq)
Java
Source<ByteString, ?> byteStringSource = // ...

CompletionStage<List<String>> result =
    byteStringSource
        .via(TextFlow.decoding(StandardCharsets.UTF_16))
        .runWith(Sink.seq(), materializer);
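The examples above decode with StandardCharsets.UTF_16. As a sketch of what that JDK charset accepts, independent of Akka Streams, the following decodes three byte layouts of the same text. The class and method names (Utf16DecodeSketch, decodeUtf16) are hypothetical. The JDK's UTF-16 decoder reads an optional byte-order mark to pick the byte order and falls back to big-endian when none is present.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Alpakka.
class Utf16DecodeSketch {

    // Decodes bytes with the JDK's UTF-16 charset, which honours a leading byte-order mark.
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // "hi" with a big-endian byte-order mark (FE FF).
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x68, 0x00, 0x69};
        // "hi" with a little-endian byte-order mark (FF FE).
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x68, 0x00, 0x69, 0x00};
        // "hi" without a byte-order mark; decoded as big-endian.
        byte[] noMark = {0x00, 0x68, 0x00, 0x69};

        System.out.println(decodeUtf16(bigEndian));    // prints "hi"
        System.out.println(decodeUtf16(littleEndian)); // prints "hi"
        System.out.println(decodeUtf16(noMark));       // prints "hi"
    }
}
```

If the input is known to be little-endian without a byte-order mark, StandardCharsets.UTF_16LE is the charset to pass instead.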