Extensible Markup Language - XML
The XML parsing module offers Flows for parsing, processing, and writing XML documents.
Project Info: Alpakka XML | |
---|---|
Artifact | com.lightbend.akka : akka-stream-alpakka-xml : 9.0.1 |
JDK versions | Eclipse Temurin JDK 11, Eclipse Temurin JDK 17 |
Scala versions | 2.13.12, 3.3.4 |
JPMS module name | akka.stream.alpakka.xml |
License | |
Readiness level | Since 0.9, 2018-05-24 |
Home page | https://doc.akka.io/libraries/alpakka/current |
API documentation | |
Forums | |
Release notes | GitHub releases |
Issues | GitHub issues |
Sources | https://github.com/akka/alpakka |
Artifacts
The Akka dependencies are available from Akka’s library repository. To access them, you need to configure the URL for this repository.
Additionally, add the dependencies as shown below.
- sbt

    val AkkaVersion = "2.10.0"
    libraryDependencies ++= Seq(
      "com.lightbend.akka" %% "akka-stream-alpakka-xml" % "9.0.1",
      "com.typesafe.akka" %% "akka-stream" % AkkaVersion
    )
The table below shows the direct dependencies of this module.

Organization | Artifact | Version |
---|---|---|
com.fasterxml | aalto-xml | 1.3.3 |
com.typesafe.akka | akka-stream_2.13 | 2.10.0 |
org.scala-lang | scala-library | 2.13.12 |
XML parsing
An XML processing pipeline starts with an XmlParsing.parser flow, which parses a stream of ByteString elements into XML parser events.
- Java

    final Sink<String, CompletionStage<List<ParseEvent>>> parse =
        Flow.<String>create()
            .map(ByteString::fromString)
            .via(XmlParsing.parser())
            .toMat(Sink.seq(), Keep.right());
To parse an XML document, run a source containing the document with this parser.
- Java

    final String doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>";
    final CompletionStage<List<ParseEvent>> resultStage =
        Source.single(doc).runWith(parse, system);
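To give a feel for what a stream of parser events looks like, the sketch below reads the same document with the JDK's own StAX reader. This is an analogy only: StaxEventSketch and its labels are hypothetical names chosen for illustration, and the JDK reader is not what Alpakka uses internally (the module depends on aalto-xml).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Alpakka: lists StAX events as readable labels.
final class StaxEventSketch {

  static List<String> events(String xml) {
    List<String> out = new ArrayList<>();
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      // Merge adjacent text chunks so each text run yields a single event.
      factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      // A StAX reader starts out positioned on START_DOCUMENT.
      out.add("StartDocument");
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            out.add("StartElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.CHARACTERS:
            out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            out.add("EndElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.END_DOCUMENT:
            out.add("EndDocument");
            break;
          default:
            break; // comments, processing instructions, etc. are ignored here
        }
      }
      reader.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Malformed XML", e);
    }
    return out;
  }
}
```

Calling StaxEventSketch.events(doc) on the document above yields ten labels, from StartDocument through the nested element and text events to EndDocument, which is the same shape of sequence a downstream stage has to interpret.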
To make sense of the parser events, statefulMapConcat may be used to aggregate consecutive events and emit the relevant data. For more complex uses, a state machine will be required.
- Java

    ByteString doc = ByteString.fromString("<doc><elem>elem1</elem><elem>elem2</elem></doc>");
    CompletionStage<List<String>> stage =
        Source.single(doc)
            .via(XmlParsing.parser())
            .statefulMapConcat(
                () -> {
                  // state
                  final StringBuilder textBuffer = new StringBuilder();
                  // aggregation function
                  return parseEvent -> {
                    switch (parseEvent.marker()) {
                      case XMLStartElement:
                        textBuffer.delete(0, textBuffer.length());
                        return Collections.emptyList();
                      case XMLEndElement:
                        EndElement s = (EndElement) parseEvent;
                        switch (s.localName()) {
                          case "elem":
                            String text = textBuffer.toString();
                            return Collections.singleton(text);
                          default:
                            return Collections.emptyList();
                        }
                      case XMLCharacters:
                      case XMLCData:
                        TextEvent t = (TextEvent) parseEvent;
                        textBuffer.append(t.text());
                        return Collections.emptyList();
                      default:
                        return Collections.emptyList();
                    }
                  };
                })
            .runWith(Sink.seq(), system);

    List<String> list = stage.toCompletableFuture().get(5, TimeUnit.SECONDS);
    assertThat(list, hasItems("elem1", "elem2"));
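For the more complex case, one possible shape of such a state machine is sketched below in plain Java. PathTextCollector is a hypothetical class, not part of Alpakka: it tracks the current element path and emits the accumulated text whenever an element at a chosen path closes. In a pipeline, one instance could be created inside the statefulMapConcat factory, with its methods called from the switch on parseEvent.marker().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical state machine: collects the text of every element at `target` path.
final class PathTextCollector {
  private final List<String> target;
  private final Deque<String> path = new ArrayDeque<>(); // root first
  private final StringBuilder text = new StringBuilder();

  PathTextCollector(List<String> target) {
    this.target = target;
  }

  List<String> onStart(String name) {
    path.addLast(name);
    if (atTarget()) text.setLength(0); // a new target element begins
    return Collections.<String>emptyList();
  }

  List<String> onText(String chunk) {
    if (insideTarget()) text.append(chunk); // includes text of nested children
    return Collections.<String>emptyList();
  }

  List<String> onEnd() {
    List<String> out =
        atTarget() ? Collections.singletonList(text.toString()) : Collections.<String>emptyList();
    path.removeLast();
    return out;
  }

  private boolean atTarget() {
    return path.size() == target.size() && insideTarget();
  }

  private boolean insideTarget() {
    if (path.size() < target.size()) return false;
    Iterator<String> it = path.iterator();
    for (String expected : target) {
      if (!expected.equals(it.next())) return false;
    }
    return true;
  }

  // Feeds the events of <doc><elem>a<b>b</b></elem><other>x</other><elem>c</elem></doc>.
  static List<String> demo() {
    PathTextCollector c = new PathTextCollector(Arrays.asList("doc", "elem"));
    List<String> out = new ArrayList<>();
    out.addAll(c.onStart("doc"));
    out.addAll(c.onStart("elem"));
    out.addAll(c.onText("a"));
    out.addAll(c.onStart("b"));
    out.addAll(c.onText("b"));
    out.addAll(c.onEnd()); // </b>
    out.addAll(c.onEnd()); // </elem> emits "ab"
    out.addAll(c.onStart("other"));
    out.addAll(c.onText("x"));
    out.addAll(c.onEnd()); // </other>
    out.addAll(c.onStart("elem"));
    out.addAll(c.onText("c"));
    out.addAll(c.onEnd()); // </elem> emits "c"
    out.addAll(c.onEnd()); // </doc>
    return out;
  }
}
```

Each method returns a list, mirroring how the aggregation function passed to statefulMapConcat returns zero or more elements per incoming event. Unlike the single text buffer above, tracking the path keeps text from unrelated elements such as other out of the result.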
XML writing
An XML processing pipeline ends with an XmlWriting.writer flow, which writes a stream of XML parser events as ByteString elements.
- Java

    final Sink<ParseEvent, CompletionStage<String>> write =
        Flow.of(ParseEvent.class)
            .via(XmlWriting.writer())
            .map(ByteString::utf8String)
            .toMat(Sink.fold("", (acc, el) -> acc + el), Keep.right());

    // Variant: the same sink, with xmlOutputFactory passed to the writer
    final Sink<ParseEvent, CompletionStage<String>> write =
        Flow.of(ParseEvent.class)
            .via(XmlWriting.writer(xmlOutputFactory))
            .map(ByteString::utf8String)
            .toMat(Sink.fold("", (acc, el) -> acc + el), Keep.right());
To write an XML document, run a source of parser events with this writer.
- Java

    final String doc =
        "<?xml version='1.0' encoding='UTF-8'?>"
            + "<bk:book xmlns:bk=\"urn:loc.gov:books\" xmlns:isbn=\"urn:ISBN:0-395-36341-6\">"
            + "<bk:title>Cheaper by the Dozen</bk:title><isbn:number>1568491379</isbn:number></bk:book>";
    final List<Namespace> nmList = new ArrayList<>();
    nmList.add(Namespace.create("urn:loc.gov:books", Optional.of("bk")));
    nmList.add(Namespace.create("urn:ISBN:0-395-36341-6", Optional.of("isbn")));
    final List<ParseEvent> docList = new ArrayList<>();
    docList.add(StartDocument.getInstance());
    docList.add(
        StartElement.create(
            "book",
            Collections.emptyList(),
            Optional.of("bk"),
            Optional.of("urn:loc.gov:books"),
            nmList));
    docList.add(
        StartElement.create(
            "title",
            Collections.emptyList(),
            Optional.of("bk"),
            Optional.of("urn:loc.gov:books")));
    docList.add(Characters.create("Cheaper by the Dozen"));
    docList.add(EndElement.create("title"));
    docList.add(
        StartElement.create(
            "number",
            Collections.emptyList(),
            Optional.of("isbn"),
            Optional.of("urn:ISBN:0-395-36341-6")));
    docList.add(Characters.create("1568491379"));
    docList.add(EndElement.create("number"));
    docList.add(EndElement.create("book"));
    docList.add(EndDocument.getInstance());
    final CompletionStage<String> resultStage = Source.from(docList).runWith(write, system);
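As a cross-check of what the event list above is meant to serialize to, the same document can be produced with the JDK's own StAX writer. This is a sketch for comparison only: StaxWriterSketch is a hypothetical class, it does not use Alpakka, and the XML declaration and quoting style may differ between StAX implementations.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of Alpakka: writes the book document with JDK StAX.
final class StaxWriterSketch {

  static String writeBook() {
    StringWriter sink = new StringWriter();
    try {
      XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
      w.writeStartDocument();
      // Each call below corresponds to one event in docList.
      w.writeStartElement("bk", "book", "urn:loc.gov:books");
      w.writeNamespace("bk", "urn:loc.gov:books");
      w.writeNamespace("isbn", "urn:ISBN:0-395-36341-6");
      w.writeStartElement("bk", "title", "urn:loc.gov:books");
      w.writeCharacters("Cheaper by the Dozen");
      w.writeEndElement(); // </bk:title>
      w.writeStartElement("isbn", "number", "urn:ISBN:0-395-36341-6");
      w.writeCharacters("1568491379");
      w.writeEndElement(); // </isbn:number>
      w.writeEndElement(); // </bk:book>
      w.writeEndDocument();
      w.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Could not write XML", e);
    }
    return sink.toString();
  }
}
```

Note that the namespace declarations are written once on the root element, which matches the nmList passed only to the book StartElement in the event list.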
XML Subslice
Use XmlParsing.subslice to filter out all elements not corresponding to a certain path.
- Java

    final Sink<String, CompletionStage<List<ParseEvent>>> parse =
        Flow.<String>create()
            .map(ByteString::fromString)
            .via(XmlParsing.parser())
            .via(XmlParsing.subslice(Arrays.asList("doc", "elem", "item")))
            .toMat(Sink.seq(), Keep.right());
To get a subslice of an XML document, run a source containing the document with this parser.
- Java

    final String doc =
        "<doc>"
            + " <elem>"
            + " <item>i1</item>"
            + " <item><sub>i2</sub></item>"
            + " <item>i3</item>"
            + " </elem>"
            + "</doc>";
    final CompletionStage<List<ParseEvent>> resultStage =
        Source.single(doc).runWith(parse, system);
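The filtering idea can be sketched with a path stack over JDK StAX events. This rests on an assumption worth verifying against the library: that subslice passes through only the events nested below the elements matching the path, without the start and end events of the matched elements themselves. SubsliceSketch is a hypothetical class and is not how Alpakka implements the flow.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Alpakka: keeps only events below `path`.
final class SubsliceSketch {

  static List<String> subslice(String xml, List<String> path) {
    Deque<String> stack = new ArrayDeque<>(); // open elements, root first
    List<String> out = new ArrayList<>();
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            // Checked before pushing, so the matched element itself is not emitted.
            if (inside(stack, path)) out.add("StartElement(" + reader.getLocalName() + ")");
            stack.addLast(reader.getLocalName());
            break;
          case XMLStreamConstants.CHARACTERS:
            if (inside(stack, path)) out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            // Checked after popping, for the same reason.
            stack.removeLast();
            if (inside(stack, path)) out.add("EndElement(" + reader.getLocalName() + ")");
            break;
          default:
            break;
        }
      }
      reader.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Malformed XML", e);
    }
    return out;
  }

  // True when the open-element stack starts with every name in `path`.
  private static boolean inside(Deque<String> stack, List<String> path) {
    if (stack.size() < path.size()) return false;
    Iterator<String> it = stack.iterator();
    for (String expected : path) {
      if (!expected.equals(it.next())) return false;
    }
    return true;
  }
}
```

Under that assumption, the whitespace between the item elements is dropped as well, because it belongs to elem rather than to any item.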
XML Subtree
Use XmlParsing.subtree to handle elements matching a certain path, together with their child nodes, as org.w3c.dom.Element.
- Java

    final Sink<String, CompletionStage<List<Element>>> parse =
        Flow.<String>create()
            .map(ByteString::fromString)
            .via(XmlParsing.parser())
            .via(XmlParsing.subtree(Arrays.asList("doc", "elem", "item")))
            .toMat(Sink.seq(), Keep.right());
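Because the emitted values are standard org.w3c.dom.Element instances, downstream stages can use the regular DOM API on them. The sketch below illustrates this with the JDK's DOM parser instead of Alpakka: DomElementSketch and its select method are hypothetical, and the whole document is parsed in memory here, whereas the flow works on a stream.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Alpakka: selects elements by path with JDK DOM.
final class DomElementSketch {

  static List<Element> select(String xml, List<String> path) {
    try {
      Element root =
          DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
              .getDocumentElement();
      List<Element> current = new ArrayList<>();
      if (root.getTagName().equals(path.get(0))) current.add(root);
      // Descend one path segment at a time, keeping matching child elements.
      for (String name : path.subList(1, path.size())) {
        List<Element> next = new ArrayList<>();
        for (Element parent : current) {
          NodeList children = parent.getChildNodes();
          for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element && ((Element) child).getTagName().equals(name)) {
              next.add((Element) child);
            }
          }
        }
        current = next;
      }
      return current;
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }
}
```

Each returned Element carries its child nodes, so calls such as getTextContent() or getElementsByTagName("sub") work on it directly.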
To get a subtree of an XML document, run a source containing the document with this parser.