In this article, we will learn how to use JAXB for working with XML, especially when we use the JAX-WS framework. Let’s get started.
Table of contents
- Introduction about other tools
- XML structure
- Understanding the JAXB API
- Java code example
- Benefits and drawbacks
- Wrapping up
Introduction about other tools
Java has several APIs for working with XML, which means that there are different ways to read or write XML data in Java. Which of these APIs is best for the job depends on what exactly we need to do with the XML.
We have the lower-level APIs such as DOM, SAX, and StAX, and higher-level APIs such as JAXB.
DOM - Document Object Model
When we use the DOM API to read an XML file, we build a tree in memory that consists of nodes that directly correspond to the elements, attributes, text and other items in the XML.
This is a straightforward and relatively easy way to work with XML. We read the XML and then we get a tree of nodes that we can process as needed in our application. The DOM API consists of the org.w3c.dom package in the JDK, which contains interfaces such as Document, Element, and Attr, which represent an XML document, element, and attribute.
- The DOM API is easy to use.
- It does not scale well to large documents. Before our program can process the XML, the whole document is loaded into memory. If the document is very large, this can take up a lot of memory. Therefore, if we need to process large documents, we can choose SAX or StAX instead.
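As a minimal sketch of the DOM approach described above, the following example parses a small document into a tree and then reads values from the nodes. The class name DomExample and the helper productNames are illustrative names, not part of any API, and the sample XML is a shortened version of a purchase order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomExample {

    // Loads the whole document into a node tree, then collects the text
    // of every <productName> element found in that tree.
    static List<String> productNames(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("productName");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle parse errors explicitly.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<purchaseOrder orderDate='2017-09-10'><items>"
                + "<item><productName>Ballpoint Pen</productName></item>"
                + "<item><productName>Notebook</productName></item>"
                + "</items></purchaseOrder>";
        System.out.println(productNames(xml));
    }
}
```

Note that the complete tree exists in memory before the first product name is read, which is exactly why this approach becomes expensive for very large documents.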
SAX - Simple API for XML
SAX works in a completely different way than the DOM API. Instead of converting the XML document into an in-memory tree of nodes, it’s an event-based API. The SAX parser reads the XML and calls callback methods in our program whenever it encounters something in the XML, for example a start tag, text, or an end tag.
The callback methods in our program then inspect what was found and take the appropriate action. The SAX API consists of the package org.xml.sax and related packages in the JDK. If we use the SAX API, we’ll most likely want to implement the ContentHandler interface, which defines the callback methods that the SAX parser will call.
- Since it does not need to load the whole document in memory, it works on large XML files just as well as on small XML files.
- The SAX API is a bit more cumbersome to use than the DOM API.
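The push-style callbacks can be sketched as follows. This example extends DefaultHandler, a convenience base class in org.xml.sax.helpers that implements ContentHandler with empty methods, so only the callbacks of interest need to be overridden. The names SaxExample, ProductNameHandler, and productNames are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxExample {

    // Collects the text of every <productName> element from SAX callbacks.
    static class ProductNameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text; // non-null only while inside <productName>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("productName".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                // The parser may deliver the text of one element in several chunks.
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("productName".equals(qName)) {
                names.add(text.toString());
                text = null;
            }
        }
    }

    static List<String> productNames(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ProductNameHandler handler = new ProductNameHandler();
            // The parser is in control: it pushes events into the handler.
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            // Wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item><productName>Ballpoint Pen</productName></item></items>";
        System.out.println(productNames(xml));
    }
}
```

The handler has to keep its own state (here the text buffer) between callbacks, which is the main reason SAX code tends to be more cumbersome than DOM code.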
StAX - Streaming API for XML
The StAX API is similar to the SAX API in that it’s event-based. The main difference between the two is that SAX is push-based while StAX is pull-based.
This means that with SAX, it’s the parser that is in control and that calls the callback methods in our application, so it pushes events to our application. With StAX, our program is in control, and it calls the StAX API to get the next event out of the XML that’s being processed, so we are pulling the events out of the parser.
The StAX API consists of the package javax.xml.stream and related packages in the JDK. The most important interfaces are XMLStreamReader and XMLStreamWriter, for reading and writing XML. The StAX API is a bit more convenient to use than the SAX API in many cases, but it’s still a low-level API: we have to deal with all the details of parsing, and we’ll likely end up writing a lot of boilerplate code, even for parsing relatively simple XML documents.
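To contrast this with SAX, here is a small sketch of pull parsing with XMLStreamReader. The loop in our own code asks the reader for the next event, instead of waiting for callbacks. StaxExample and productNames are illustrative names only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxExample {

    // Pulls events from the reader one by one and collects the text
    // of every <productName> element.
    static List<String> productNames(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            List<String> names = new ArrayList<>();
            while (reader.hasNext()) {
                int event = reader.next(); // our code decides when to advance
                if (event == XMLStreamConstants.START_ELEMENT
                        && "productName".equals(reader.getLocalName())) {
                    // Reads the text content and moves the reader to the matching end tag.
                    names.add(reader.getElementText());
                }
            }
            reader.close();
            return names;
        } catch (XMLStreamException e) {
            // Wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item><productName>Ballpoint Pen</productName></item></items>";
        System.out.println(productNames(xml));
    }
}
```

Because the program drives the loop, it can stop reading as soon as it has what it needs, without any handler state shared across callbacks.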
JAXB - Java Architecture for XML Binding
JAXB stands for Java Architecture for XML Binding. With JAXB, we can convert Java objects to XML and vice versa.
The word binding refers to the mapping between Java classes and fields to structures in XML such as elements and attributes. JAXB works with XML schema files. When we work with JAXB, we are working with two representations of our domain model.
On the Java side, we have a number of Java classes that define the domain model, and on the XML side, we have an XML schema that defines the same domain model. It would, of course, be cumbersome if we had to manually keep the Java and XML domain models synchronized. It’s better to start from either Java or an XML schema and then generate either schema from our Java classes or generate Java from our schema. This corresponds to the two approaches to work with JAXB:
The code first approach
We generate an XML schema, an XSD file, from our Java domain model classes. We give this XSD file to our business partner who needs to work with the XML that our software produces, so that they know what the XML looks like.
The schema first approach
We will start with an XML schema, an XSD file, and we generate the source code for our Java domain model classes from the schema. This is useful, for example, if we get the XSD file from a business partner or from an information analyst or architect in our own company.
XML and namespaces
XML stands for Extensible Markup Language. It’s the standard text-based format for storing arbitrary structured data.
An XML document contains elements, and each element starts and ends with a tag. The start tag consists of the name of the element between angle brackets. The end tag is the same as the start tag except that there is a forward slash before the element name. The start tag can optionally have attributes, such as the orderDate attribute on the purchaseOrder start tag in the example below, which are specified after the element name and separated by spaces. Attributes have a name and a value. The content of an element is what’s between the start and end tags. This can be text or other elements. The elements productName, quantity, price, and so on have text content …
    <?xml version="1.0" encoding="UTF-8" ?>
    <purchaseOrder xmlns="http://www.jesperdj.com/ps/jaxb" orderDate="2017-09-10">
      <items>
        <item>
          <productName>Ballpoint Pen</productName>
          <quantity>20</quantity>
          <price>8.95</price>
          <comment>Blue ink</comment>
        </item>
      </items>
      <customer>
        <name>John Doe</name>
        <shippingAddress>
          <street>123 Main Street</street>
          <city>Exampleville</city>
        </shippingAddress>
      </customer>
    </purchaseOrder>
The fact that we can nest elements is a very powerful idea, and that’s what makes it possible to store almost any kind of structured data in XML.
If an element has no content, the start and end tag can be combined into a single tag with a forward slash after the element name. That’s just a shorter way to write the start and end tag right next to each other with nothing in between.
Namespaces in XML are a bit like packages in Java. A namespace keeps a set of related tag names together, just like a Java package keeps related classes together. When we define a set of tag names for our application, it’s a good idea to define a namespace to contain our tag names.
Let’s look at the syntax that is used to refer to namespaces. A start tag can have a special attribute with the name xmlns. The value of this attribute is the name of a namespace, and it specifies that the tag and its child tags belong to that namespace. The name of a namespace is a URI, a uniform resource identifier. It often looks like a URL, and it’s a good idea to choose a URL that refers to a web domain name that we own. This is the same as with Java package names, where we usually use a package name that corresponds to a web domain, like com.mycompany.mysoftware.
The URL does not need to point to any real resource on the web, so we do not need to have a server running that responds to the URL. To the XML parser, it’s just a string that uniquely identifies the namespace. Sometimes we’ll need to use tags from multiple different namespaces in our XML document. We can use namespace prefixes to indicate explicitly to which namespace a tag belongs. To declare a namespace with a prefix, we modify the xmlns attribute slightly by putting a colon and then a prefix name after it. We can then use this prefix name in front of tag names that should be in that namespace.
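The prefix syntax can be illustrated with a short sketch. It declares two prefixes with xmlns:po and xmlns:c and then uses a namespace-aware DOM parser to show which namespace each element ends up in. The namespace URIs under www.example.com, the prefixes, and the names NamespaceExample and describeElements are assumptions made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceExample {

    // Returns one "localName in namespaceURI" entry per element, in document order.
    static List<String> describeElements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace processing is off by default
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            NodeList all = doc.getElementsByTagNameNS("*", "*"); // "*" matches any namespace and any name
            List<String> result = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                Node node = all.item(i);
                result.add(node.getLocalName() + " in " + node.getNamespaceURI());
            }
            return result;
        } catch (Exception e) {
            // Wrapped to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical namespaces: the URIs are just identifiers and are never fetched.
        String xml = "<po:purchaseOrder xmlns:po='http://www.example.com/po'"
                + " xmlns:c='http://www.example.com/customer'>"
                + "<po:items/><c:customer/>"
                + "</po:purchaseOrder>";
        for (String line : describeElements(xml)) {
            System.out.println(line);
        }
    }
}
```

The prefix itself is only a local shorthand: two documents that use different prefixes for the same URI refer to the same namespace.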
JAXB makes use of XML schema, so it’s important that we understand what XML schema is. Let’s take a quick look at the most important concepts of XML schema.
XML, unlike HTML, does not have a fixed set of tags. When we use XML for our application, we invent our own set of tags that have meaning in the context of our application. An XML schema describes the data model of an XML file: what elements can appear in the XML, what the content of these elements can be, and what attributes they can have. An XML processing tool can use the schema, for example, to check if an XML document is valid according to the schema.
There are different standard schema languages for XML. The original schema language, which was invented together with XML itself, is DTD, which stands for Document Type Definition, but DTD has limitations. For example, it does not support namespaces and it does not support data types for the content of elements and attributes. So there is, for example, no way to specify in a DTD that a certain element should contain a number or a date.
The most widely used standard schema language is XML schema. If we are working with JAXB, it’s important to understand XML schema because JAXB makes heavy use of it. XML schema files are XML files themselves and have the extension XSD, which stands for XML Schema Definition.
If we want to know everything about XML schema, we can look up the specifications on the website of the World Wide Web Consortium, the W3C, but be aware that the official specification is a very dry and technical document, which is hard to read. Fortunately, the W3C also has a more easy-to-read tutorial, the XML Schema Primer.
Here is an XSD file which defines a small domain model:
    <?xml version="1.0" encoding="UTF-8" ?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <xs:element name="purchaseOrder">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="items" type="Items" />
            <xs:element name="customer" type="Customer" />
            <xs:element ref="comment" minOccurs="0" />
          </xs:sequence>
          <xs:attribute name="orderDate" type="xs:date" use="required" />
        </xs:complexType>
      </xs:element>

      <xs:complexType name="Items">
        <xs:sequence>
          <xs:element name="item" minOccurs="0" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="productName" type="xs:string" />
                <xs:element name="quantity">
                  <xs:simpleType>
                    <xs:restriction base="xs:positiveInteger">
                      <xs:maxExclusive value="100" />
                    </xs:restriction>
                  </xs:simpleType>
                </xs:element>
                <xs:element name="price" type="xs:decimal" />
                <xs:element ref="comment" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>

      <xs:complexType name="Customer">
        <xs:sequence>
          <xs:element name="name" type="xs:string" />
          <xs:element name="shippingAddress" type="Address" />
          <xs:element name="billingAddress" type="Address" />
          <xs:element name="loyalty" type="Loyalty" />
        </xs:sequence>
      </xs:complexType>

      <xs:complexType name="Address">
        <xs:sequence>
          <xs:element name="street" type="xs:string" />
          <xs:element name="city" type="xs:string" />
          <xs:element name="postalCode" type="xs:string" />
          <xs:element name="country" type="xs:string" />
        </xs:sequence>
      </xs:complexType>

      <!-- A simple type: a string restricted to three allowed values -->
      <xs:simpleType name="Loyalty">
        <xs:restriction base="xs:string">
          <xs:enumeration value="BRONZE" />
          <xs:enumeration value="SILVER" />
          <xs:enumeration value="GOLD" />
        </xs:restriction>
      </xs:simpleType>

      <xs:element name="comment">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:maxLength value="1000" />
          </xs:restriction>
        </xs:simpleType>
      </xs:element>

    </xs:schema>
The tags that can be used in XML schema are defined in the standard namespace that is identified by the URI http://www.w3.org/2001/XMLSchema. We are using the namespace prefix xs for the XML schema tags.
Note that we can, in principle, choose any prefix we like within our document, but xs is what is conventionally used for XSD files. The root element is an xs:schema element. We can define a handful of different things at the root level. The most important of these are elements and types.
Let’s take a look at the definition of the purchaseOrder element. This element has a complex type. There are two kinds of types in XML schema. There are simple types, which are for the text content of elements and attributes. We will take a look at those in a moment. Complex types are primarily for elements that can contain other nested elements. What we see here is that a purchaseOrder element must contain a sequence of three other elements: an items, a customer, and a comment element.
Note that because it’s a sequence, the elements must appear in the purchaseOrder element exactly in this order. If the XML contained the customer element before the items element, for example, then that would be an error and the XML would not be valid. The type of the purchaseOrder element is defined directly in the definition of the element itself. Another thing we can do is define the type separately, and then in the element definition, point to the definition of the type. That’s what is done here for the items and customer elements.
The advantage of this is that it makes it possible to reuse types. So if we have multiple elements that have the same type, then we do not have to copy and paste the type definition, and it also makes the schema more readable. We can also define the element itself somewhere else, which is what we’ve done with the comment element. Instead of a name attribute, it has a ref attribute and no type. The type is specified at the actual definition of the element somewhere else in the file. To indicate how often an element may appear in the XML, we can use the minOccurs and maxOccurs attributes. Here the comment element has minOccurs set to 0, which means that it’s optional. The default for both minOccurs and maxOccurs is 1, so if we omit these attributes, then the element must appear exactly once. Finally, in the definition of the purchaseOrder element, it’s specified that the purchaseOrder element must have an orderDate attribute. The type of this attribute is xs:date, which is one of the built-in simple types of XML schema. An xs:date is a date in ISO-8601 format, which means that it’s a year, month, and day separated by dashes.
The type of the items element in a purchaseOrder is Items. This type is a complex type, which is defined at the root level of the XSD. It contains a sequence of zero or more item elements. Note that maxOccurs is set to unbounded, which is a special value to indicate that there’s no limit to the number of times this element may appear in the XML.
The type of the item element is defined in-line here. It’s again a complex type that consists of a sequence of a few other elements. The productName element is just a string. Note that xs:string is another one of the built-in simple types. The quantity element has a slightly more elaborate simple type. It’s based on xs:positiveInteger, which is, again, one of the built-in simple types, and it has a restriction added to it: the value must be less than 100. Then there is the price element, which is of type xs:decimal, the built-in simple type for decimal numbers. Finally, we allow an item to have a comment, which is represented by the comment element that we used before in the purchaseOrder element.
The Customer complex type has a name, a shippingAddress, a billingAddress, and a loyalty element. The Loyalty simple type looks like a Java enum. It’s a simple type based on xs:string that has three possible enumeration values.
Finally, there is the definition of the comment element, which is a string with a maximum length of 1000 characters. That’s our simple XML schema.
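The effect of a sequence and of a restriction can be checked with the javax.xml.validation API that ships with the JDK. The sketch below uses a deliberately trimmed-down schema, not the full one from this article: a single item element with a productName followed by a quantity that must be a positive integer below 100. The class name SchemaValidationExample and the helper isValid are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaValidationExample {

    // Trimmed-down schema for this sketch: productName first, then a restricted quantity.
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='item'><xs:complexType><xs:sequence>"
            + "<xs:element name='productName' type='xs:string'/>"
            + "<xs:element name='quantity'><xs:simpleType>"
            + "<xs:restriction base='xs:positiveInteger'><xs:maxExclusive value='100'/></xs:restriction>"
            + "</xs:simpleType></xs:element>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    // Returns true if the document is valid according to XSD, false otherwise.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                // With no custom error handler set, a validation error surfaces as a SAXException.
                return false;
            }
        } catch (SAXException | IOException e) {
            // A problem with the schema itself or with reading the input.
            throw new IllegalStateException("Could not set up validation", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<item><productName>Pen</productName><quantity>20</quantity></item>"));
        System.out.println(isValid("<item><productName>Pen</productName><quantity>100</quantity></item>"));
        System.out.println(isValid("<item><quantity>20</quantity><productName>Pen</productName></item>"));
    }
}
```

The second document fails because 100 is not less than the exclusive maximum, and the third fails because the elements are not in the order that the sequence prescribes.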
Understanding the JAXB API
The JAXB API is in the package javax.xml.bind and related packages in Java SE. The entry point into the API is the class JAXBContext. The first thing we need to do if we want to use the JAXB API is to get an instance of class JAXBContext. The JAXBContext object will give us access to everything else in the API. We get an instance of JAXBContext by calling one of the static newInstance factory methods in the class itself.
When we have a JAXBContext object, we can call a number of other factory methods on it to create other JAXB objects. The two most important ones are the createMarshaller and createUnmarshaller methods.
In JAXB terminology, converting from Java objects to XML is called marshalling. So when we want to write XML, what we need to do is create a Marshaller object, which has methods that we can call to marshal our Java objects into XML.
Vice versa, converting XML back to Java objects is called unmarshalling. When we want to read XML, we create an Unmarshaller object, which, of course, has methods for unmarshalling XML into Java objects.
Besides factory methods to create Marshaller and Unmarshaller objects, class JAXBContext has a few more methods to create other JAXB objects.
The reason that all these objects are created via factory methods is that the JAXB API was designed to have multiple possible implementations. Besides the default implementation, which is included with Java SE, there are indeed other implementations available, for example, EclipseLink MOXy. A reason to use a different implementation of JAXB rather than the default is that it might offer extra features that are not part of the default, or that it might have better performance.
There is one important thing to mention about JAXBContext, Marshaller, and Unmarshaller objects. We should normally create a JAXBContext object only once in our application and then reuse the same object whenever we need it. The JAXBContext object is guaranteed to be thread-safe, so it’s safe to reuse the same instance for multiple threads. Creating a JAXBContext object is a relatively heavy operation, so doing that every time our application needs it would degrade the performance of our application.
Marshaller and Unmarshaller objects are not guaranteed to be thread-safe, so we should not share these objects between multiple threads. Creating Marshaller and Unmarshaller objects is not a heavy operation, so creating them when needed does not cause a performance problem.
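The usage pattern described above might look like the following sketch: one shared JAXBContext, and a fresh Marshaller or Unmarshaller per call. It assumes that the JAXB API (package javax.xml.bind) and an implementation are available on the classpath, which on newer JDKs means adding them as a library. The Customer class and the names JaxbApiExample, toXml, and fromXml are made up for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;

class JaxbApiExample {

    // Hypothetical domain class; public fields are bound to XML elements by default.
    @XmlRootElement
    static class Customer {
        public String name;
        public String city;
    }

    // Created once and shared: thread-safe, but relatively expensive to create.
    private static final JAXBContext CONTEXT = createContext();

    private static JAXBContext createContext() {
        try {
            return JAXBContext.newInstance(Customer.class);
        } catch (JAXBException e) {
            throw new IllegalStateException("Could not create JAXBContext", e);
        }
    }

    // Marshalling: Java object to XML.
    static String toXml(Customer customer) {
        try {
            Marshaller marshaller = CONTEXT.createMarshaller(); // cheap, not shared between threads
            StringWriter out = new StringWriter();
            marshaller.marshal(customer, out);
            return out.toString();
        } catch (JAXBException e) {
            throw new IllegalStateException("Could not marshal", e);
        }
    }

    // Unmarshalling: XML back to a Java object.
    static Customer fromXml(String xml) {
        try {
            Unmarshaller unmarshaller = CONTEXT.createUnmarshaller(); // cheap, not shared between threads
            return (Customer) unmarshaller.unmarshal(new StringReader(xml));
        } catch (JAXBException e) {
            throw new IllegalStateException("Could not unmarshal", e);
        }
    }

    public static void main(String[] args) {
        Customer customer = new Customer();
        customer.name = "John Doe";
        customer.city = "Exampleville";
        String xml = toXml(customer);
        System.out.println(xml);
        System.out.println(fromXml(xml).name);
    }
}
```

Keeping the context in a static final field is only one way to share it; in a larger application it could equally be a singleton managed by a dependency injection container.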
Java code example
The sample code in this section will be put in this link.
@XmlRootElement: This annotation is used at the top level class to indicate the root element in the XML document. The name attribute in the annotation is optional. If not specified, the class name is used as the root XML element in the document.
@XmlAttribute: This annotation is used to indicate the attribute of the root element.
@XmlElement: This annotation is used on the properties of the class that will be the sub-elements of the root element.
@XmlElementWrapper:
- This annotation generates a wrapper element around the XML representation.
- This is primarily intended to be used to produce a wrapper XML element around collections.
- This annotation can be used with the following annotations:
The @XmlElementWrapper annotation can be used with the following program elements:
- JavaBean property
- non static, non transient field
@XmlType: Defines the order in which the fields are written in the XML file.
@XmlTransient: Marks fields that we don’t want to be included in the XML.
@XmlElementRef: Maps a JavaBean property to an XML element derived from the property’s type.
Refer to some examples in this link.
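To make the annotations above more concrete, here is a hedged sketch of a small annotated class and the XML it produces. It is not the sample code that the article links to. It assumes the JAXB API (javax.xml.bind) and an implementation on the classpath, and it additionally uses @XmlAccessorType(XmlAccessType.FIELD), which is not discussed in this article, so that the annotations can be placed on fields. All class, field, and method names are hypothetical.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlElementWrapper;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlTransient;
import javax.xml.bind.annotation.XmlType;

class JaxbAnnotationsExample {

    @XmlRootElement(name = "purchaseOrder")           // root element name
    @XmlType(propOrder = {"customerName", "items"})   // order of the child elements
    @XmlAccessorType(XmlAccessType.FIELD)             // assumption: bind fields directly
    static class PurchaseOrder {
        @XmlAttribute
        String orderDate;                             // written as an attribute of the root element

        @XmlElement
        String customerName;                          // written as a child element

        @XmlElementWrapper(name = "items")            // wrapper element around the collection
        @XmlElement(name = "item")                    // one element per list entry
        List<String> items = new ArrayList<>();

        @XmlTransient
        String internalNote;                          // never written to the XML
    }

    // Shared context, created once (see the thread-safety notes earlier in the article).
    private static final JAXBContext CONTEXT = createContext();

    private static JAXBContext createContext() {
        try {
            return JAXBContext.newInstance(PurchaseOrder.class);
        } catch (JAXBException e) {
            throw new IllegalStateException("Could not create JAXBContext", e);
        }
    }

    static String toXml(PurchaseOrder order) {
        try {
            StringWriter out = new StringWriter();
            CONTEXT.createMarshaller().marshal(order, out);
            return out.toString();
        } catch (JAXBException e) {
            throw new IllegalStateException("Could not marshal", e);
        }
    }

    public static void main(String[] args) {
        PurchaseOrder order = new PurchaseOrder();
        order.orderDate = "2017-09-10";
        order.customerName = "John Doe";
        order.items.add("Ballpoint Pen");
        order.items.add("Notebook");
        order.internalNote = "not part of the XML";
        System.out.println(toXml(order));
    }
}
```

Without @XmlElementWrapper, the item elements would appear directly under purchaseOrder instead of being grouped inside an items element.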
Benefits and drawbacks
- JAXB is fairly useful for many applications that need to work with XML. It’s especially useful if we have a more elaborate domain model because we do not need to write a lot of boilerplate code to convert our domain model objects from and to XML.
- Having an XSD that describes our domain model is also a good thing, especially if we use XML to exchange data with systems built by other people who need to know what our domain model looks like. We can then just give them our XSD.
- For writing XML, the low-level APIs give us more precise control over what the XML looks like since they are closer to the XML itself. For example, normally, it should not matter to our application if text is in a CDATA section or represented in a different way in the XML since semantically the meaning of the XML is the same. But if for some reason it matters, the low-level APIs will let us make the distinction, while JAXB tends to hide such details.
- When we need to deal with very large XML documents, the SAX or StAX APIs might be more suitable than DOM or JAXB since SAX and StAX do not require loading the complete document into memory.
JAXB 1.0 was developed under the Java Community Process. JAXB 2.0 was released under JSR 222 and became part of the JDK in Java 6 to add support for the Web Services stack (under package javax.xml.bind). It was still part of the standard JDK in Java 9, where the modules which contain Java EE technologies were deprecated for removal in a future release. The flag --add-modules=java.xml.bind can be used in Java 10 to resolve these modules. JAXB has since been removed from the JDK (together with other Java EE related modules, based on JEP 320), and we need to add it to the project as a separate library.
To get the schema-to-Java mapping in JAXB, refer to the link.