Table of Contents

Mycila XML Tool
- Maven Repository
- Documentation

Mycila XML Tool

XMLTool is a very simple Java library for performing all sorts of common operations on an XML document. As a Java developer, I often end up writing the same code over and over for processing XML, transforming it, and so on. So I decided to put it all in a very easy-to-use class that uses the Fluent Interface pattern to facilitate XML manipulation.

XMLTag tag = XMLDoc.newDocument(false)
    .addDefaultNamespace("http://www.w3.org/2002/06/xhtml2/")
    .addNamespace("wicket", "http://wicket.sourceforge.net/wicket-1.0")
    .addRoot("html")
    .addTag("wicket:border")
    .gotoRoot().addTag("head")
    .addNamespace("other", "http://other-ns.com")
    .gotoRoot().addTag("other:foo");
System.out.println(tag.toString());

Features

With XML Tool you will be able to quickly:

Create new XML documents from external sources or from scratch
Manage namespaces
Manipulate nodes (add, remove, rename)
Manipulate data (add, remove text or CDATA)
Navigate the document with shortcuts and XPath (note: XPath supports namespaces)
Transform an XMLDoc instance to a String or a Document
Validate your document against schemas
Execute callbacks on a hierarchy
Remove all namespaces (namespace ignoring)
... and a lot of other features!

Project status

Issues: https://github.com/mathieucarbou/xmltool/issues
OSGI Compliant

Maven Repository

Releases

Available in Maven Central Repository: http://repo1.maven.org/maven2/com/mycila/mycila-xmltool/

Snapshots

Available in OSS Repository: https://oss.sonatype.org/content/repositories/snapshots/com/mycila/mycila-xmltool/

Maven dependency

<dependency>
    <groupId>com.mycila</groupId>
    <artifactId>mycila-xmltool</artifactId>
    <version>X.Y.ga</version>
</dependency>

Maven sites

4.0.ga: http://oss.carbou.me/xmltool/reports/4.0.ga/index.html

Documentation

Performance consideration

XML Tool uses the Java DOM API, and document creation has a cost. To improve performance, XML Tool therefore uses two object pools of DocumentBuilder instances:

one pool for namespace-aware document builders
another one ignoring namespaces

You can configure the pools by using XMLDocumentBuilderFactory.setPoolConfig(config)

By default, each of the two pools has the following configuration:

min idle = 0
max idle = CPU core number
max total = CPU core number * 4
max wait time = -1
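As a rough illustration of those defaults, the sketch below derives the same numbers from a CPU core count. PoolDefaults and its fields are hypothetical names used only for this example; they are not part of XML Tool, and the configuration object actually accepted by setPoolConfig is not shown here.

```java
// Hypothetical holder for the default pool numbers listed above.
// This is NOT an XML Tool class; it only reproduces the arithmetic.
final class PoolDefaults {
    final int minIdle;
    final int maxIdle;
    final int maxTotal;
    final long maxWait;

    PoolDefaults(int cpuCores) {
        this.minIdle = 0;             // no minimum number of idle builders is maintained
        this.maxIdle = cpuCores;      // at most one idle builder per core
        this.maxTotal = cpuCores * 4; // upper bound on builders in the pool
        this.maxWait = -1;            // assumption: a negative value means "wait until one is free"
    }

    // Derives the defaults from the number of cores visible to the JVM.
    static PoolDefaults forThisMachine() {
        return new PoolDefaults(Runtime.getRuntime().availableProcessors());
    }
}
```

On an 8-core machine this gives max idle = 8 and max total = 32 for each of the two pools.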

If your application is heavily threaded and many threads use XMLTag concurrently, you might want to increase max total to match your peak thread count and max idle to match your average thread count, to avoid thread contention.

If your application does not use many threads and often creates documents, you could probably lower those numbers.

The goal is to have enough DocumentBuilder instances available in the pool to "feed" your application on demand, without waiting for these objects to become available.

Using an object pool is certainly more complicated, but it prevents threading issues and maximizes performance through object reuse.
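The borrow/return idea behind such a pool can be sketched with JDK classes only. This is a deliberately small illustration of the general technique, not XML Tool's implementation; SimpleBuilderPool and its methods are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical fixed-size pool of DocumentBuilder instances.
final class SimpleBuilderPool {
    private final BlockingQueue<DocumentBuilder> idle;

    SimpleBuilderPool(int size, boolean namespaceAware) throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.newDocumentBuilder());
        }
    }

    // Borrows a builder (blocking until one is free), creates an empty
    // document with it, then hands the builder back to the pool.
    Document newDocument() throws InterruptedException {
        DocumentBuilder builder = idle.take();
        try {
            return builder.newDocument();
        } finally {
            builder.reset();   // restore the builder's original configuration
            idle.add(builder); // always succeeds: we just took one slot
        }
    }

    // Number of builders currently available for borrowing.
    int idleCount() {
        return idle.size();
    }
}
```

Because each builder is used by only one thread at a time, callers never share a DocumentBuilder, which is not guaranteed to be thread-safe. A pool that is too small makes threads wait in take(), which is the contention described above.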

Creating XML documents

Creating a new XML document

The newDocument method creates a new XML document. You can then optionally choose a default namespace, and then choose the name of the document's root element.

System.out.println(XMLDoc.newDocument(true).addRoot("html").toString());

gives:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html/>
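For comparison, producing roughly the same result with the plain JDK DOM and transformer APIs takes several steps. The sketch below uses only JDK classes; PlainDomCreate is a hypothetical name, and the exact serialized text (declaration, line breaks) depends on the JDK's transformer, so it is not reproduced here.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper showing the plain-JDK steps behind "new document + root".
final class PlainDomCreate {

    // Creates an empty document and appends an <html> root element.
    static Document newHtmlDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        doc.appendChild(doc.createElement("html"));
        return doc;
    }

    // Serializes the document to a String with an identity transformation.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Request XML output explicitly rather than relying on the default.
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Calling PlainDomCreate.serialize(PlainDomCreate.newHtmlDocument()) yields the serialized document as a String.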

Loading an existing XML document

The from methods can load an XML document from any of the following types:

org.w3c.dom.Node
InputSource
Reader
InputStream
File
URL
String
javax.xml.transform.Source

Example:

URL yahooGeoCode = new URL("http://local.yahooapis.com/MapsService/V1/geocode?appid=YD-9G7bey8_JXxQP6rxl.fBFGgCdNjoDMACQA--&state=QC&country=CA&zip=H1W3B8");
System.out.println(XMLDoc.from(yahooGeoCode, true).toString());
System.out.println(XMLDoc.from(yahooGeoCode, true).getText("Result/City"));

outputs:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ResultSet xmlns="urn:yahoo:maps" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
<Result precision="zip">
    <Latitude>45.543289</Latitude>
    <Longitude>-73.543098</Longitude>
    <Address/>
    <City>Montreal</City>
    <State>QC</State>
    <Zip>H1W 3B8</Zip>
    <Country>CA</Country>
</Result>
</ResultSet>
<!-- ws04.search.re2.yahoo.com uncompressed Tue Dec  9 13:39:12 PST 2008 -->

Montreal
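To try loading without network access, the same kind of document can be parsed from a String or a stream. The sketch below shows how several source types can funnel into one parse path using only JDK classes; PlainDomLoad and its from overloads are hypothetical stand-ins, not XML Tool's API.

```java
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: several input types, one DOM parse path.
final class PlainDomLoad {

    private static DocumentBuilder builder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // A String is wrapped in a Reader and delegated.
    static Document from(String xml) throws Exception {
        return from(new StringReader(xml));
    }

    // A Reader is wrapped in a SAX InputSource.
    static Document from(Reader reader) throws Exception {
        return builder().parse(new InputSource(reader));
    }

    // An InputStream can be parsed directly.
    static Document from(InputStream in) throws Exception {
        return builder().parse(in);
    }
}
```

For instance, PlainDomLoad.from("<ResultSet><Result><City>Montreal</City></Result></ResultSet>") returns a Document whose root element is ResultSet.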

Ignoring namespaces

All creational methods (XMLDoc.newDocument and XMLDoc.from) require a boolean parameter, ignoreNamespaces. If it is set to true, all namespaces in the document are ignored. This is really useful if you use XPath a lot, since you can avoid prefixing all your XPath elements.

Example:

System.out.println(XMLDoc.newDocument(true)
    .addDefaultNamespace("http://www.w3.org/2002/06/xhtml2/")
    .addRoot("html"));
System.out.println(XMLDoc.newDocument(false)
    .addDefaultNamespace("http://www.w3.org/2002/06/xhtml2/")
    .addRoot("html"));

outputs:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html/>

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html xmlns="http://www.w3.org/2002/06/xhtml2/"/>

Navigating in a document with namespaces using XPath is quite a pain:

doc.gotoTag("ns0:body").addTag("child")
   .gotoParent().addCDATA("with special characters")
   .gotoTag("ns0:body").addCDATA("<\"!@#$%'^&*()>");

whereas if you load the same document with ignoreNamespaces, you can simply navigate like this when you use XPath:

doc.gotoTag("body").addTag("child")
   .gotoParent().addCDATA("with special characters")
   .gotoTag("body").addCDATA("<\"!@#$%'^&*()>");
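The difference is easy to reproduce with the JDK's own DOM and XPath APIs. The sketch below parses one small document both ways and evaluates XPath expressions against it. NamespaceXPathDemo is a hypothetical name and the sample XML is a shortened stand-in for the geocoding response shown earlier. It illustrates the general behaviour and does not claim to show how XML Tool implements ignoreNamespaces.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceXPathDemo {
    static final String NS = "urn:yahoo:maps";
    static final String XML =
        "<ResultSet xmlns=\"" + NS + "\"><Result><City>Montreal</City></Result></ResultSet>";

    // Parses the sample with or without namespace awareness.
    static Document parse(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
    }

    // Builds an XPath whose namespace context knows exactly one prefix.
    static XPath xpathWithPrefix(final String prefix, final String uri) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String p) {
                return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String u) {
                return uri.equals(u) ? prefix : null;
            }
            @Override public Iterator<String> getPrefixes(String u) {
                return uri.equals(u)
                    ? Collections.singletonList(prefix).iterator()
                    : Collections.<String>emptyIterator();
            }
        });
        return xpath;
    }

    // Namespaces ignored: plain element names are enough.
    static String cityIgnoringNamespaces() throws Exception {
        return XPathFactory.newInstance().newXPath()
            .evaluate("/ResultSet/Result/City", parse(false));
    }

    // Namespace-aware document, unprefixed path: nothing matches.
    static String cityUnprefixedOnAwareDocument() throws Exception {
        return XPathFactory.newInstance().newXPath()
            .evaluate("/ResultSet/Result/City", parse(true));
    }

    // Namespace-aware document: every step needs a bound prefix.
    static String cityWithPrefix() throws Exception {
        return xpathWithPrefix("ns0", NS)
            .evaluate("/ns0:ResultSet/ns0:Result/ns0:City", parse(true));
    }
}
```

With namespaces ignored, the unprefixed path finds the city. On the namespace-aware document the same path matches nothing, and every step has to carry a prefix bound to the default namespace.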

Using namespaces

When you create or load a document and decide not to ignore namespaces, you can add a default namespace for your document and add other ones afterwards. Namespace management is quite a challenge, especially when using XPath. An XMLTag instance gives you access to the following methods to manage namespaces in the document:

Adding and retrieving namespaces and prefixes

addDefaultNamespace

When you create an empty document, you can define a default namespace to use for the document. For example:

XMLTag doc = XMLDoc.newDocument(false)
    .addDefaultNamespace("http://www.w3.org/2002/06/xhtml2/")
    .addRoot("html");

will produce:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html xmlns="http://www.w3.org/2002/06/xhtml2/"/>

addNamespace

Once you have obtained an XMLTag instance, you can add any namespace you want. For example:

XMLTag doc = XMLDoc.newDocument(false)
    .addDefaultNamespace("http://www.w3.org/2002/06/xhtml2/")
    .addNamespace("wicket", "http://wicket.sourceforge.net/wicket-1.0")
    .addRoot("html")
    .addTag("wicket:border")
    .gotoRoot().addTag("head")
    .addNamespace("other", "http://other-ns.com")
    .gotoRoot().addTag("other:foo");

will produce:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html xmlns="http://www.w3.org/2002/06/xhtml2/">
    <wicket:border xmlns:wicket="http://wicket.sourceforge.net/wicket-1.0"/>
    <head/>
    <other:foo xmlns:other="http://other-ns.com"/>
</html>
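For readers who know the raw DOM API, a similar tree can be built by hand as follows. This sketch uses only JDK classes and is meant to show what the fluent calls save you from writing. PlainDomNamespaces is a hypothetical name, and the code is not taken from XML Tool.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class PlainDomNamespaces {
    static final String XHTML2 = "http://www.w3.org/2002/06/xhtml2/";
    static final String WICKET = "http://wicket.sourceforge.net/wicket-1.0";

    static Document build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Root element in the default (unprefixed) namespace.
        Element html = doc.createElementNS(XHTML2, "html");
        doc.appendChild(html);

        // Prefixed element: the qualified name carries the prefix.
        Element border = doc.createElementNS(WICKET, "wicket:border");
        // Declare the prefix explicitly on the element itself.
        border.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:wicket", WICKET);
        html.appendChild(border);

        // Sibling element back in the default namespace.
        html.appendChild(doc.createElementNS(XHTML2, "head"));
        return doc;
    }
}
```

Every element needs its namespace URI passed explicitly, which is the bookkeeping that addDefaultNamespace and addNamespace take over.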

Namespace prefix generation

When you load an existing XML document, or when you define a default namespace in a new document, prefixes and namespaces are automatically discovered throughout the document. XML documents often have a default namespace; this is the case, for example, in XHTML documents like the one below. In this case, XMLDoc generates a prefix for you that you can use for XPath navigation, and registers the namespace as the default one.

For example, the following document has a default namespace, and a prefix is generated to access it: ns0.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
    <head>
        <title/>
    </head>
    <body/>
</html>

XMLTag doc = XMLDoc.from(...);
assertEquals(doc.getPrefix("http://www.w3.org/1999/xhtml"), "ns0");
assertEquals(doc.getContext().getNamespaceURI("ns0"), "http://www.w3.org/1999/xhtm
