Wednesday, April 8, 2009

Efficient XML is in fact efficient!


In the beginning of 2008 I wrote on "W3C drafts on Efficient XML Interchange, EXI - I didn't say binary XML". The specification is now a last call working draft dated 19 September 2008. Both an implementation and a test suite have been developed to test whether it is in fact efficient, and if you browse through their "Efficient XML Interchange Evaluation" you'll discover that it is. They have also done an "Efficient XML Interchange (EXI) Impacts" analysis, and its conclusion is also positive:

EXI has been designed to be compatible with XML and can be introduced into the existing family of XML technologies without immediate disruption to XML-using applications. However, with certain modifications to existing XML-related specifications in the future it may be possible to achieve additional benefits when using EXI, still without disruption to existing XML-based applications. Furthermore, in a multi-application system where only some applications adopt EXI, sending EXI data to the other applications can potentially cause disruption, so care is needed to account for differing format support among the participating applications.

In terms of efficiency they have run various tests. I haven't looked into the test cases or the technologies used for comparison, but I hope the work is as serious as the document looks. They have benchmarked it against existing technologies like gzipped XML and ASN.1 for compactness:

...EXI is consistently smaller than gzipped XML regardless of document size, document structure or the availability of schema information. In some cases, EXI is over 10 times smaller than gzip. In addition, EXI works well in cases where gzip has little effect or even makes documents bigger, such as high volume streams of small messages typical of geolocation, financial exchange and sensor applications.

and

... Each EXI encoded file is smaller than the equivalent ASN.1 PER, and sometimes 20 times smaller. This holds true even for cases where EXI is preserving XML comments, processing instructions and namespace prefixes that are not preserved by ASN.1 PER. In addition, EXI works well in cases where ASN.1 PER actually increases the size of the document or fails to produce an encoding at all (e.g., due to schema deviations.)
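
The remark about gzip making small messages bigger is easy to check for yourself. Below is a minimal sketch using only Python's standard gzip module. The sample message and the helper name gzip_size are my own inventions for illustration, not taken from the EXI test suite, and it only measures gzip, not EXI itself.

```python
import gzip

# Hypothetical small sensor-style message (an assumption, not from the EXI tests).
small_message = b'<pos lat="55.67" lon="12.56"/>'

# A larger document built from many repetitions of the same structure.
large_document = b"<track>" + small_message * 1000 + b"</track>"


def gzip_size(data: bytes) -> int:
    """Return the size in bytes of data after gzip compression."""
    return len(gzip.compress(data))


for name, data in [("small", small_message), ("large", large_document)]:
    print(f"{name}: {len(data)} bytes raw, {gzip_size(data)} bytes gzipped")
```

The small message grows because the gzip header and trailer cost more than the compression saves, while the repetitive large document shrinks a lot. That is the situation the quote describes for streams of small messages.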

Size is one thing, but it could come at a processing cost, so they've tested that as well:

The average decoding speed of EXI was 14.5 times faster than the average decoding speed of XML. The median speed increase was 6.7 times faster. To improve readibility, the graph does not show the four best cases, which ranged from 54 times faster to 257 times faster. These four test cases were SOAP web-service messages that were marshalled from a binding layer and contained repeating structures with elements and attributes from several different namespaces. As is typical for such use cases, the repeated structures contained a large number of repeated namespace declarations. EXI eliminates most of the overhead associated with namespace processing, which is why EXI achieved such a speed increase for these cases.

I guess these namespace declarations could be optimized by just placing them at the root level, but I'm not sure whether this would speed up the processing (or whether they hurt that badly in the first place).
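
To make the idea concrete, here is a small sketch of what I mean by moving the declarations to the root. The namespace URI, the element names and the helper qualified_names are all made up for illustration. It only shows that both forms parse to the same qualified names and that the hoisted form is smaller on the wire. It says nothing about parsing speed, which is the part I'm unsure about.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/orders"  # hypothetical namespace URI

# Namespace declared again on every repeated element,
# the way a binding layer might emit it.
repeated = (
    "<o:orders xmlns:o='%s'>" % NS
    + "".join("<o:order xmlns:o='%s' id='%d'/>" % (NS, i) for i in range(100))
    + "</o:orders>"
)

# Same content with the declaration placed on the root element only.
hoisted = (
    "<o:orders xmlns:o='%s'>" % NS
    + "".join("<o:order id='%d'/>" % i for i in range(100))
    + "</o:orders>"
)


def qualified_names(xml_text):
    """Return (expanded tag, attributes) for every element, in document order."""
    return [(el.tag, dict(el.attrib)) for el in ET.fromstring(xml_text).iter()]


same_names = qualified_names(repeated) == qualified_names(hoisted)
print("same qualified names:", same_names)
print("bytes with repeated declarations:", len(repeated))
print("bytes with root-level declaration:", len(hoisted))
```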

The graph above shows EXI decoding speed with compression compared to XML with compression. The average decoding speed of EXI was 9.2 times faster than the average decoding speed of GZipped XML. The median speed was 4.4 times faster. ...

Summary

It sure looks like EXI is nearing completion, with a final working draft and a test suite that ran successfully. It will be interesting to see whether it actually picks up speed, and when. I haven't found any open source implementations yet. EXI may lose out, not because it isn't good enough but because of competition from the Advanced Message Queuing Protocol, which goes for the same kind of nails and maybe fits a larger user base of classic messaging.

2 comments:

Anonymous said...

There is now one open source implementation of EXI: http://exificient.sourceforge.net/

Sweetxml said...

Hi

Thanks for the link.

Brgds Brian