|
|
(6 intermediate revisions by the same user not shown) |
Line 1: |
Line 1: |
− | =VTD-XML Investigation=
| |
| | | |
− | VTD-XML ([http://vtd-xml.sourceforge.net/ http://vtd-xml.sourceforge.net/]) is a high-performance XML processing model that deals with XML in a binary form, instead of the traditional text form. VTD stands for '''V'''irtual '''T'''oken '''D'''escriptor.
| |
− |
| |
− | VTD-XML parses an XML document and builds an internal data structure representing the entire XML document in <tt>byte[]</tt> form. Each "token" of the XML document is represented as the following 64-bit integer:
| |
− |
| |
− | [[Image:Vtd_layout.jpg]]
| |
− | * Big endian
| |
− | * Starting offset: 30 bits (b29 ~ b0); maximum value is 2^30 - 1 = 1G - 1
| |
− | * Length: 20 bits (b51 ~ b32); maximum value is 2^20 - 1 = 1M - 1
| |
− | ** For some token types, the length field is split into two parts:
| |
− | *** Prefix length: 9 bits (b51 ~ b43); maximum value is 2^9 - 1 = 511
| |
− | *** Q-name length: 11 bits (b42 ~ b32); maximum value is 2^11 - 1 = 2047
| |
− | * Depth: 8 bits (b59 ~ b52); maximum value is 2^8 - 1 = 255
| |
− | * Token type: 4 bits (b63 ~ b60)
| |
− | * Reserved bits: 2 bits (b31 ~ b30)
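
The bit layout above can be illustrated with plain shift-and-mask arithmetic on a Java <tt>long</tt>. The following is a minimal sketch, not code from the VTD-XML library: the class name <tt>VtdRecordLayout</tt> and its method names are hypothetical, and the token type value used in <tt>main</tt> is an arbitrary number chosen for illustration.

```java
// Hypothetical helper illustrating the 64-bit record layout described above.
// It is not part of the VTD-XML API.
final class VtdRecordLayout {

    private VtdRecordLayout() {}

    // Packs the four fields into one 64-bit record; reserved bits b31~b30 stay 0.
    static long pack(int tokenType, int depth, int length, int offset) {
        return ((long) (tokenType & 0xF) << 60)      // b63 ~ b60
             | ((long) (depth & 0xFF) << 52)         // b59 ~ b52
             | ((long) (length & 0xFFFFF) << 32)     // b51 ~ b32
             | (offset & 0x3FFFFFFFL);               // b29 ~ b0
    }

    static int tokenType(long record) {
        return (int) ((record >>> 60) & 0xF);
    }

    static int depth(long record) {
        return (int) ((record >>> 52) & 0xFF);
    }

    static int length(long record) {
        return (int) ((record >>> 32) & 0xFFFFF);
    }

    // Upper 9 bits of the length field (b51 ~ b43).
    static int prefixLength(long record) {
        return (int) ((record >>> 43) & 0x1FF);
    }

    // Lower 11 bits of the length field (b42 ~ b32).
    static int qnameLength(long record) {
        return (int) ((record >>> 32) & 0x7FF);
    }

    static int offset(long record) {
        return (int) (record & 0x3FFFFFFFL);
    }

    public static void main(String[] args) {
        // Arbitrary example values: token type 2, depth 3, length 5, offset 1234.
        long record = pack(2, 3, 5, 1234);
        System.out.println("type=" + tokenType(record)
                + " depth=" + depth(record)
                + " length=" + length(record)
                + " offset=" + offset(record));
    }
}
```

Because every field is extracted with an unsigned shift (<tt>&gt;&gt;&gt;</tt>) followed by a mask, the result does not depend on the sign bit of the <tt>long</tt>, which matters here since the token type occupies the topmost four bits.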
| |
− |
| |
− |
| |
− | ==VTD-XML Core Concepts==
| |
− |
| |
− |
| |
− | ===Generating a VTD-XML Representation of the XML Document (Unmarshal)===
| |
− |
| |
− | <div style="width:900px">
| |
− | <source lang="java">
| |
− | VTDGen vg = new VTDGen();
| |
− |
| |
− | // from an existing byte[] (xmlBytes is a placeholder for the array holding the XML document)
| |
− | vg.setDoc(xmlBytes);
| |
− | // true indicates namespace-aware parsing
| |
− | vg.parse(true);
| |
− |
| |
− | // - or -
| |
− |
| |
− | // from a file; false disables namespace awareness
| |
− | vg.parseFile("old.xml", false);
| |
− | </source>
| |
− | </div>
| |
− |
| |
− |
| |
− | ===Writing a VTD-XML Document (Marshal)===
| |
− |
| |
− | <div style="width:900px">
| |
− | <source lang="java">
| |
− | VTDGen vg = new VTDGen();
| |
− | vg.parseFile("old.xml", false);
| |
− | VTDNav vn = vg.getNav();
| |
− |
| |
− | XMLModifier xm = new XMLModifier();
| |
− | xm.bind(vn);
| |
− |
| |
− | // ...
| |
− |
| |
− | // Write the (possibly modified) document to an OutputStream
| |
− | xm.output(new FileOutputStream("new.xml"));
| |
− | </source>
| |
− | </div>
| |