Introduction to Java mapping
In this wiki I would like to explain the various aspects of DOM and SAX parsing, and their performance issues, in a simple way that helps us go into Java mapping in more detail.
Mapping is the transformation of one XML structure into another XML structure based on business rules. As part of it we perform operations on the XML structure such as detaching child nodes and re-attaching them to a parent node.
In XI/PI we have the following mapping techniques:
1. Graphical Mapping
2. ABAP Mapping
3. Java Mapping
4. XSLT Mapping
Among these mapping techniques, Java mapping generally gives the best performance and is often preferred, because the mapping program is executed directly on the J2EE engine. For a graphical mapping, XI/PI first converts the graphical design into executable Java code internally and then executes it on the J2EE engine, which makes it comparatively less efficient. Java mappings are therefore most useful when the runtime performance of the Integration Server is a concern.
Java mappings are parsing programs that can be developed in NWDS, imported as .jar files, and used in the mapping part of the Integration Repository. NWDS provides a suitable Java environment and higher-level tools to parse XML documents through the Simple API for XML (SAX) and the Document Object Model (DOM) interfaces. SAX and DOM are standard interfaces that are implemented in several different languages. In the Java programming language, you can instantiate the parsers by using the Java API for XML Processing (JAXP).
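As a minimal sketch of how the two parsers are obtained through JAXP, the class below creates a DOM parser and a SAX parser from their factories and runs each on a small document. The class name `JaxpParsers`, its method names, and the sample `Order` XML are assumptions made for illustration only; a real Java mapping would additionally implement the mapping interface supplied by XI/PI, which is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only to illustrate JAXP parser creation.
class JaxpParsers {

    // Parses the XML with a DOM parser and returns the root element name.
    static String rootNameWithDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getNodeName();
    }

    // Parses the XML with a SAX parser and returns the number of elements seen.
    static int countElementsWithSax(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++; // one start event per element
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Order><Item>Pen</Item><Item>Ink</Item></Order>";
        System.out.println(rootNameWithDom(xml));
        System.out.println(countElementsWithSax(xml));
    }
}
```

Both parsers read the same input; the difference is that the DOM call returns a tree you can keep, while the SAX call only delivers callbacks during parsing.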
Java mapping can be done in two ways:
1. DOM Parsing
2. SAX Parsing
Parsing is the technique of reading an XML document and identifying its tags. Both DOM and SAX parsers can be either validating or non-validating.
A validating parser checks the XML file against the rules imposed by a DTD or XML Schema; a non-validating parser does not. Both validating and non-validating parsers check that the XML document is well-formed.
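The difference can be sketched with a DOM parser that is switched between the two modes. The example below is only an illustration under stated assumptions: the class `ValidationCheck`, the method `check`, the three result strings, and the tiny inline DTD are all invented for this sketch, and it validates against a DTD only (XML Schema validation is configured differently).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class illustrating validating vs non-validating parsing.
class ValidationCheck {

    // Returns "not well-formed", "invalid" (violates the DTD), or "ok".
    static String check(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating); // DTD validation on or off
        DocumentBuilder builder = factory.newDocumentBuilder();

        final boolean[] invalid = {false};
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            // Validity violations are reported as recoverable errors.
            public void error(SAXParseException e) { invalid[0] = true; }
            // Well-formedness violations are fatal in both modes.
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            return "not well-formed";
        }
        return invalid[0] ? "invalid" : "ok";
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
        System.out.println(check(dtd + "<note><to>Bob</to></note>", true));
        System.out.println(check(dtd + "<note><from>Al</from></note>", true));
        System.out.println(check(dtd + "<note><from>Al</from></note>", false));
        System.out.println(check("<note><to>Bob</note>", false));
    }
}
```

Note that the document with the undeclared `from` element is accepted in non-validating mode, while the document with the mismatched end tag is rejected in either mode.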
At the core of the DOM API are the Document and Node interfaces. A Document is a top-level object that represents an XML document. The Document holds the data as a tree of Nodes, where a Node is a base type that can be an element, an attribute, or some other type of content. The Document also acts as a factory for new Nodes. Each Node represents a single piece of data in the tree and provides the common tree operations. You can query a Node for its parent, its siblings, or its children. You can also modify the document by adding or removing Nodes.
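The tree operations described above can be sketched as follows. The example detaches child nodes from a wrapper element and re-attaches them to its parent, then uses the Document as a factory for a new node. The class `DomRestructure`, its methods, and the `Order`/`Items`/`Item`/`ItemCount` element names are assumptions chosen for illustration; the sketch also assumes that `Items` exists and is a direct child of the root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class illustrating DOM navigation and modification.
class DomRestructure {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Moves the children of <Items> up to the root, removes the empty
    // <Items> wrapper, and appends an <ItemCount> element.
    static void flattenItems(Document doc) {
        Element root = doc.getDocumentElement();
        Node items = root.getElementsByTagName("Items").item(0);

        // insertBefore detaches the node from its old parent before inserting it.
        while (items.getFirstChild() != null) {
            root.insertBefore(items.getFirstChild(), items);
        }
        root.removeChild(items);

        // The Document acts as the factory for new nodes.
        Element count = doc.createElement("ItemCount");
        count.setTextContent(
                String.valueOf(root.getElementsByTagName("Item").getLength()));
        root.appendChild(count);
    }

    // Walks the children through sibling links and keeps element nodes only.
    static List<String> childElementNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<Order><Items><Item>Pen</Item><Item>Ink</Item></Items></Order>");
        flattenItems(doc);
        System.out.println(childElementNames(doc.getDocumentElement()));
    }
}
```

Because the whole tree is in memory, the code can move freely between parents, siblings, and children, which is the kind of random access DOM is suited to.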
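The DOM tree is composed of nodes, which are of 12 types: Element, Attr, Text, CDATASection, EntityReference, Entity, ProcessingInstruction, Comment, Document, DocumentType, DocumentFragment, and Notation.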
SAX (Simple API for XML) is an event-driven model for processing XML. The SAX model is quite different from DOM: rather than building a complete representation of the document, a SAX parser fires off a series of events as it reads the document from beginning to end. Those events are passed to event handlers, which provide access to the contents of the document.
There are three classes of event handlers:
- DTD Handlers, for accessing the contents of XML Document Type Definitions.
- Error Handlers, for low-level access to parsing errors.
- Document Handlers, by far the most often used, for accessing the contents of the document.
A SAX processor will pass the following events to a Document Handler:
- The start of the document.
- A processing instruction.
- A comment (in Java SAX 2, reported only if the optional LexicalHandler extension is registered).
- The beginning of an element, including that element's attributes.
- The text contained within an element.
- The end of an element.
- The end of the document.
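To make the event sequence concrete, here is a minimal sketch of a handler that records each event as the parser reads a document. The class `SaxEventLogger`, the event label strings, and the sample XML are assumptions for illustration. The sketch extends `DefaultHandler`, so it receives document events only; comments are not recorded because that would need the separate `LexicalHandler` extension. Text is buffered because a parser is allowed to deliver one run of text in several `characters` calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records SAX events in the order they arrive.
class SaxEventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }

    @Override public void processingInstruction(String target, String data) {
        events.add("pi:" + target);
    }

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts) {
        flushText();
        StringBuilder sb = new StringBuilder("start:" + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
        }
        events.add(sb.toString());
    }

    @Override public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per text run
    }

    @Override public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end:" + qName);
    }

    // Emits the buffered text as a single event, if there is any.
    private void flushText() {
        if (text.length() > 0) {
            events.add("text:" + text);
            text.setLength(0);
        }
    }

    static List<String> log(String xml) throws Exception {
        SaxEventLogger handler = new SaxEventLogger();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String e : log("<?route fast?><Order id=\"7\"><Item>Pen</Item></Order>")) {
            System.out.println(e);
        }
    }
}
```

The handler never sees the document as a whole; it only sees one event at a time, in document order.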
Advantages and Disadvantages
DOM advantages:
1. It is good when random access to widely separated parts of a document is required.
2. It supports both read and write operations.
DOM disadvantages:
1. It is memory inefficient. DOM builds an object tree in memory for the whole input XML, so it is not advisable for parsing large XML documents; in that case SAX is preferred over DOM.
SAX advantages:
1. It is simple to program.
2. It is memory efficient, as a SAX parser does not keep any of the document tree in memory.
SAX disadvantages:
1. The data is broken into pieces, and clients never have all the information as a whole unless they create their own data structure.
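The last point, that a SAX client has to build its own data structure, can be sketched as below. The handler keeps only the text of the elements it cares about and discards everything else as the events pass by. The class `SaxItemCollector` and the `Item` element name are assumptions for illustration, not part of any XI/PI API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text of every <Item> element.
class SaxItemCollector extends DefaultHandler {
    private final List<String> items = new ArrayList<>();
    private StringBuilder current; // non-null only while inside an <Item>

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts) {
        if ("Item".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override public void endElement(String uri, String localName, String qName) {
        if ("Item".equals(qName)) {
            items.add(current.toString());
            current = null; // text outside <Item> is ignored
        }
    }

    static List<String> collect(String xml) throws Exception {
        SaxItemCollector handler = new SaxItemCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.items;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect(
                "<Order><Item>Pen</Item><Note>x</Note><Item>Ink</Item></Order>"));
    }
}
```

Memory use here grows with the collected list rather than with the size of the document, which is the trade-off against DOM: less memory, but the program must decide up front what to keep.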
Differences between SAX and DOM parser at a glance
SAX parser:
1. Parses node by node
2. Doesn't store the XML in memory
DOM parser:
1. Stores the entire XML document