What is DOM

The DOM (Document Object Model) is an object representation of an XML, HTML or XHTML document. In this tutorial we deal only with XML. DOM represents the XML as a document tree. JAXP provides an API for working with DOM in Java. It also provides a parsing interface into which different parsers can be plugged (JAXP provides a default implementation).
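To make the tree idea concrete, here is a minimal sketch that parses a small in-memory XML string and prints one line per node, indented by depth. The class name `DomTreeSketch`, the helper names and the sample XML are illustrative assumptions, not part of the tutorial's own example.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomTreeSketch {

    // Appends one line per node, indented two spaces per level of depth.
    static void dump(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            dump(child, depth + 1, out);
        }
    }

    // Parses the XML text with the default JAXP parser and renders its tree.
    static String tree(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        dump(doc, 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, written on one line so that no
        // whitespace-only text nodes appear in the output.
        System.out.print(tree("<rss><channel><title>News</title></channel></rss>"));
    }
}
```

For the sample above, the Document node (`#document`) sits at the top, followed by `rss`, `channel`, `title` and finally the `#text` node that holds the title's text.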


  • org.w3c.dom – Contains interfaces that are the DOM representation of an XML document and its components. Interfaces include:
    • Document – Represents an entire XML or HTML document. It is the root of the document tree.
    • Element – Represents an element in an XML or HTML document. It has methods to access the attributes of an XML element.
    • Attr – Represents an attribute in an Element object.
    • CDATASection – Represents a CDATA section. These are blocks of text that can contain characters that would otherwise be treated as markup.
    • Text – Represents the textual content of an Element or an Attr. If the content contains no markup, all the text is held in a single Text node that is the only child of the element; if it contains markup, the markup is parsed into elements, comments and so on, which sit alongside the Text nodes as children of the element.
    • ProcessingInstruction – Represents a processing instruction in an XML document.
    • Comment – Represents a comment in an XML document. Contains the comment text.
  • javax.xml.parsers – Contains the factory and parser classes through which DOM and SAX parsers are obtained and plugged in:
    • DocumentBuilderFactory – An abstract factory class used to obtain DOM parsers.
    • DocumentBuilder – An abstract class whose methods produce a DOM object tree (a Document) from an XML document.
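The node types listed above can be told apart at runtime through `Node.getNodeType()`. The following is a small sketch under stated assumptions: the class name `NodeKindsSketch`, its helpers and the sample `note` document are hypothetical and exist only to show one node of each kind.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeKindsSketch {

    // Hypothetical sample: one attribute, then a comment, text, a CDATA
    // section and a processing instruction as children of <note>.
    static final String SAMPLE =
            "<note id=\"1\"><!-- reminder -->call <![CDATA[<Bob>]]><?app flag?></note>";

    // Maps the numeric node type to the name of the matching DOM interface.
    static String kind(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: return "Document";
            case Node.ELEMENT_NODE: return "Element";
            case Node.ATTRIBUTE_NODE: return "Attr";
            case Node.TEXT_NODE: return "Text";
            case Node.CDATA_SECTION_NODE: return "CDATASection";
            case Node.COMMENT_NODE: return "Comment";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            default: return "Other";
        }
    }

    // Obtains a parser from the factory and returns the root element.
    static Element root(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // Lists the interface name of every direct child of the element.
    static List<String> childKinds(Element element) {
        List<String> kinds = new ArrayList<>();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            kinds.add(kind(children.item(i)));
        }
        return kinds;
    }

    public static void main(String[] args) throws Exception {
        Element note = root(SAMPLE);
        System.out.println(kind(note));                        // Element
        System.out.println(kind(note.getAttributeNode("id"))); // Attr
        System.out.println(childKinds(note));
    }
}
```

Note that attributes are not children of their element: they are reached through `getAttributeNode` or `getAttributes`, which is why the attribute does not appear in the child list.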

JAXP DOM in action

Let's now see an example of a DOM representation of an XML document. In this example we look at the following:

  • Parsing the XML using the default DOM Parser.
  • Obtaining the root element
  • Obtaining all elements with a specific name
  • Obtaining all elements with a specific name and in a specific namespace
  • Iterating through the child nodes and inspecting each of them.
package com.studytrails.xml.jaxp;


import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class JaxpDOMExample1 {

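	// URI of the XML document to parse (an RSS feed in this example).
	// It is blank here and must be set before running.
	private static String xmlSource = "";

	public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
		JaxpDOMExample1 example = new JaxpDOMExample1();
		example.startParsing();
	}
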
	void startParsing() throws ParserConfigurationException, SAXException, IOException {

		// create the factory for the DocumentBuilder. JAXP ships with Xerces
		// as the default DOM parser.
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
		// prints the concrete factory class
		System.out.println(factory.getClass());
		// we want the factory to be namespace aware. This is important if the
		// XML declares and uses additional namespaces
		factory.setNamespaceAware(true);
		// the actual builder or parser
		DocumentBuilder builder = factory.newDocumentBuilder();

		// the Document that represents the XML
		Document bbcDoc = builder.parse(xmlSource);

		// the root element.
		Element rootElement = bbcDoc.getDocumentElement();
		// prints rss
		System.out.println(rootElement.getNodeName());

		// search for an element using the name
		NodeList list = rootElement.getElementsByTagName("channel");
		// get the first item in the list
		Node channel = list.item(0);
		// get the child nodes
		NodeList channelChildren = channel.getChildNodes();
		int length = channelChildren.getLength();
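		for (int i = 0; i < length; i++) {
			Node node = channelChildren.item(i);
			// node type 1 (Node.ELEMENT_NODE) is an element; text nodes are type 3
			if (Node.ELEMENT_NODE == node.getNodeType()) {
				if ("title".equals(node.getNodeName())) {
					// the text node is the child of the title element
					System.out.println(node.getFirstChild().getNodeValue());
				}
			}
		}
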
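		// get all elements with the name 'link'. We just print the first link
		NodeList linkList = rootElement.getElementsByTagName("link");
		System.out.println(linkList.item(0).getTextContent());
		// the feed also contains a link element in another namespace:
		// <atom:link href=""
		// rel="self" type="application/rss+xml"/>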

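		// get all elements with the name 'link' and in a specific namespace.
		// The first argument is the namespace URI (blank here); it must be the
		// URI that the atom prefix is bound to in the document.
		NodeList linkList2 = rootElement.getElementsByTagNameNS("", "link");
		Node atomLink = linkList2.item(0);
		System.out.println(atomLink.hasAttributes()); // prints true
		NamedNodeMap atomLinkAttributes = atomLink.getAttributes();
		for (int i = 0; i < atomLinkAttributes.getLength(); i++) {
			Node atomLinkAttribute = atomLinkAttributes.item(i);
			// prints each attribute name and value, e.g. href and rel (self)
			System.out.println(atomLinkAttribute.getNodeName() + " " + atomLinkAttribute.getNodeValue());
		}
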
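		Node firstChildOfRoot = rootElement.getFirstChild();
		// prints #text (the whitespace between the root start tag and <channel>)
		System.out.println(firstChildOfRoot.getNodeName());

		Node siblingOfFirstChild = firstChildOfRoot.getNextSibling();
		// prints channel
		System.out.println(siblingOfFirstChild.getNodeName());
	}
}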



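Because the listing above depends on an external feed, it can help to try the same lookups against a document held in memory. The sketch below is one way to do that, not the tutorial's own code: the class name `NamespaceLookupSketch`, the sample feed, the example.com addresses and the Atom namespace URI `http://www.w3.org/2005/Atom` are all assumptions made for illustration (the listing above leaves the namespace argument blank).

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceLookupSketch {

    // Assumption: the namespace URI bound to the atom prefix in the sample.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Hypothetical feed with one plain <link> and one <atom:link>.
    static final String SAMPLE =
            "<rss version=\"2.0\" xmlns:atom=\"" + ATOM_NS + "\">\n"
            + "  <channel>\n"
            + "    <title>Sample feed</title>\n"
            + "    <link>http://example.com/</link>\n"
            + "    <atom:link href=\"http://example.com/rss.xml\" rel=\"self\""
            + " type=\"application/rss+xml\"/>\n"
            + "  </channel>\n"
            + "</rss>";

    // Parses with a namespace-aware factory so that namespace lookups work.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Text of the first element whose qualified name is exactly "link".
    static String plainLinkText(Document doc) {
        NodeList links = doc.getDocumentElement().getElementsByTagName("link");
        return links.getLength() == 0 ? null : links.item(0).getTextContent();
    }

    // Attribute value of the first "link" element in the Atom namespace.
    static String atomLinkAttribute(Document doc, String attribute) {
        NodeList links = doc.getDocumentElement().getElementsByTagNameNS(ATOM_NS, "link");
        return links.getLength() == 0 ? null : ((Element) links.item(0)).getAttribute(attribute);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(plainLinkText(doc));             // http://example.com/
        System.out.println(atomLinkAttribute(doc, "rel"));  // self

        // The first child of the root is the whitespace before <channel>.
        Node first = doc.getDocumentElement().getFirstChild();
        System.out.println(first.getNodeName());                  // #text
        System.out.println(first.getNextSibling().getNodeName()); // channel
    }
}
```

Both lookups check the list length before calling `item(0)`, since `item` returns null for an index that is out of range and a missing element would otherwise surface as a NullPointerException.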