Introduction to XML Parsing with Java
XML stands for Extensible Markup Language. It encodes documents according to a defined set of rules so that the content stays readable for both people and programs. An XML parser reads through an XML file so that an application can access or modify the data in it. In Java, the XML parser is a standalone component used to read XML documents.
What is XML Parsing with Java?
An XML parser scans the whole XML file, checks that the document is correctly formed, can validate it, and gives the application a way to access or modify the data in it. Parsing is one of the most important steps when developing with XML. In Java, the XML parser is a standalone component that can parse standalone DTD files, XML documents, and XML schemas. The parsed XML document can then be processed further by the application. The figure below illustrates the process of parsing XML documents in Java:
https://docs.oracle.com/cd/B25016_08/doc/dl/web/B14033_01/adxdk002.gif
As the figure shows, the input to the parser is the main XML document, optionally accompanied by a DTD or schema files. The parsed result is made available through the DOM or SAX parser, whichever one you are using. An XSL stylesheet, which describes how the data should be transformed and presented, is parsed as well. The parsed XSL commands and the parsed XML are then forwarded to the XSLT processor, which produces the transformed XML document as its output.
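The last stage of this pipeline, applying an XSL stylesheet to an XML document, can be sketched with the JDK's javax.xml.transform API. This is only a minimal illustration under a few assumptions: the class name XsltSketch, the inline XML and the inline stylesheet are made up for this example, and a real application would usually read both from files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name, used only for this sketch.
class XsltSketch {

    // Assumed sample input: a shortened version of an article document.
    static final String XML =
        "<class><article articleNo=\"393\"><topic>Android Auto</topic></article></class>";

    // Assumed stylesheet: emits "<articleNo>: <topic>" as plain text for each article.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:for-each select=\"class/article\">"
      + "<xsl:value-of select=\"@articleNo\"/>: <xsl:value-of select=\"topic\"/>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses the stylesheet, applies it to the XML, and returns the result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter output = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(output));
            return output.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Here the stylesheet plays the role of the XSL input in the figure, and the returned string corresponds to the transformed output.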
Steps for Using XML Parsing with Java
To parse an XML document, we follow the steps listed below:
- Import all the XML-related packages at the beginning of the file.
- Create a new DocumentBuilder instance (obtained from a DocumentBuilderFactory).
- Create a Document from the available file or stream.
- Extract the root element.
- Examine the attributes and sub-elements.
Java XML Parser – DOM
The Document Object Model (DOM) is a tree-based application programming interface that creates an in-memory tree representation of the XML document passed to it. DOM comes with classes and methods that the application can use to navigate and process this tree.
The DOM interface is most useful when the structure of the XML tree has to be manipulated. Such manipulations include adding new attributes and elements, deleting existing ones, reordering elements, renaming existing elements, and more.
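The kinds of manipulation listed above can be sketched with the standard org.w3c.dom interfaces. This is an illustrative sketch rather than a recommended design: the class name DomEditSketch, the inline XML, the attribute name reviewed and the new element name title are all assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
class DomEditSketch {

    // Assumed sample input with a single article.
    static final String XML =
        "<class><article articleNo=\"393\"><topic>Android Auto</topic></article></class>";

    // Builds the in-memory DOM tree from a string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Applies a few in-memory edits to the first <article> element and returns it.
    static Element editFirstArticle(Document doc) {
        Element article = (Element) doc.getElementsByTagName("article").item(0);

        // Add a new attribute ("reviewed" is a made-up name) and delete an existing one.
        article.setAttribute("reviewed", "true");
        article.removeAttribute("articleNo");

        // Add a new child element.
        Element pages = doc.createElement("numberOfPages");
        pages.setTextContent("85");
        article.appendChild(pages);

        // Reorder: inserting an existing child again moves it, here to the front.
        article.insertBefore(pages, article.getFirstChild());

        // Rename <topic> to <title> (DOM Level 3 renameNode, no namespace).
        doc.renameNode(article.getElementsByTagName("topic").item(0), null, "title");
        return article;
    }

    public static void main(String[] args) {
        Element edited = editFirstArticle(parse(XML));
        System.out.println("First child is now: " + edited.getFirstChild().getNodeName());
    }
}
```

All of these edits change only the tree held in memory; writing the result back to a file would be a separate step.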
The DOM API is preferred when the application needs random access to elements. DOM can also be used in XSL transformation tasks or when calling XPath. In short, whenever we need to iterate over the tree or visit the whole document, the DOM API is a good fit. It is also possible to customize the tree-building process in DOM. To keep XML documents compact, we can favor attributes over child elements where the data allows it.
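As a sketch of the XPath use case mentioned above, the JDK's javax.xml.xpath API can evaluate an expression directly against a DOM Document. The class name XPathSketch, the helper method topicsWithMoreThan and the inline sample data are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this sketch.
class XPathSketch {

    // Assumed sample input with three articles.
    static final String XML =
        "<class>"
      + "<article articleNo=\"393\"><topic>Android Auto</topic><numberOfPages>85</numberOfPages></article>"
      + "<article articleNo=\"493\"><topic>PostgreSQL</topic><numberOfPages>95</numberOfPages></article>"
      + "<article articleNo=\"593\"><topic>MySQL</topic><numberOfPages>90</numberOfPages></article>"
      + "</class>";

    // Returns the topics of all articles that have more than minPages pages.
    static List<String> topicsWithMoreThan(String xml, int minPages) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Only an int is concatenated into the expression, so its shape stays fixed.
            String expression = "/class/article[numberOfPages > " + minPages + "]/topic";
            NodeList matches = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> topics = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                topics.add(matches.item(i).getTextContent());
            }
            return topics;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(topicsWithMoreThan(XML, 88));
    }
}
```

The expression selects nodes anywhere in the tree without manual traversal, which is the kind of random access the DOM model makes possible.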
Create a DocumentBuilder - XML Parsing with Java
To create a DocumentBuilder instance, we first need a DocumentBuilderFactory object. Both can be created with the code snippet shown below:
DocumentBuilderFactory instanceOfFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder instanceOfBuilder = instanceOfFactory.newDocumentBuilder();
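Once a builder exists, its parse method either returns a Document or throws a SAXException when the input is not well-formed XML. The following sketch uses that behavior as a simple well-formedness check. The class name WellFormedSketch and the helper isWellFormed are assumptions made for this example, and the input is read from a string rather than a file to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this sketch.
class WellFormedSketch {

    // Returns true if the given text parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, for example because of a missing end tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));
        System.out.println(isWellFormed("<a><b></a>"));
    }
}
```

Note that this checks only well-formedness; validating against a DTD or schema requires additional factory configuration that is not shown here.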
Demo Example - XML Parsing with Java
Let us now look at an example that implements all the steps listed earlier and makes the use of the DOM parser clearer. We will use two files: an XML document that serves as the input, and a Java file that parses it and prints the output by visiting the nodes of the tree built from the XML file.
The file to be parsed is the XML file shown below. The Java program reads it under the name educbaXML.txt.
XML file:
<?xml version = "1.0"?>
<class>
    <article articleNo = "393">
        <topic>Android Auto</topic>
        <nameOfAuthor>Payal</nameOfAuthor>
        <genre>Android Auto</genre>
        <numberOfPages>85</numberOfPages>
    </article>
    <article articleNo = "493">
        <topic>PostgreSQL</topic>
        <nameOfAuthor>Mayur</nameOfAuthor>
        <genre>Database</genre>
        <numberOfPages>95</numberOfPages>
    </article>
    <article articleNo = "593">
        <topic>MySQL</topic>
        <nameOfAuthor>Meera</nameOfAuthor>
        <genre>DBMS</genre>
        <numberOfPages>90</numberOfPages>
    </article>
</class>
Next, we write a Java class named EducbaDomParserExample that parses the XML file with the DOM parser and prints its contents. The contents of the file are shown below:
EducbaDomParserExample.java
package com.educba.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class EducbaDomParserExample {
    public static void main(String[] args) {
        try {
            // Build the in-memory DOM tree from the input file
            File xmlFileToParse = new File("educbaXML.txt");
            DocumentBuilderFactory instanceOfDocBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder educbaDocBuilderObj = instanceOfDocBuilderFactory.newDocumentBuilder();
            Document sampleDocument = educbaDocBuilderObj.parse(xmlFileToParse);

            // Merge adjacent text nodes and remove empty ones
            sampleDocument.getDocumentElement().normalize();
            System.out.println("Element present at the root of the tree :" + sampleDocument.getDocumentElement().getNodeName());

            // Collect every <article> element in document order
            NodeList nodeListInstance = sampleDocument.getElementsByTagName("article");
            System.out.println("__________________________");

            for (int articleIndex = 0; articleIndex < nodeListInstance.getLength(); articleIndex++) {
                Node singleNode = nodeListInstance.item(articleIndex);
                System.out.println("\nElement Being Traversed :" + singleNode.getNodeName());

                if (singleNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element particularElement = (Element) singleNode;

                    // Read the attribute, then the text of each sub-element
                    System.out.println("Article Number : "
                        + particularElement.getAttribute("articleNo"));
                    System.out.println("Topic: "
                        + particularElement
                            .getElementsByTagName("topic")
                            .item(0)
                            .getTextContent());
                    System.out.println("Author Name : "
                        + particularElement
                            .getElementsByTagName("nameOfAuthor")
                            .item(0)
                            .getTextContent());
                    System.out.println("Genre: "
                        + particularElement
                            .getElementsByTagName("genre")
                            .item(0)
                            .getTextContent());
                    System.out.println("Number Of Pages : "
                        + particularElement
                            .getElementsByTagName("numberOfPages")
                            .item(0)
                            .getTextContent());
                }
            }
        } catch (Exception sampleException) {
            // Covers a missing file, malformed XML, and parser configuration problems
            sampleException.printStackTrace();
        }
    }
}
Running the above Java file prints the name of the root element (class), followed by the article number, topic, author name, genre, and number of pages for each of the three article elements.
Conclusion
An XML (Extensible Markup Language) document can be parsed in a Java application with different types of parsers, such as the DOM parser and the SAX parser. In this article, we saw how to convert an XML file into an in-memory tree structure that reflects its hierarchy, and how to access or modify each individual element or attribute from a Java program.