XML and Java - Parsing XML using Java Tutorial [Repost]


http://www.java-samples.com/showtutorial.php?tutorialid=152

 

Parsing XML
If you are new to processing XML with Java, this sample shows how to parse an XML file, create Java objects from it and manipulate them.

The idea is to parse the employees.xml file, whose content is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<Personnel>
  <Employee type="permanent">
        <Name>Seagull</Name>
        <Id>3674</Id>
        <Age>34</Age>
   </Employee>
  <Employee type="contract">
        <Name>Robin</Name>
        <Id>3675</Id>
        <Age>25</Age>
    </Employee>
  <Employee type="permanent">
        <Name>Crow</Name>
        <Id>3676</Id>
        <Age>28</Age>
    </Employee>
</Personnel>

From the parsed content, create a list of Employee objects and print it to the console. The output should look something like this:


Employee Details - Name:Seagull, Type:permanent, Id:3674, Age:34.
Employee Details - Name:Robin, Type:contract, Id:3675, Age:25.
Employee Details - Name:Crow, Type:permanent, Id:3676, Age:28.

We will start with a DOM parser to parse the XML file, create Employee value objects and add them to a list. To check that the file was parsed correctly, we iterate through the list and print each employee's data to the console. Later we will implement the same thing with a SAX parser.
In a real-world situation you might receive an XML file from a third-party vendor, which you need to parse in order to update your database.
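
The listings in this tutorial rely on an Employee value object from Employee.java, which is not shown here. A minimal sketch, assuming only the constructor, the setters and the toString() format implied by the parsing code and the sample output (everything else about the class is a guess), might look like this:

```java
// Hypothetical sketch of Employee.java; the original file is not shown in this tutorial.
class Employee {
    private String name;
    private int id;
    private int age;
    private String type;

    // No-argument constructor, as used by the SAX example
    public Employee() {
    }

    // Argument order follows the call in the DOM example: name, id, age, type
    public Employee(String name, int id, int age, String type) {
        this.name = name;
        this.id = id;
        this.age = age;
        this.type = type;
    }

    public void setName(String name) { this.name = name; }
    public void setId(int id) { this.id = id; }
    public void setAge(int age) { this.age = age; }
    public void setType(String type) { this.type = type; }

    public String getName() { return name; }
    public int getId() { return id; }
    public int getAge() { return age; }
    public String getType() { return type; }

    // Matches the sample output format shown above
    public String toString() {
        return "Employee Details - Name:" + name + ", Type:" + type
                + ", Id:" + id + ", Age:" + age + ".";
    }
}
```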

Using DOM
    The program DomParserExample.java uses the DOM API.

The steps are

  • Get a document builder using document builder factory and parse the xml file to create a DOM object
  • Get a list of employee elements from the DOM
  • For each employee element get the id, name, age and type. Create an employee value object and add it to the list.
  • At the end iterate through the list and print the employees to verify we parsed it right.

a) Getting a document builder

	private void parseXmlFile(){
		//get the factory
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

		try {

			//Using factory get an instance of document builder
			DocumentBuilder db = dbf.newDocumentBuilder();

			//parse using builder to get DOM representation of the XML file
			dom = db.parse("employees.xml");


		}catch(ParserConfigurationException pce) {
			pce.printStackTrace();
		}catch(SAXException se) {
			se.printStackTrace();
		}catch(IOException ioe) {
			ioe.printStackTrace();
		}
	}
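
Because the XML may come from a third party, it can be worth restricting what the parser will accept before calling parse(). The following is a minimal sketch, not part of the original program: it assumes a JAXP implementation that recognises the Xerces "disallow-doctype-decl" feature (the parser built into the JDK does), and the class and method names are made up for illustration. The XML is parsed from a string only to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of DomParserExample.java
class SafeDomLoader {

    static Document load(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

        // Standard JAXP switch that asks the parser to apply its security limits
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        // Xerces feature (assumed to be supported): reject any document with a DOCTYPE,
        // which also rules out external entity declarations
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document dom = load("<Personnel><Employee type=\"permanent\"/></Personnel>");
        System.out.println(dom.getDocumentElement().getTagName());
    }
}
```

With this configuration a document that carries a DOCTYPE declaration is rejected with a SAXException instead of being parsed.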

b) Get a list of employee elements
Get the root element from the DOM object. From the root element get all Employee elements, then iterate through them to load the data.


	private void parseDocument(){
		//get the root element
		Element docEle = dom.getDocumentElement();

		//get a nodelist of <Employee> elements
		NodeList nl = docEle.getElementsByTagName("Employee");
		if(nl != null && nl.getLength() > 0) {
			for(int i = 0 ; i < nl.getLength();i++) {

				//get the employee element
				Element el = (Element)nl.item(i);

				//get the Employee object
				Employee e = getEmployee(el);

				//add it to list
				myEmpls.add(e);
			}
		}
	}

c) Reading in data from each employee.


	/**
	 * I take an employee element and read the values in, create
	 * an Employee object and return it
	 */
	private Employee getEmployee(Element empEl) {

		//for each <Employee> element get the text or int values of
		//name, id and age, and the type attribute
		String name = getTextValue(empEl,"Name");
		int id = getIntValue(empEl,"Id");
		int age = getIntValue(empEl,"Age");

		String type = empEl.getAttribute("type");

		//Create a new Employee with the value read from the xml nodes
		Employee e = new Employee(name,id,age,type);

		return e;
	}


	/**
	 * I take a xml element and the tag name, look for the tag and get
	 * the text content
	 * i.e for <employee><name>John</name></employee> xml snippet if
	 * the Element points to employee node and tagName is 'name' I will return John
	 */
	private String getTextValue(Element ele, String tagName) {
		String textVal = null;
		NodeList nl = ele.getElementsByTagName(tagName);
		if(nl != null && nl.getLength() > 0) {
			Element el = (Element)nl.item(0);
			textVal = el.getFirstChild().getNodeValue();
		}

		return textVal;
	}


	/**
	 * Calls getTextValue and returns a int value
	 */
	private int getIntValue(Element ele, String tagName) {
		//in production application you would catch the exception
		return Integer.parseInt(getTextValue(ele,tagName));
	}
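
The helpers above throw a NullPointerException for an empty element such as <Age></Age> (getFirstChild() returns null) and a NumberFormatException for non-numeric text. A more defensive variant could be sketched as follows. It assumes Java 5 or later, where Node.getTextContent() is available; the class name, the method names and the fallback parameter are illustrative and not part of the original program.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of DomParserExample.java
class TextValues {

    // Returns the trimmed text of the first descendant named tagName, or null if there is none
    static String textValue(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        if (nl.getLength() == 0) {
            return null;
        }
        // getTextContent() returns "" for an empty element instead of failing
        return nl.item(0).getTextContent().trim();
    }

    // Returns the int value of the element's text, or fallback if it is missing or not a number
    static int intValue(Element parent, String tagName, int fallback) {
        String text = textValue(parent, tagName);
        if (text == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException nfe) {
            return fallback;
        }
    }

    // Parses an XML string and returns its root element (only to keep the demo self-contained)
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element emp = parse("<Employee><Name>Seagull</Name><Id> 3674 </Id><Age></Age></Employee>");
        System.out.println(textValue(emp, "Name"));    // Seagull
        System.out.println(intValue(emp, "Id", -1));   // 3674
        System.out.println(intValue(emp, "Age", -1));  // -1
    }
}
```

Whether a silent fallback or a reported error is the better choice depends on the application; the fallback is used here only to keep the sketch short.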

d) Iterating and printing.


	private void printData(){

		System.out.println("No of Employees '" + myEmpls.size() + "'.");

		Iterator it = myEmpls.iterator();
		while(it.hasNext()) {
			System.out.println(it.next().toString());
		}
	}

Using SAX
The program SAXParserExample.java parses an XML document and prints it to the console.
SAX parsing is event based. As the SAX parser reads an XML document, it calls the corresponding handler method every time it encounters a tag.

When it encounters a start tag it calls this method:
    public void startElement(String uri,..

When it encounters an end tag it calls this method:
    public void endElement(String uri,...

Like the dom example this program also parses the xml file, creates a list of employees and prints it to the console. The steps involved are

  • Create a Sax parser and parse the xml
  • In the event handler create the employee object
  • Print out the data

The class extends DefaultHandler to listen for callback events, and we register this handler with the SAX parser so that it notifies us of them. We are only interested in the start, end and character events.
In the start event, if the element is Employee we create a new instance of the Employee object; for every element, including Name/Id/Age, we reset the character buffer that collects the text value.
In the end event, if the node is Employee we know we have reached the end of the employee node and add the Employee object to the list. If it is any other node such as Name/Id/Age, we call the corresponding method (setName/setId/setAge) on the Employee object.
In the character event we store the data in a temporary string variable.

a) Create a Sax Parser and parse the xml


	private void parseDocument() {

		//get a factory
		SAXParserFactory spf = SAXParserFactory.newInstance();
		try {

			//get a new instance of parser
			SAXParser sp = spf.newSAXParser();

			//parse the file and also register this class for call backs
			sp.parse("employees.xml", this);

		}catch(SAXException se) {
			se.printStackTrace();
		}catch(ParserConfigurationException pce) {
			pce.printStackTrace();
		}catch (IOException ie) {
			ie.printStackTrace();
		}
	}

b) In the event handlers create the Employee object and call the corresponding setter methods.


	//Event Handlers
	public void startElement(String uri, String localName, String qName,
		Attributes attributes) throws SAXException {
		//reset
		tempVal = "";
		if(qName.equalsIgnoreCase("Employee")) {
			//create a new instance of employee
			tempEmp = new Employee();
			tempEmp.setType(attributes.getValue("type"));
		}
	}


	public void characters(char[] ch, int start, int length) throws SAXException {
		//the parser may deliver the text of one element in several chunks, so append
		tempVal += new String(ch,start,length);
	}

	public void endElement(String uri, String localName,
		String qName) throws SAXException {

		if(qName.equalsIgnoreCase("Employee")) {
			//add it to the list
			myEmpls.add(tempEmp);

		}else if (qName.equalsIgnoreCase("Name")) {
			tempEmp.setName(tempVal);
		}else if (qName.equalsIgnoreCase("Id")) {
			tempEmp.setId(Integer.parseInt(tempVal));
		}else if (qName.equalsIgnoreCase("Age")) {
			tempEmp.setAge(Integer.parseInt(tempVal));
		}

	}
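
The event flow can be tried out in isolation with a small self-contained handler. This is a sketch under assumptions rather than the original SAXParserExample.java: the class name is made up, the XML is read from a string instead of employees.xml, and only the Name element and the type attribute are collected. It accumulates text in a StringBuilder because a SAX parser is allowed to report the text of one element through several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original tutorial code
class SaxNamesExample extends DefaultHandler {
    private final List<String> names = new ArrayList<String>();
    private final StringBuilder text = new StringBuilder();
    private String type;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // reset the buffer at the start of every element
        text.setLength(0);
        if (qName.equals("Employee")) {
            type = attributes.getValue("type");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // append, because the text may arrive in more than one chunk
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Name")) {
            names.add(text.toString().trim() + " (" + type + ")");
        }
    }

    // Parses the given XML string and returns "name (type)" for each employee
    static List<String> parse(String xml) throws Exception {
        SaxNamesExample handler = new SaxNamesExample();
        SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
        sp.parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Personnel>"
                + "<Employee type=\"permanent\"><Name>Seagull</Name></Employee>"
                + "<Employee type=\"contract\"><Name>Robin</Name></Employee>"
                + "</Personnel>";
        System.out.println(parse(xml));
    }
}
```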

c) Iterating and printing.


	private void printData(){

		System.out.println("No of Employees '" + myEmpls.size() + "'.");

		Iterator it = myEmpls.iterator();
		while(it.hasNext()) {
			System.out.println(it.next().toString());
		}
	}

Generating XML
    The previous programs illustrated how to parse an existing XML file using both SAX and DOM parsers.
Generating an XML file from scratch is a different story; for instance, you might want to generate an XML file from data extracted from a database. To keep the example simple, the program XMLCreatorExample.java generates XML from a list preloaded with hard-coded data. The output is a book.xml file with the following content.


<?xml version="1.0" encoding="UTF-8"?>
<Books>
    <Book Subject="Java 1.5">
        <Author>Kathy Sierra .. etc</Author>
        <Title>Head First Java</Title>
    </Book>
    <Book Subject="Java Architect">
        <Author>Kathy Sierra .. etc</Author>
        <Title>Head First Design Patterns</Title>
    </Book>
</Books>

The steps involved are
  • Load Data
  • Get an instance of Document object using document builder factory
  • Create the root element Books
  • For each item in the list create a Book element and attach it to Books element
  • Serialize DOM to FileOutputStream to generate the xml file "book.xml".

a) Load Data.



	/**
	 * Add a list of books to the list
	 * In a production system you might populate the list from a DB
	 */
	private void loadData(){

		myData.add(new Book("Head First Java",
			"Kathy Sierra .. etc","Java 1.5"));

		myData.add(new Book("Head First Design Patterns",
			"Kathy Sierra .. etc","Java Architect"));
	}
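
Book.java is not listed in this tutorial either. A minimal sketch, assuming only what the surrounding code implies (a three-argument constructor whose order is title, author, subject, judging by the expected book.xml, plus the getters used when the Book element is created):

```java
// Hypothetical sketch of Book.java; the original file is not shown in this tutorial.
class Book {
    private final String title;
    private final String author;
    private final String subject;

    // Argument order follows the calls in loadData(): title, author, subject
    public Book(String title, String author, String subject) {
        this.title = title;
        this.author = author;
        this.subject = subject;
    }

    public String getTitle() { return title; }
    public String getAuthor() { return author; }
    public String getSubject() { return subject; }
}
```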

b) Getting an instance of DOM.

	/**
	 * Using JAXP in implementation independent manner create a document object
	 * using which we create a xml tree in memory
	 */
	private void createDocument() {

		//get an instance of factory
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		try {
		//get an instance of builder
		DocumentBuilder db = dbf.newDocumentBuilder();

		//create an instance of DOM
		dom = db.newDocument();

		}catch(ParserConfigurationException pce) {
			//dump it
			System.out.println("Error while trying to instantiate DocumentBuilder " + pce);
			System.exit(1);
		}

	}

c) Create the root element Books.



	/**
	 * The real workhorse which creates the XML structure
	 */
	private void createDOMTree(){

		//create the root element <Books>
		Element rootEle = dom.createElement("Books");
		dom.appendChild(rootEle);

		//No enhanced for
		Iterator it  = myData.iterator();
		while(it.hasNext()) {
			Book b = (Book)it.next();
			//For each Book object create a <Book> element and attach it to root
			Element bookEle = createBookElement(b);
			rootEle.appendChild(bookEle);
		}

	}

d) Creating a book element.

	/**
	 * Helper method which creates an XML element <Book>
	 * @param b The book for which we need to create an xml representation
	 * @return XML element snippet representing a book
	 */
	private Element createBookElement(Book b){

		Element bookEle = dom.createElement("Book");
		bookEle.setAttribute("Subject", b.getSubject());

		//create author element and author text node and attach it to bookElement
		Element authEle = dom.createElement("Author");
		Text authText = dom.createTextNode(b.getAuthor());
		authEle.appendChild(authText);
		bookEle.appendChild(authEle);

		//create title element and title text node and attach it to bookElement
		Element titleEle = dom.createElement("Title");
		Text titleText = dom.createTextNode(b.getTitle());
		titleEle.appendChild(titleText);
		bookEle.appendChild(titleEle);

		return bookEle;

	}

e) Serialize DOM to FileOutputStream to generate the xml file "book.xml".


	/**
	 * This method uses Xerces-specific classes to
	 * print the XML document to a file.
	 */
	private void printToFile(){

		try
		{
			//print
			OutputFormat format = new OutputFormat(dom);
			format.setIndenting(true);

			//to generate output to console use this serializer
			//XMLSerializer serializer = new XMLSerializer(System.out, format);


			//to generate a file output use fileoutputstream instead of system.out
			XMLSerializer serializer = new XMLSerializer(
			new FileOutputStream(new File("book.xml")), format);

			serializer.serialize(dom);

		} catch(IOException ie) {
		    ie.printStackTrace();
		}
	}

Note:
The Xerces classes OutputFormat and XMLSerializer live in different packages depending on the distribution.

In JDK 1.5 with the built-in Xerces parser they are under
com.sun.org.apache.xml.internal.serialize.OutputFormat
com.sun.org.apache.xml.internal.serialize.XMLSerializer

In Xerces 2.7.1, which we are using to run these examples, they are under
org.apache.xml.serialize.XMLSerializer
org.apache.xml.serialize.OutputFormat
We use Xerces 2.7.1 with JDK 1.4 and JDK 1.3 because the default parser in JDK 1.4 is Crimson, and JDK 1.3 has no built-in parser.
Also remember that it is not advisable to use parser-specific classes like OutputFormat and XMLSerializer: they are only available in Xerces, and if you switch to another parser in the future you may have to rewrite your code.
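
One implementation-independent alternative is the standard JAXP transformation API (javax.xml.transform), which can serialize a DOM without any Xerces classes. The sketch below is not the method used by XMLCreatorExample.java: the class and method names are invented for illustration, and it writes to a string so that it stays self-contained.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of the original tutorial code
class TransformerSerializeExample {

    // Serializes the DOM to a string using an identity transformation
    static String toXml(Document dom) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    // Builds a one-book document shaped like the book.xml output above
    static Document sampleBooks() throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = dom.createElement("Books");
        dom.appendChild(root);

        Element book = dom.createElement("Book");
        book.setAttribute("Subject", "Java 1.5");
        Element title = dom.createElement("Title");
        title.appendChild(dom.createTextNode("Head First Java"));
        book.appendChild(title);
        root.appendChild(book);
        return dom;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sampleBooks()));
    }
}
```

To write a file instead, the StreamResult can be constructed from new File("book.xml"). The exact indentation of the output depends on the transformer implementation.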

Instructions to run these programs


The instructions to compile and run these programs vary with the JDK you are using, because of the way the XML parser is bundled with the various Java distributions. These instructions are for Windows. On Unix or Linux you just need to change the folder paths accordingly.

Using JDK 1.5

The Xerces parser is bundled with the JDK 1.5 distribution, so you do not need to download the parser separately.

Running DomParserExample

  1. Download DomParserExample.java, Employee.java, employees.xml to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . DomParserExample.java
  4. To run, type
    java -classpath . DomParserExample

Running SAXParserExample

  1. Download SAXParserExample.java, Employee.java, employees.xml to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . SAXParserExample.java
  4. To run, type
    java -classpath . SAXParserExample

Running XMLCreatorExample

  1. Download XMLCreatorExample.java, Book.java to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . XMLCreatorExample.java
  4. To run, type
    java -classpath . XMLCreatorExample