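I have been learning XPath for parsing XML lately. It is said to be a simple tool for traversing XML files, with a relationship to XML much like that of SQL to Oracle. I searched a lot but found hardly any Java code for XPath; most results were just copies of the W3School documentation. I also tried to implement the traversal in Java myself, but some of the explanations did not make sense to me until I came across this blog post by a foreign author, which made everything clear at once. Many thanks to him.
His original text follows. I tested several of the examples and they all work; since everyone reads English, I did not translate it.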
XPath is a language for finding information in an XML file. You can say that XPath is (sort of) SQL for XML files. XPath is used to navigate through elements and attributes in an XML document. You can also use XPath to traverse through an XML file in Java.
XPath comes with powerful expressions that can be used to parse an xml document and retrieve relevant information.
For this demo, let us consider an XML file that holds employee information.
<?xml version="1.0"?>
<Employees>
    <Employee emplid="1111" type="admin">
        <firstname>John</firstname>
        <lastname>Watson</lastname>
        <age>30</age>
        <email>johnwatson@sh.com</email>
    </Employee>
    <Employee emplid="2222" type="admin">
        <firstname>Sherlock</firstname>
        <lastname>Homes</lastname>
        <age>32</age>
        <email>sherlock@sh.com</email>
    </Employee>
    <Employee emplid="3333" type="user">
        <firstname>Jim</firstname>
        <lastname>Moriarty</lastname>
        <age>52</age>
        <email>jim@sh.com</email>
    </Employee>
    <Employee emplid="4444" type="user">
        <firstname>Mycroft</firstname>
        <lastname>Holmes</lastname>
        <age>41</age>
        <email>mycroft@sh.com</email>
    </Employee>
</Employees>
I have saved this file at path C:\employees.xml. We will use this XML file in our demo and try to fetch useful information from it using XPath. Before we start, let's check a few facts about the XML file above.
- There are 4 employees in our XML file.
- Each employee has a unique employee id defined by the attribute emplid.
- Each employee also has an attribute type, which defines whether the employee is an admin or a user.
- Each employee has four child nodes: firstname, lastname, age and email.
- Age is a number.
Let’s get started…
1. Learning Java DOM Parsing API
In order to understand XPath, we first need to understand the basics of DOM parsing in Java. Java provides a powerful DOM parser implementation through the API below.
1.1 Creating a Java DOM XML Parser
First, we need to create a document builder using the DocumentBuilderFactory class. Just follow the code; it is pretty much self-explanatory.
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
//...
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
    builder = builderFactory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
    e.printStackTrace();
}
1.2 Parsing XML with a Java DOM Parser
Once we have a document builder object, we use it to parse the XML file and create a document object.
import org.w3c.dom.Document;
import java.io.FileInputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
//...
try {
    Document document = builder.parse(new FileInputStream("c:\\employees.xml"));
} catch (SAXException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
In the above code, we parse an XML file from the filesystem. Sometimes you might want to parse XML given as a String value instead of reading it from a file. The code below comes in handy for that.
import java.io.ByteArrayInputStream;
//...
String xml = ...;
Document xmlDocument = builder.parse(new ByteArrayInputStream(xml.getBytes()));
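As a minimal, self-contained sketch of the String case: the class name ParseXmlString, its helper methods and the inline XML are made up for illustration and are not part of the original article. It encodes the string as UTF-8 explicitly, assuming the XML carries no conflicting encoding declaration, so the result does not depend on the platform default charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class ParseXmlString {

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            // Encode explicitly as UTF-8 instead of relying on the platform default charset.
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the tag name of the document's root element.
    static String rootName(String xml) {
        return parse(xml).getDocumentElement().getNodeName();
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee emplid=\"1111\" type=\"admin\">"
                + "<firstname>John</firstname></Employee></Employees>";
        System.out.println(rootName(xml)); // prints "Employees"
    }
}
```

Wrapping the checked exceptions in an unchecked one is only a convenience for the sketch; real code may prefer to let them propagate.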
1.3 Creating an XPath object
Once we have a document object, we are ready to use XPath. Just create an XPath object using XPathFactory.
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
//...
XPath xPath = XPathFactory.newInstance().newXPath();
1.4 Using XPath to parse the XML
Use the XPath object to compile an XPath expression and evaluate it on the document. In the code below we read the email address of the employee with employee id 3333. The code also shows the APIs for reading an XML node and a node list.
import javax.xml.xpath.XPathConstants;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
//...
String expression = "/Employees/Employee[@emplid='3333']/email";

//read a string value
String email = xPath.compile(expression).evaluate(xmlDocument);

//read an xml node using xpath
Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);

//read a nodelist using xpath
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
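Besides NODE and NODESET, XPathConstants also defines STRING, NUMBER and BOOLEAN return types. The following is a rough sketch of how they behave; the class XPathReturnTypes, its helper methods and the trimmed inline XML are assumptions made for illustration, not code from the original article.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class XPathReturnTypes {

    // Trimmed copy of the sample data, inlined so the sketch runs without a file.
    static final String XML = "<Employees>"
            + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname><age>32</age></Employee>"
            + "<Employee emplid=\"3333\" type=\"user\"><firstname>Jim</firstname><age>52</age></Employee>"
            + "</Employees>";

    // Evaluates an expression against the sample document with the requested return type.
    static Object evaluate(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.compile(expression).evaluate(doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    static String asString(String expression) {
        return (String) evaluate(expression, XPathConstants.STRING);
    }

    static double asNumber(String expression) {
        return (Double) evaluate(expression, XPathConstants.NUMBER);
    }

    static boolean asBoolean(String expression) {
        return (Boolean) evaluate(expression, XPathConstants.BOOLEAN);
    }

    static int nodeCount(String expression) {
        return ((NodeList) evaluate(expression, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) {
        System.out.println(asString("/Employees/Employee[@emplid='3333']/firstname")); // Jim
        System.out.println(asNumber("/Employees/Employee[@emplid='3333']/age"));       // 52.0
        System.out.println(asBoolean("/Employees/Employee[@emplid='9999']"));          // false
        System.out.println(nodeCount("/Employees/Employee"));                          // 2
    }
}
```

BOOLEAN is handy for existence checks: an empty node set evaluates to false, so there is no need to fetch a node and compare it with null.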
2. Learning XPath Expressions
As mentioned above, XPath uses path expressions to select a node or a list of nodes from an XML document. Here is a list of useful paths and expressions that can be used to select any node or node list from an XML document.
Expression | Description
nodename | Selects all nodes with the name "nodename"
/ | Selects from the root node
// | Selects nodes in the document from the current node that match the selection no matter where they are
. | Selects the current node
.. | Selects the parent of the current node
@ | Selects attributes
employee | Selects all nodes with the name "employee"
employees/employee | Selects all employee elements that are children of employees
//employee | Selects all employee elements no matter where they are in the document
XPath names are case-sensitive, so for the sample file above these expressions have to be written with Employees and Employee.
The expressions listed below are called predicates. Predicates are written in square brackets [ ... ]. They are used to find a specific node or a node that contains a specific value.
Expression | Description
/employees/employee[1] | Selects the first employee element that is the child of the employees element
/employees/employee[last()] | Selects the last employee element that is the child of the employees element
/employees/employee[last()-1] | Selects the last but one employee element that is the child of the employees element
//employee[@type='admin'] | Selects all the employee elements that have an attribute named type with a value of 'admin'
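To try the path expressions and predicates listed above against the sample data, a small sketch like the following may help. The class XPathPaths, the select helper and the trimmed inline XML are hypothetical names introduced here; note that the element names are written with the capitalisation used in the sample file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class XPathPaths {

    // Trimmed copy of the sample data (firstname only), inlined so the sketch runs without a file.
    static final String XML = "<Employees>"
            + "<Employee emplid=\"1111\" type=\"admin\"><firstname>John</firstname></Employee>"
            + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname></Employee>"
            + "<Employee emplid=\"3333\" type=\"user\"><firstname>Jim</firstname></Employee>"
            + "<Employee emplid=\"4444\" type=\"user\"><firstname>Mycroft</firstname></Employee>"
            + "</Employees>";

    // Returns the text content of every node selected by the expression.
    static List<String> select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.compile(expression)
                    .evaluate(doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//employee"));                              // [] (names are case-sensitive)
        System.out.println(select("//Employee/firstname"));                    // [John, Sherlock, Jim, Mycroft]
        System.out.println(select("/Employees/Employee[1]/firstname"));        // [John]
        System.out.println(select("/Employees/Employee[last()]/firstname"));   // [Mycroft]
        System.out.println(select("/Employees/Employee[last()-1]/firstname")); // [Jim]
        System.out.println(select("//Employee[@type='admin']/firstname"));     // [John, Sherlock]
        System.out.println(select("//firstname[.='Jim']/../@emplid"));         // [3333]
    }
}
```

The last expression combines ., .. and @: it finds the firstname whose own value is Jim, steps up to its parent Employee, and selects that element's emplid attribute.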
There are other useful expressions that you can use to query the data.
Read this w3school page for more details: http://www.w3schools.com/xpath/xpath_syntax.asp
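Among those other expressions are the XPath 1.0 core functions such as count(), sum(), contains() and starts-with(). Here is a hedged sketch of a few of them; the class XPathFunctions, its helpers and the trimmed inline XML are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class XPathFunctions {

    // Trimmed copy of the sample data, inlined so the sketch runs without a file.
    static final String XML = "<Employees>"
            + "<Employee emplid=\"1111\" type=\"admin\"><firstname>John</firstname>"
            + "<age>30</age><email>johnwatson@sh.com</email></Employee>"
            + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname>"
            + "<age>32</age><email>sherlock@sh.com</email></Employee>"
            + "<Employee emplid=\"3333\" type=\"user\"><firstname>Jim</firstname>"
            + "<age>52</age><email>jim@sh.com</email></Employee>"
            + "<Employee emplid=\"4444\" type=\"user\"><firstname>Mycroft</firstname>"
            + "<age>41</age><email>mycroft@sh.com</email></Employee>"
            + "</Employees>";

    // Evaluates an expression against the sample document with the requested return type.
    static Object evaluate(String expression, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.compile(expression).evaluate(doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    static double number(String expression) {
        return (Double) evaluate(expression, XPathConstants.NUMBER);
    }

    static String string(String expression) {
        return (String) evaluate(expression, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        // Number of Employee elements.
        System.out.println(number("count(/Employees/Employee)"));                        // 4.0
        // Sum of all ages: 30 + 32 + 52 + 41.
        System.out.println(number("sum(/Employees/Employee/age)"));                      // 155.0
        // Users older than 40, combining two conditions with "and".
        System.out.println(number("count(//Employee[@type='user' and age>40])"));        // 2.0
        // Substring and prefix tests inside predicates.
        System.out.println(string("//Employee[contains(email,'sherlock')]/firstname"));  // Sherlock
        System.out.println(number("//Employee[starts-with(firstname,'My')]/age"));       // 41.0
    }
}
```

Since count() and sum() return numbers rather than node sets, they are evaluated with XPathConstants.NUMBER and come back as Double values.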
3. Examples: Query XML document using XPath
Below are a few examples that use different XPath expressions to fetch information from the XML document.
3.1 Read firstname of all employees
The expression below reads the firstname of all the employees.
String expression = "/Employees/Employee/firstname";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}
Output:
John
Sherlock
Jim
Mycroft
3.2 Read a specific employee using employee id
The expression below reads the employee information for the employee with emplid = 2222. Note how we use the API to retrieve the node and then traverse it to print each XML tag and its value.
String expression = "/Employees/Employee[@emplid='2222']";
System.out.println(expression);
Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
if (null != node) {
    NodeList nodeList = node.getChildNodes();
    for (int i = 0; null != nodeList && i < nodeList.getLength(); i++) {
        Node nod = nodeList.item(i);
        // Skip the whitespace text nodes between the child elements.
        if (nod.getNodeType() == Node.ELEMENT_NODE)
            System.out.println(nodeList.item(i).getNodeName() + " : " + nod.getFirstChild().getNodeValue());
    }
}
Output:
firstname : Sherlock
lastname : Homes
age : 32
email : sherlock@sh.com
3.3 Read firstname of all employees who are admin
This is again a predicate example, reading the firstname of all employees who are admins (defined by type='admin').
String expression = "/Employees/Employee[@type='admin']/firstname";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}
Output:
John
Sherlock
3.4 Read firstname of all employees who are older than 40 years
See how we use a predicate to filter employees whose age > 40.
String expression = "/Employees/Employee[age>40]/firstname";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}
Output:
Jim
Mycroft
3.5 Read firstname of first two employees (defined in xml file)
Within predicates, you can use position() to identify the position of an XML element. Here we select the first two employees using position().
String expression = "/Employees/Employee[position() <= 2]/firstname";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
    System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}
Output:
John
Sherlock
4. Complete Java source code
To run this code, create a basic Java project in your IDE, or save the code below as Main.java and execute it. It needs the employees.xml file as input. Copy the employee XML defined at the start of this tutorial to c:\employees.xml.
package net.viralpatel.java;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class Main {

    public static void main(String[] args) {
        try {
            FileInputStream file = new FileInputStream(new File("c:/employees.xml"));
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            Document xmlDocument = builder.parse(file);
            XPath xPath = XPathFactory.newInstance().newXPath();

            System.out.println("*************************");
            String expression = "/Employees/Employee[@emplid='3333']/email";
            System.out.println(expression);
            String email = xPath.compile(expression).evaluate(xmlDocument);
            System.out.println(email);

            System.out.println("*************************");
            expression = "/Employees/Employee/firstname";
            System.out.println(expression);
            NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[@type='admin']/firstname";
            System.out.println(expression);
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[@emplid='2222']";
            System.out.println(expression);
            Node node = (Node) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODE);
            if (null != node) {
                nodeList = node.getChildNodes();
                for (int i = 0; null != nodeList && i < nodeList.getLength(); i++) {
                    Node nod = nodeList.item(i);
                    if (nod.getNodeType() == Node.ELEMENT_NODE)
                        System.out.println(nodeList.item(i).getNodeName() + " : " + nod.getFirstChild().getNodeValue());
                }
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[age>40]/firstname";
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            System.out.println(expression);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[1]/firstname";
            System.out.println(expression);
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[position() <= 2]/firstname";
            System.out.println(expression);
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }

            System.out.println("*************************");
            expression = "/Employees/Employee[last()]/firstname";
            System.out.println(expression);
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
            }
            System.out.println("*************************");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
    }
}
That’s all folks :)
Working through the examples above, I thought of a case the article does not cover, so I experimented with it and also read the XPath source code. As far as I could tell, you still have to walk through the nodes one by one. Here is a clumsy approach as a demonstration; if anyone knows a better way, please tell me so we can all share it.
My test code:
package com.fit.test01;

import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XPathEmployee {

    public static void main(String[] args) {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docbuilder;
        InputStream is = null;
        try {
            is = XPathEmployee.class.getClassLoader().getResourceAsStream("employees.xml");
            // You can obtain either the file's path or a stream for it; both work,
            // and the file should be under the src directory of the current project.
            // String strFilePath =
            // XPathEmployee.class.getClassLoader().getResource("employees.xml").toString();
            docbuilder = builderFactory.newDocumentBuilder();
            Document xmlDocument = docbuilder.parse(is);
            XPath xPath = XPathFactory.newInstance().newXPath();
            Node node = null;
            NodeList nodeList = null;
            String expression = "/Employees/Employee[@type='admin']";
            nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
            for (int i = 0; i < nodeList.getLength(); i++) {
                node = nodeList.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println(node.getNodeName() + " : " + node.getFirstChild().getNodeValue());
                    // At this point we are only at the Employee level, so we need to go one level deeper.
                    if (node.hasChildNodes()) {
                        System.out.println("----------------");
                        NodeList nodeList1 = node.getChildNodes();
                        for (int j = 0; j < nodeList1.getLength(); j++) {
                            Node node1 = nodeList1.item(j);
                            if (node1.getNodeType() == Node.ELEMENT_NODE) {
                                System.out.println(node1.getNodeName() + " : " + node1.getFirstChild().getNodeValue());
                            }
                        }
                    }
                }
            }
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e1) {
            e1.printStackTrace();
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        } finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
The output is as follows (I mainly wanted to test the case of multiple nodes: his examples all narrow the query down by attribute so the result is a single node, but what if there are several? See the approach above):
Employee :
----------------
firstname : John
lastname : Watson
age : 30
email : johnwatson@sh.com
Employee :
----------------
firstname : Sherlock
lastname : Homes
age : 32
email : sherlock@sh.com
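For the multi-node case, one possible alternative to walking the child nodes by hand is to evaluate relative XPath expressions against each matched Employee node, since XPath.evaluate accepts any node as its context. The following is only a sketch: the class RelativeXPath, the summaries method and the trimmed inline XML are made up for illustration, and the type value is concatenated into the expression without any escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class RelativeXPath {

    // Trimmed copy of the sample data, inlined so the sketch runs without a file.
    static final String XML = "<Employees>"
            + "<Employee emplid=\"1111\" type=\"admin\"><firstname>John</firstname><lastname>Watson</lastname></Employee>"
            + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname><lastname>Homes</lastname></Employee>"
            + "<Employee emplid=\"3333\" type=\"user\"><firstname>Jim</firstname><lastname>Moriarty</lastname></Employee>"
            + "<Employee emplid=\"4444\" type=\"user\"><firstname>Mycroft</firstname><lastname>Holmes</lastname></Employee>"
            + "</Employees>";

    // Builds one "id: firstname lastname" line per employee of the given type.
    static List<String> summaries(String type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Sketch only: the type value is inserted as-is, with no quoting or escaping.
            NodeList employees = (NodeList) xPath.evaluate(
                    "/Employees/Employee[@type='" + type + "']", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < employees.getLength(); i++) {
                Node employee = employees.item(i);
                // Relative expressions are resolved against the current Employee node.
                String id = xPath.evaluate("@emplid", employee);
                String first = xPath.evaluate("firstname", employee);
                String last = xPath.evaluate("lastname", employee);
                lines.add(id + ": " + first + " " + last);
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(summaries("admin")); // [1111: John Watson, 2222: Sherlock Homes]
        System.out.println(summaries("user"));  // [3333: Jim Moriarty, 4444: Mycroft Holmes]
    }
}
```

If all that is needed is a flat list of every child element, the expression /Employees/Employee[@type='admin']/* selects them directly in a single query, at the cost of losing the grouping by employee.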