Parse XML using Dom4j (repost)


Parsing XML

One of the first things you'll probably want to do is parse an XML document of some kind. This is easy to do in dom4j. The following code demonstrates how to do this.

import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class Foo {

    public Document parse(URL url) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(url);
        return document;
    }
}
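The same reader can also parse XML that is already in memory, which is handy for trying things out without a network connection. The following is a minimal sketch, assuming dom4j is on the classpath; the class name `ParseFromReader` and the sample XML are made up for illustration. It relies on the `SAXReader.read(Reader)` overload.

```java
import java.io.StringReader;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Hypothetical helper class, not part of dom4j
public class ParseFromReader {

    // Parses XML held in a String by wrapping it in a StringReader
    public static Document parse(String xml) throws DocumentException {
        SAXReader reader = new SAXReader();
        return reader.read(new StringReader(xml));
    }

    public static void main(String[] args) throws DocumentException {
        Document document = parse("<root><foo/></root>");
        System.out.println(document.getRootElement().getName()); // prints "root"
    }
}
```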

Using Iterators

A document can be navigated using a variety of methods that return standard Java Iterators. For example:

    public void bar(Document document) throws DocumentException {

        Element root = document.getRootElement();

        // iterate through child elements of root
        for ( Iterator i = root.elementIterator(); i.hasNext(); ) {
            Element element = (Element) i.next();
            // do something
        }

        // iterate through child elements of root with element name "foo"
        for ( Iterator i = root.elementIterator( "foo" ); i.hasNext(); ) {
            Element foo = (Element) i.next();
            // do something
        }

        // iterate through attributes of root 
        for ( Iterator i = root.attributeIterator(); i.hasNext(); ) {
            Attribute attribute = (Attribute) i.next();
            // do something
        }
     }
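As a complement to the explicit iterators above, here is a minimal sketch that uses `Element.elements(String)`, which returns the matching children as a `java.util.List`, in an enhanced for loop. The class name `ChildCounter` and the sample XML are made up for illustration. The loop iterates over `Object` and casts, so it should compile against both the raw-typed and the generic versions of the API.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical helper class, not part of dom4j
public class ChildCounter {

    // Joins the text of every direct child of root that has the given name
    public static String joinChildText(Element root, String name) {
        StringBuilder text = new StringBuilder();
        for (Object child : root.elements(name)) {
            Element element = (Element) child;
            if (text.length() > 0) {
                text.append(",");
            }
            text.append(element.getText());
        }
        return text.toString();
    }

    public static void main(String[] args) throws DocumentException {
        Document document = DocumentHelper.parseText(
            "<root><foo>a</foo><bar>x</bar><foo>b</foo></root>");
        System.out.println(joinChildText(document.getRootElement(), "foo")); // prints "a,b"
    }
}
```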

Powerful Navigation with XPath

In dom4j, XPath expressions can be evaluated on the Document or on any Node in the tree (such as Attribute, Element or ProcessingInstruction). This allows complex navigation throughout the document with a single line of code. For example:

    public void bar(Document document) {

        // Get the value of the node using XPath
        List list = document.selectNodes( "//foo/bar" );

        Node node = document.selectSingleNode( "//foo/bar/author" );

        String name = node.valueOf( "@name" );
    }

For example, if you wish to find all the hypertext links in an XHTML document, the following code would do the trick.

    public void findLinks(Document document) throws DocumentException {

        List list = document.selectNodes( "//a/@href" );

        for (Iterator iter = list.iterator(); iter.hasNext(); ) {
            Attribute attribute = (Attribute) iter.next();
            String url = attribute.getValue();
        }
    }

If you need any help learning the XPath language, we highly recommend the Zvon tutorial, which allows you to learn by example.

Fast Looping

If you ever have to walk a large XML document tree, then for performance we recommend the fast looping method, which avoids the cost of creating an Iterator object for each loop. For example:

    public void treeWalk(Document document) {
        treeWalk( document.getRootElement() );
    }

    public void treeWalk(Element element) {
        for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
            Node node = element.node(i);
            if ( node instanceof Element ) {
                treeWalk( (Element) node );
            }
            else {
                // do something....
            }
        }
    }
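To make the "do something" step concrete, the sketch below uses the same index-based loop to count the elements in a tree. The class name `ElementCounter` and the sample XML are assumptions for illustration only; the loop itself uses just `nodeCount()` and `node(int)`, as in the example above.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

// Hypothetical helper class, not part of dom4j
public class ElementCounter {

    // Counts the given element plus all of its descendant elements,
    // walking the children by index instead of creating an Iterator
    public static int countElements(Element element) {
        int count = 1; // the element itself
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                count += countElements((Element) node);
            }
            // text, comments and other node types are skipped
        }
        return count;
    }

    public static void main(String[] args) throws DocumentException {
        Document document = DocumentHelper.parseText("<a><b><c/></b><d>text</d></a>");
        System.out.println(countElements(document.getRootElement())); // prints 4
    }
}
```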

Creating a new XML document

Often in dom4j you will need to create a new document from scratch. Here's an example of doing that.

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement( "root" );

        Element author1 = root.addElement( "author" )
            .addAttribute( "name", "James" )
            .addAttribute( "location", "UK" )
            .addText( "James Strachan" );
        
        Element author2 = root.addElement( "author" )
            .addAttribute( "name", "Bob" )
            .addAttribute( "location", "US" )
            .addText( "Bob McWhirter" );

        return document;
    }
}

Writing a document to a file

A quick and easy way to write a Document (or any Node) to a Writer is via the write() method.

  FileWriter out = new FileWriter( "foo.xml" );
  document.write( out );
  // close the writer so that buffered output reaches the file
  out.close();

If you want to be able to change the format of the output, such as pretty printing or a compact format, or you want to be able to work with Writer objects or OutputStream objects as the destination, then you can use the XMLWriter class.

import java.io.FileWriter;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class Foo {

    public void write(Document document) throws IOException {

        // let's write to a file
        XMLWriter writer = new XMLWriter(
            new FileWriter( "output.xml" )
        );
        writer.write( document );
        writer.close();


        // Pretty print the document to System.out
        OutputFormat format = OutputFormat.createPrettyPrint();
        writer = new XMLWriter( System.out, format );
        writer.write( document );

        // Compact format to System.out
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter( System.out, format );
        writer.write( document );
    }
}
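Because XMLWriter accepts any Writer, the same approach can produce formatted XML as a String, for example for logging. This is a minimal sketch under that assumption; the class name `PrettyString` and the sample document are made up for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Hypothetical helper class, not part of dom4j
public class PrettyString {

    // Renders the document as pretty-printed XML text held in memory
    public static String toPrettyString(Document document) throws IOException {
        StringWriter buffer = new StringWriter();
        XMLWriter writer = new XMLWriter(buffer, OutputFormat.createPrettyPrint());
        writer.write(document);
        writer.close(); // closes the underlying StringWriter; its contents remain readable
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        Document document = DocumentHelper.createDocument();
        document.addElement("root").addElement("author").addText("James Strachan");
        System.out.println(toPrettyString(document));
    }
}
```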

Converting to and from Strings

If you have a reference to a Document or any other Node such as an Attribute or Element, you can turn it into the default XML text via the asXML() method.

        Document document = ...;
        String text = document.asXML();

If you have some XML as a String you can parse it back into a Document again using the helper method DocumentHelper.parseText()

        String text = "<person> <name>James</name> </person>";
        Document document = DocumentHelper.parseText(text);
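The two conversions can be combined into a round trip. The sketch below parses a String, changes one element, serializes the result with asXML(), and parses that text again to check that the change survived. The class name `RoundTrip` and the method `renameAndReparse` are hypothetical names chosen for this illustration.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Hypothetical helper class, not part of dom4j
public class RoundTrip {

    // Parses the text, replaces the content of <name>, serializes the
    // document and parses it again, returning the <name> text that was read back
    public static String renameAndReparse(String xml, String newName) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        document.getRootElement().element("name").setText(newName);

        String serialized = document.asXML();
        Document reparsed = DocumentHelper.parseText(serialized);
        return reparsed.getRootElement().elementText("name");
    }

    public static void main(String[] args) throws DocumentException {
        String text = "<person> <name>James</name> </person>";
        System.out.println(renameAndReparse(text, "Bob")); // prints "Bob"
    }
}
```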

Styling a Document with XSLT

Applying XSLT on a Document is quite straightforward using the JAXP API from Sun. This allows you to work against any XSLT engine such as Xalan or SAXON. Here is an example of using JAXP to create a transformer and then applying it to a Document.

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

public class Foo {

    public Document styleDocument(
        Document document, 
        String stylesheet
    ) throws Exception {

        // load the transformer using JAXP
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer( 
            new StreamSource( stylesheet ) 
        );

        // now let's style the given document
        DocumentSource source = new DocumentSource( document );
        DocumentResult result = new DocumentResult();
        transformer.transform( source, result );

        // return the transformed document
        Document transformedDoc = result.getDocument();
        return transformedDoc;
    }
}
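For experimenting, the stylesheet does not have to live in a file. The following sketch applies the same DocumentSource and DocumentResult pattern with a stylesheet held in a String. The stylesheet, the class name `AuthorCount` and the sample document are all made up for illustration, and the sketch assumes the XSLT engine found by JAXP (typically the one bundled with the JDK) accepts the SAX-based source and result that dom4j provides.

```java
import java.io.StringReader;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

// Hypothetical helper class, not part of dom4j
public class AuthorCount {

    // Hypothetical stylesheet: replaces <root> with <authors>N</authors>,
    // where N is the number of <author> children
    private static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/root'>"
        + "<authors><xsl:value-of select='count(author)'/></authors>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static Document transform(Document document) throws Exception {
        // load the transformer from the in-memory stylesheet
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));

        // transform a dom4j Document into a new dom4j Document
        DocumentResult result = new DocumentResult();
        transformer.transform(new DocumentSource(document), result);
        return result.getDocument();
    }

    public static void main(String[] args) throws Exception {
        Document document = DocumentHelper.parseText(
            "<root><author>James Strachan</author><author>Bob McWhirter</author></root>");
        System.out.println(transform(document).asXML());
    }
}
```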


zhyiwww 2006-10-24 19:26