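I recently ran into a problem while using DOM4J. I now have two ways to solve it and am noting them down for future use.
Problem: when reading and rewriting an XML file that already exists, the Document object that DOM4J builds has extra default attributes appended according to the definitions in the DTD (they are not actually needed). Reading the file also takes longer, because the parser loads the external DTD.
Take the following XML file: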
- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
- <beans>
- ....
- </beans>
The usual way to read a file with DOM4J:
- SAXReader reader = new SAXReader(false); // false: do not validate against the DTD
- Document document = reader.read(file);
- Element root = document.getRootElement();
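The nodes read into the document object automatically gain the default attributes defined in the DTD; only newly added nodes are unaffected, as shown below. The file operation also takes longer.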
- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
- <beans default-lazy-init="false" default-autowire="no" default-dependency-check="none">
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase" lazy-init="default" autowire="default" dependency-check="default"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase"/>
- <bean id="OperateXmlByDom4jTestCase" class="test.OperateXmlByDom4jTestCase"/>
- </beans>
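The behavior shown above is not specific to DOM4J: a non-validating XML parser still applies the attribute defaults declared in any DTD it reads. The sketch below reproduces the effect with only the JDK's built-in SAX parser and a one-attribute DTD placed in the internal subset, so it runs offline. The class name, the `rootAttributes` helper, and the tiny DTD are illustrative assumptions and are not part of the original test case.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DtdDefaultAttributesDemo {
    // Hypothetical sample: the DTD declares a default for default-lazy-init,
    // but the <beans/> element itself carries no attributes.
    static final String SAMPLE =
        "<!DOCTYPE beans [\n"
        + "  <!ELEMENT beans EMPTY>\n"
        + "  <!ATTLIST beans default-lazy-init (true|false) \"false\">\n"
        + "]>\n"
        + "<beans/>";

    /** Returns the attributes the parser reports for the root element. */
    static Map<String, String> rootAttributes(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false); // same setting as new SAXReader(false)
        final Map<String, String> result = new LinkedHashMap<>();
        final boolean[] seenRoot = {false};
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (seenRoot[0]) {
                        return; // only the root element is of interest
                    }
                    seenRoot[0] = true;
                    for (int i = 0; i < atts.getLength(); i++) {
                        result.put(atts.getQName(i), atts.getValue(i));
                    }
                }
            });
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The defaulted attribute is reported although the element never wrote it.
        System.out.println(rootAttributes(SAMPLE));
    }
}
```

Because DOM4J builds its tree from the same SAX events, the defaulted attributes end up in the Document and are written back out when the file is saved.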
To avoid generating the unwanted default attributes and to shorten the file operation, we can call SAXReader.setFeature to change DOM4J's behavior. Code fragment:
- // Feature URI: "http://apache.org/xml/features/nonvalidating/load-external-dtd"
- saxReader.setFeature(
- Constants.XERCES_FEATURE_PREFIX + Constants.LOAD_EXTERNAL_DTD_FEATURE,
- false);
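The `Constants` class used above lives in a JDK-internal package and may not be accessible on newer, modular JDKs, so passing the feature URI as a plain string is a common alternative. The following is a minimal sketch, assuming the JDK's bundled SAX parser and a small temporary DTD file standing in for the Spring DTD; `LoadExternalDtdDemo`, `writeSampleDtd`, and `rootAttributeCount` are hypothetical names introduced for illustration. It counts the attributes reported on the root element with the feature switched on and off.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LoadExternalDtdDemo {
    // Same value as XERCES_FEATURE_PREFIX + LOAD_EXTERNAL_DTD_FEATURE.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Writes a tiny stand-in DTD to a temp file and returns its file: URI. */
    static String writeSampleDtd() throws Exception {
        Path dtd = Files.createTempFile("beans", ".dtd");
        dtd.toFile().deleteOnExit();
        String content = "<!ELEMENT beans EMPTY>\n"
            + "<!ATTLIST beans default-lazy-init CDATA \"false\">\n";
        Files.write(dtd, content.getBytes(StandardCharsets.UTF_8));
        return dtd.toUri().toString();
    }

    /** Parses a document whose DOCTYPE points at dtdUri; counts root attributes. */
    static int rootAttributeCount(String dtdUri, boolean loadExternalDtd)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, loadExternalDtd);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        final int[] count = {-1};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (count[0] < 0) {
                    count[0] = atts.getLength(); // record the root element only
                }
            }
        });
        String xml = "<!DOCTYPE beans SYSTEM \"" + dtdUri + "\"><beans/>";
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String dtdUri = writeSampleDtd();
        System.out.println("DTD loaded:  " + rootAttributeCount(dtdUri, true));
        System.out.println("DTD skipped: " + rootAttributeCount(dtdUri, false));
    }
}
```

With DOM4J the same string can be passed directly, as in `saxReader.setFeature(LOAD_EXTERNAL_DTD, false)`, which avoids the import of the internal `Constants` class.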
For more features, see com.sun.org.apache.xerces.internal.impl.Constants.
An excerpt follows:
- // xerces features
- /** Xerces features prefix ("http://apache.org/xml/features/"). */
- public static final String XERCES_FEATURE_PREFIX = "http://apache.org/xml/features/";
- /** Schema validation feature ("validation/schema"). */
- public static final String SCHEMA_VALIDATION_FEATURE = "validation/schema";
- /** Expose schema normalized values */
- public static final String SCHEMA_NORMALIZED_VALUE = "validation/schema/normalized-value";
- /** Send schema default value via characters() */
- public static final String SCHEMA_ELEMENT_DEFAULT = "validation/schema/element-default";
- /** Schema full constraint checking ("validation/schema-full-checking"). */
- public static final String SCHEMA_FULL_CHECKING = "validation/schema-full-checking";
- /** Augment Post-Schema-Validation-Infoset */
- public static final String SCHEMA_AUGMENT_PSVI = "validation/schema/augment-psvi";
- /** Dynamic validation feature ("validation/dynamic"). */
- public static final String DYNAMIC_VALIDATION_FEATURE = "validation/dynamic";
- /** Warn on duplicate attribute declaration feature ("validation/warn-on-duplicate-attdef"). */
- public static final String WARN_ON_DUPLICATE_ATTDEF_FEATURE = "validation/warn-on-duplicate-attdef";
- /** Warn on undeclared element feature ("validation/warn-on-undeclared-elemdef"). */
- public static final String WARN_ON_UNDECLARED_ELEMDEF_FEATURE = "validation/warn-on-undeclared-elemdef";
- /** Warn on duplicate entity declaration feature ("warn-on-duplicate-entitydef"). */
- public static final String WARN_ON_DUPLICATE_ENTITYDEF_FEATURE = "warn-on-duplicate-entitydef";
- /** Allow Java encoding names feature ("allow-java-encodings"). */
- public static final String ALLOW_JAVA_ENCODINGS_FEATURE = "allow-java-encodings";
- /** Disallow DOCTYPE declaration feature ("disallow-doctype-decl"). */
- public static final String DISALLOW_DOCTYPE_DECL_FEATURE = "disallow-doctype-decl";
- /** Continue after fatal error feature ("continue-after-fatal-error"). */
- public static final String CONTINUE_AFTER_FATAL_ERROR_FEATURE = "continue-after-fatal-error";
- /** Load dtd grammar when nonvalidating feature ("nonvalidating/load-dtd-grammar"). */
- public static final String LOAD_DTD_GRAMMAR_FEATURE = "nonvalidating/load-dtd-grammar";
- /** Load external dtd when nonvalidating feature ("nonvalidating/load-external-dtd"). */
- public static final String LOAD_EXTERNAL_DTD_FEATURE = "nonvalidating/load-external-dtd";
- /** Defer node expansion feature ("dom/defer-node-expansion"). */
- public static final String DEFER_NODE_EXPANSION_FEATURE = "dom/defer-node-expansion";
- /** Create entity reference nodes feature ("dom/create-entity-ref-nodes"). */
- public static final String CREATE_ENTITY_REF_NODES_FEATURE = "dom/create-entity-ref-nodes";
- /** Include ignorable whitespace feature ("dom/include-ignorable-whitespace"). */
- public static final String INCLUDE_IGNORABLE_WHITESPACE = "dom/include-ignorable-whitespace";
- /** Default attribute values feature ("validation/default-attribute-values"). */
- public static final String DEFAULT_ATTRIBUTE_VALUES_FEATURE = "validation/default-attribute-values";
- /** Validate content models feature ("validation/validate-content-models"). */
- public static final String VALIDATE_CONTENT_MODELS_FEATURE = "validation/validate-content-models";
- /** Validate datatypes feature ("validation/validate-datatypes"). */
- public static final String VALIDATE_DATATYPES_FEATURE = "validation/validate-datatypes";
- /** Notify character references feature (scanner/notify-char-refs"). */
- public static final String NOTIFY_CHAR_REFS_FEATURE = "scanner/notify-char-refs";
- /** Notify built-in (&, etc.) references feature (scanner/notify-builtin-refs"). */
- public static final String NOTIFY_BUILTIN_REFS_FEATURE = "scanner/notify-builtin-refs";
- /** Standard URI conformant feature ("standard-uri-conformant"). */
- public static final String STANDARD_URI_CONFORMANT_FEATURE = "standard-uri-conformant";
- /** Internal performance related feature:
- * false - the parser settings (features/properties) have not changed between 2 parses
- * true - the parser settings have changed between 2 parses
- * NOTE: this feature should only be set by the parser configuration.
- */
- public static final String PARSER_SETTINGS = "internal/parser-settings";
- /** Feature to make XML Processor XInclude Aware */
- public static final String XINCLUDE_AWARE = "xinclude-aware";
- /** Ignore xsi:schemaLocation and xsi:noNamespaceSchemaLocation. */
- public static final String IGNORE_SCHEMA_LOCATION_HINTS = "validation/schema/ignore-schema-location-hints";
- /**
- * When true, the schema processor will change characters events
- * to ignorableWhitespaces events, when characters are expected to
- * only contain ignorable whitespaces.
- */
- public static final String CHANGE_IGNORABLE_CHARACTERS_INTO_IGNORABLE_WHITESPACES =
- "validation/change-ignorable-characters-into-ignorable-whitespaces";
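Each constant in the excerpt is a suffix that has to be joined to `XERCES_FEATURE_PREFIX` to form the full feature URI. Whether a given parser recognizes a URI can be checked at runtime before relying on it. The sketch below assumes the JDK's bundled SAX parser; `FeatureProbeDemo` and `isRecognized` are hypothetical names, and the two constants are copied from the excerpt rather than imported from the internal class.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureProbeDemo {
    // Copied from the Constants excerpt so no internal class is needed.
    static final String XERCES_FEATURE_PREFIX = "http://apache.org/xml/features/";
    static final String LOAD_EXTERNAL_DTD_FEATURE = "nonvalidating/load-external-dtd";

    /** Returns true when the parser knows the given feature URI. */
    static boolean isRecognized(String featureUri) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        try {
            reader.getFeature(featureUri);
            return true;
        } catch (SAXNotRecognizedException e) {
            return false; // the parser has never heard of this URI
        } catch (SAXNotSupportedException e) {
            return true;  // known URI, but its value cannot be read right now
        }
    }

    public static void main(String[] args) throws Exception {
        String uri = XERCES_FEATURE_PREFIX + LOAD_EXTERNAL_DTD_FEATURE;
        System.out.println(uri + " recognized: " + isRecognized(uri));
    }
}
```

A check like this is mainly useful when the parser implementation on the classpath is not known in advance, since the Xerces-specific URIs are not part of the SAX standard.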
Besides the SAXReader.setFeature method shown above, we can also solve the problem with our own EntityResolver.
PS: I found this method in a post titled "Do not resolve DTD files when dom4j read xml file" on the 凝香小筑 blog. Address: http://blog.csdn.net/lessoft/archive/2007/06/20/1659579.aspx
Code fragment:
- saxReader.setEntityResolver(new EntityResolver() {
- String emptyDtd = "";
- ByteArrayInputStream bytels = new ByteArrayInputStream(emptyDtd.getBytes());
- public InputSource resolveEntity(String publicId, String systemId)
- throws SAXException, IOException {
- return new InputSource(bytels);
- }
- });
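The resolver above answers every request for an external entity with an empty stream, so the parser never opens the DTD URL. A standalone variation is sketched below, assuming the JDK's bundled SAX parser; it hands out a fresh empty source on each call instead of sharing one stream, and records which system IDs the parser asked for. `EmptyDtdResolverDemo` and `parseWithEmptyDtd` are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class EmptyDtdResolverDemo {
    // Same DOCTYPE as the Spring file in the article, with an empty root element.
    static final String SAMPLE =
        "<!DOCTYPE beans PUBLIC \"-//SPRING//DTD BEAN//EN\" "
        + "\"http://www.springframework.org/dtd/spring-beans.dtd\">\n"
        + "<beans/>";

    /** Parses xml with an empty-DTD resolver; returns the system IDs requested. */
    static List<String> parseWithEmptyDtd(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final List<String> requested = new ArrayList<>();
        reader.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            // A new, empty source per call: nothing is fetched from the network.
            return new InputSource(new StringReader(""));
        });
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Intercepted: " + parseWithEmptyDtd(SAMPLE));
    }
}
```

Compared with switching off the load-external-dtd feature, a resolver relies only on the standard SAX interface, so it does not depend on the parser being Xerces.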
The complete code:
- package test;
- import java.io.BufferedWriter;
- import java.io.ByteArrayInputStream;
- import java.io.File;
- import java.io.FileWriter;
- import java.io.IOException;
- import junit.framework.TestCase;
- import org.dom4j.Document;
- import org.dom4j.DocumentHelper;
- import org.dom4j.Element;
- import org.dom4j.io.OutputFormat;
- import org.dom4j.io.SAXReader;
- import org.dom4j.io.XMLWriter;
- import org.dom4j.tree.DefaultDocumentType;
- import org.xml.sax.EntityResolver;
- import org.xml.sax.InputSource;
- import org.xml.sax.SAXException;
- import com.sun.org.apache.xerces.internal.impl.Constants;
- /**
- * A test case for reading and writing an XML file with Dom4j.
- * @author X.F.Yang [2007/07/03]
- * @version 1.0
- */
- public class OperateXmlByDom4jTestCase extends TestCase {
- /**
- * Default way to read and write an XML file with Dom4j.
- * @throws Exception
- */
- public void testWriteXml() throws Exception {
- XmlFileOperation operation = new XmlFileOperation();
- operation.writer(new SAXReaderWrapper() {
- public void operation(SAXReader saxReader) throws Exception {
- // Nothing to do.
- }
- });
- }
- /**
- * Do not resolve DTD files when Dom4j reads an XML file, by setting a parser feature.
- * @throws Exception
- */
- public void testWriteXmlSetFeature() throws Exception {
- XmlFileOperation operation = new XmlFileOperation();
- operation.writer(new SAXReaderWrapper() {
- public void operation(SAXReader saxReader) throws Exception {
- // Feature URI: "http://apache.org/xml/features/nonvalidating/load-external-dtd"
- saxReader.setFeature(
- Constants.XERCES_FEATURE_PREFIX + Constants.LOAD_EXTERNAL_DTD_FEATURE,
- false);
- }
- });
- }
- /**
- * Do not resolve DTD files when Dom4j reads an XML file, by implementing {@link EntityResolver}.
- * @throws Exception
- */
- public void testWriteXmlEntityResolver() throws Exception {
- XmlFileOperation operation = new XmlFileOperation();
- operation.writer(new SAXReaderWrapper() {
- public void operation(SAXReader saxReader) throws Exception {
- saxReader.setEntityResolver(new EntityResolver() {
- String emptyDtd = "";
- ByteArrayInputStream bytels = new ByteArrayInputStream(emptyDtd.getBytes());
- public InputSource resolveEntity(String publicId,
- String systemId) throws SAXException, IOException {
- return new InputSource(bytels);
- }
- });
- }
- });
- }
- /** Callback that configures a {@link SAXReader} before the file is read. */
- protected interface SAXReaderWrapper {
- /** Configure the given {@link SAXReader}. */
- void operation(SAXReader saxReader) throws Exception;
- }
- /**
- * When the target file exists, read it and append the new element;
- * otherwise, create a new XML file and add the new element.
- */
- protected class XmlFileOperation {
- /** target file */
- private File file;
- public XmlFileOperation() {
- // target file
- file = new File("d:\\spring.xml");
- }
- /**
- * Write the XML file.
- * @param wrapper callback used to configure the reader
- * @throws Exception
- * @see SAXReaderWrapper
- */
- public void writer(SAXReaderWrapper wrapper) throws Exception {
- try {
- Document document = null;
- Element root = null;
- // read the XML file if the target file exists
- if (file.exists()) {
- SAXReader reader = new SAXReader(false);
- wrapper.operation(reader);
- document = reader.read(file);
- root = document.getRootElement();
- // if the target file does not exist, create a new document
- } else {
- document = DocumentHelper.createDocument();
- document.setDocType(new DefaultDocumentType("beans",
- "-//SPRING//DTD BEAN//EN",
- "http://www.springframework.org/dtd/spring-beans.dtd"));
- root = document.addElement("beans");
- }
- // create the element under the root element
- root.addElement("bean")
- .addAttribute("id", "OperateXmlByDom4jTestCase")
- .addAttribute("class", "test.OperateXmlByDom4jTestCase");
- // write the document
- writer(document);
- } catch (Exception e) {
- e.printStackTrace();
- throw e;
- }
- }
- protected void writer(Document document) throws Exception {
- XMLWriter xmlWriter = null;
- try {
- final OutputFormat format = OutputFormat.createPrettyPrint();
- // write bytes so XMLWriter encodes with the format's encoding, not the platform default
- xmlWriter = new XMLWriter(new java.io.FileOutputStream(file), format);
- xmlWriter.write(document);
- } finally {
- if (null != xmlWriter) {
- xmlWriter.flush();
- xmlWriter.close();
- }
- }
- }
- }
- }