How dom4j's two DocumentFactory implementations affect performance

Author	Post
theoffspring (Posts: 218, Points: 190, From: Dalian)	Posted: 2011-11-08, last edited: 2011-11-08

dom4j offers two document factories: the default DocumentFactory and IndexedDocumentFactory. The book "Java and XML" says the latter loads element names into a Map, so element lookups perform better. My tests show, however, that using it does not automatically improve performance; it only helps under certain conditions. Here is the complete test class, including the method that generates the test data.

    package javaxml3;

    import org.dom4j.*;
    import org.dom4j.io.OutputFormat;
    import org.dom4j.io.SAXReader;
    import org.dom4j.io.XMLWriter;
    import org.dom4j.util.IndexedDocumentFactory;

    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    public class XMLReadAllNodesExample2 {
        public static final String sampleName = "e:/temp/4.xml";
        public static final int findCount = 1000;

        public static void main(String[] args) throws DocumentException, IOException {
            generateSampleFileWithDiffName(sampleName, 2000);
            System.out.println("default factory: " + testFindByDefaultFactory("25", findCount));
            System.out.println("indexed factory: " + testFindByIndexedFactory("25", findCount));
        }

        /**
         * Generate the sample XML and save it to fileName.
         *
         * @param fileName path of the file to write
         * @param amount   number of personN elements to create under Company
         * @throws IOException if the file cannot be written
         */
        private static void generateSampleFileWithDiffName(String fileName, int amount) throws IOException {
            DocumentFactory factory = DocumentFactory.getInstance();
            Document doc = factory.createDocument();
            addElement(factory, doc, "Company");
            Element root = doc.getRootElement();
            // Each child gets a distinct name: person1, person2, ...
            for (int i = 0; i < amount; i++) {
                addElement(factory, root, "person" + (i + 1), i + 1);
            }
            OutputFormat format = OutputFormat.createPrettyPrint();
            format.setSuppressDeclaration(false);
            XMLWriter writer = new XMLWriter(new FileWriter(fileName), format);
            writer.write(doc);
            writer.close();
        }

        private static void addElement(DocumentFactory factory, Document parent, String name) {
            parent.add(factory.createElement(name));
        }

        private static Element addElement(DocumentFactory factory, Element parent, String name, Object value) {
            Element newElem = factory.createElement(name);
            parent.add(newElem.addText(value + ""));
            return newElem;
        }

        /**
         * Find element person25 using the default DocumentFactory.
         *
         * @param id    numeric suffix of the element name to look up
         * @param count number of lookups to run
         * @return elapsed time in milliseconds
         * @throws DocumentException if the sample file cannot be parsed
         */
        private static long testFindByDefaultFactory(String id, int count) throws DocumentException {
            SAXReader reader = new SAXReader(DocumentFactory.getInstance());
            Document doc = reader.read(new File(sampleName));
            Node root = doc.selectSingleNode("Company");
            long start = System.currentTimeMillis();
            for (int i = 0; i < count; i++) {
                // When calling doc.selectSingleNode instead, the XPath is "//person" + id
                root.selectSingleNode("person" + id);
            }
            long end = System.currentTimeMillis();
            return end - start;
        }

        /**
         * Find element person25 using IndexedDocumentFactory.
         *
         * @param id    numeric suffix of the element name to look up
         * @param count number of lookups to run
         * @return elapsed time in milliseconds
         * @throws DocumentException if the sample file cannot be parsed
         */
        private static long testFindByIndexedFactory(String id, int count) throws DocumentException {
            SAXReader reader = new SAXReader(IndexedDocumentFactory.getInstance());
            Document doc = reader.read(new File(sampleName));
            Node root = doc.selectSingleNode("Company");
            long start = System.currentTimeMillis();
            for (int i = 0; i < count; i++) {
                // When calling doc.selectSingleNode instead, the XPath is "//person" + id
                root.selectSingleNode("person" + id);
            }
            long end = System.currentTimeMillis();
            return end - start;
        }
    }

Generating the test data

In main, keep only the generateSampleFileWithDiffName line, change sampleName to a suitable location on your machine, and run the program to generate the data. Then comment that line out.

Testing the two factories

With the loop count findCount set to 1000, the IndexedDocumentFactory version completes the 1000 lookups in a little over 60 ms, while the default factory needs more than 4 seconds. The gap widens as the count grows.

However, if you change the program so that selectSingleNode is called on the doc object instead of starting from the Company element, both versions are equally slow: 1000 iterations take a little over 6 seconds, and IndexedDocumentFactory no longer helps. It only takes effect when the element being looked up is a direct child of the search entry point. What happens if you add another level between Company and person, say employees, while still searching from the Company element? It still has no effect, and raising the JVM's maximum heap size makes no difference either.

Conclusion

IndexedDocumentFactory is not a silver bullet; it only helps under certain conditions. It suits cases where the search entry point and the target element are in a direct parent-child relationship, which makes it a good fit for storing and looking up resources, configuration and similar information when the elements share the same structure but do not have to be of the same kind.
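One way to picture why the entry point matters is the plain-Java sketch below, which has no dom4j dependency. The `Node` class, its `childIndex` map and the `visited` counter are hypothetical names invented for illustration. This is not dom4j's implementation, only the "element names loaded into a Map" idea from the post. A by-name lookup among direct children is a single map access, whereas a descendant search written as an ordinary tree walk never consults the per-parent map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; these are not dom4j classes. */
class IndexedLookupSketch {

    /** A tiny element that keeps a name -> first-child index next to its child list. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final Map<String, Node> childIndex = new HashMap<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            childIndex.putIfAbsent(child.name, child); // remember the first child per name
            return child;
        }
    }

    /** Number of nodes (or index entries) examined since the last reset. */
    static int visited;

    /** Direct-child lookup without an index: scan the child list in order. */
    static Node findChildLinear(Node parent, String name) {
        for (Node child : parent.children) {
            visited++;
            if (child.name.equals(name)) {
                return child;
            }
        }
        return null;
    }

    /** Direct-child lookup with the index: one map access. */
    static Node findChildIndexed(Node parent, String name) {
        visited++;
        return parent.childIndex.get(name);
    }

    /** Descendant lookup as a depth-first walk; it never uses childIndex. */
    static Node findDescendant(Node start, String name) {
        for (Node child : start.children) {
            visited++;
            if (child.name.equals(name)) {
                return child;
            }
            Node hit = findDescendant(child, name);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Builds document -> Company -> person1..personN, mirroring the sample file's shape. */
    static Node buildDocument(int amount) {
        Node doc = new Node("#document");
        Node company = doc.add(new Node("Company"));
        for (int i = 1; i <= amount; i++) {
            company.add(new Node("person" + i));
        }
        return doc;
    }

    public static void main(String[] args) {
        Node doc = buildDocument(2000);
        Node company = doc.children.get(0);

        visited = 0;
        findChildLinear(company, "person1500");
        System.out.println("linear child lookup visited: " + visited);   // 1500

        visited = 0;
        findChildIndexed(company, "person1500");
        System.out.println("indexed child lookup visited: " + visited);  // 1

        visited = 0;
        findDescendant(doc, "person1500");
        System.out.println("descendant walk visited: " + visited);       // 1501
    }
}
```

If dom4j behaves in a similar way, that would be consistent with the timings reported above, but this is an inference from the measurements and not something checked against the library source.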

yzqnow1234 (Junior member, Posts: 1, Points: 30, From: Beijing)	Posted: 2012-08-03

Could you explain when using DocumentFactory is most efficient?

theoffspring (Posts: 218, Points: 190, From: Dalian)	Posted: 2012-08-03

yzqnow1234 wrote: Could you explain when using DocumentFactory is most efficient?

The post already answers that; please read it carefully.
