
JDOM: Gains and Losses


A Short History of JDOM

Bill Venners : Tell me about JDOM.

Elliotte Rusty Harold : The convention center we're in now, the Santa Clara convention center, is where JDOM was born three years ago at the very first O'Reilly Enterprise Java conference. Brett McLaughlin, who was then working on O'Reilly's Java and XML book, was giving a talk on DOM. Jason Hunter was in the audience. Jason noticed that about every third slide in Brett's talk was "gotcha!" Something in DOM doesn't work like you would naturally expect it to work. So Jason said to himself, there's got to be a better way. He and Brett went outside and had lunch on the lawn, and in the course of their conversation decided to create what would become JDOM. Over the next couple of weeks, they did some work on it, and I think released their first alpha version to the world.

JDOM, like DOM, is a tree-based object model. It loads the whole document in memory like DOM, but it's much simpler. JDOM uses concrete classes rather than interfaces, which DOM uses, and I saw how that made life simpler. JDOM is designed just for Java. It is not designed to support C++, Python, Perl, or any other language. The interoperability is achieved through XML, not the API. The API doesn't need to port. A Java API runs on one system. The XML document is what moves between systems and needs to be portable. JDOM is in many ways what DOM should have been. It's simple. It's mostly correct. It's easy enough for people who aren't experts in both XML and JDOM to use. To use DOM correctly, you really need to be an expert in both XML and DOM.

JDOM Offers Many Convenience Methods

Bill Venners : In your talk, one complaint you had about JDOM is: "There's more than one way to do it." What's that all about?

Elliotte Rusty Harold : JDOM often provides convenience methods. For example, suppose you have an item element in RSS, and you want to get the content of the title of that item. You can call getChild("title").getText(). You can call getChildText(). You can call getChildTextTrim(). You can call getChildTextNormalize(). If you want an attribute, there are still more methods you can call.

JDOM has lots of convenience methods. The idea is that sometimes you want the white space removed from either end of this element text you're reading, sometimes you don't. So they give you methods to do both. Looked at individually, any one of these methods is fine. My concern is that when you add them all up, the sheer number of them becomes intimidating. You can't read the JavaDoc documentation for the Element class in JDOM, without saying, "There's just so much here." It's just too big. I would prefer not to provide so many convenience methods. I would prefer a simpler, smaller API that can be grokked in one sitting, an API all of whose methods you can see in maybe one screen.
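
To make the difference between the plain, trimmed, and normalized variants concrete, here is a minimal sketch. It uses the DOM parser bundled with the JDK rather than JDOM, so it runs without extra libraries. The class TitleTextDemo, its helper names, and the sample RSS fragment are assumptions made for illustration; the helpers only approximate what the JDOM convenience methods return.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class TitleTextDemo {
    // Hypothetical sample: a title padded with spaces and containing a line break.
    static final String ITEM =
        "<item><title>  JDOM   gains\n and losses </title></item>";

    // Parses the string with the JDK's DOM parser and returns the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("sample could not be parsed", e);
        }
    }

    // Text of the first direct child element with this name, or null if none.
    static String childText(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return n.getTextContent();
            }
        }
        return null;
    }

    // Same text with leading and trailing whitespace removed.
    static String childTextTrim(Element parent, String name) {
        String text = childText(parent, name);
        return text == null ? null : text.trim();
    }

    // Trimmed text with each internal run of whitespace collapsed to one space.
    static String childTextNormalize(Element parent, String name) {
        String text = childTextTrim(parent, name);
        return text == null ? null : text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        Element item = parseRoot(ITEM);
        System.out.println("[" + childText(item, "title") + "]");
        System.out.println("[" + childTextTrim(item, "title") + "]");
        System.out.println("[" + childTextNormalize(item, "title") + "]");
    }
}
```

Each helper is a one-line variation on the previous one, which is the trade-off under discussion: every variant is reasonable on its own, yet each one adds another method to the class.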

JDOM Allows Malformed Documents

Bill Venners : You also complained that JDOM XML documents are not always well-formed. Could you differentiate between well-formed and valid documents, and explain your concerns about JDOM?

Elliotte Rusty Harold : XML documents must be well-formed. There are, depending on how you count, anywhere from a hundred to several thousand different rules. These "well-formedness" rules are the minimum requirements for an XML document. The rules cover things like what characters are allowed in element names: The letter 'a' is OK. The letter omega is OK. The asterisk character is not OK. White space is not OK. The rules say that every start-tag has to have a matching end-tag. Elements can nest, but they cannot overlap. Processing instructions have the form <, ?, a target, white space, the data, ?, and a >. Comments cannot contain a double hyphen. There are many such rules governing well-formedness of XML documents.

Validity talks about which elements and attributes are allowed where. Well-formedness only talks about the structure of any XML document, irrespective of what the names are. Validity says, we're only going to allow these elements with these names in these positions. Validity is not required. Well-formedness is.
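
As a rough illustration of the distinction, the sketch below asks the JDK's non-validating parser whether a string is well-formed. A DocumentBuilderFactory does not validate by default, so only well-formedness is checked and element names are never compared against a DTD or schema. The class name WellFormedCheck and the sample strings are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheck {
    // Returns true if the parser accepts the text as a well-formed document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness rule was violated
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));         // nested elements
        System.out.println(isWellFormed("<a><b></a></b>"));      // overlapping elements
        System.out.println(isWellFormed("<a*/>"));               // asterisk in a name
        System.out.println(isWellFormed("<!-- x -- y --><a/>")); // double hyphen in a comment
    }
}
```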

JDOM, and for that matter DOM, allows you to create malformed documents. They do not check everything they can possibly check. For instance, they do not currently check that the text content of a text node does not contain the null character, which is completely illegal in an XML document. Similarly so are vertical tabs, form feeds, and other control characters. So one way you can create a malformed document using either JDOM or DOM, is to pass in a string to the Text constructor that contains some of these control characters. In my opinion, an XML API shouldn't allow that. It shouldn't rely on the programmer who is using the API to know which characters are and are not legal. If a programmer tries to do something illegal that would result in a malformed document, it should stop them by throwing an exception.
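
The kind of check being asked for here can be sketched in a few lines. The ranges below follow the Char production of XML 1.0; the class XmlChars and the method requireLegalText are hypothetical names, not part of JDOM or DOM, and a real API would presumably run such a check inside its Text constructor.

```java
final class XmlChars {
    // True if the code point is allowed in XML 1.0 character data.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the text unchanged, or throws if it contains an illegal character.
    static String requireLegalText(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException(
                    "Illegal XML character U+" + Integer.toHexString(cp).toUpperCase()
                    + " at index " + i);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(requireLegalText("tab\tand newline\nare fine"));
        try {
            requireLegalText("null char \u0000 is not");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing an unchecked exception at the point where the text is supplied stops the malformed document from ever being built, without requiring the caller to know the list of legal characters.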

Bill Venners : You also mentioned the internal DTD subset in this portion of your talk.

Elliotte Rusty Harold : An XML document's DocType declaration points to its Document Type Definition (DTD). If the DTD is actually contained inside the instance document, between square brackets, then that part of the DTD is called the internal DTD subset. In some cases the internal DTD can also point to an external part, which is why we distinguish internal from external. We merge the two DTD subsets to get the complete DTD. Sometimes the whole DTD is there in the internal DTD subset. Sometimes it's in the external part.

In JDOM, the internal DTD subset is not checked. You could put absolutely any string in there whatsoever, including strings that are totally illegal in an internal DTD subset. For example, you could just put the text of the Declaration of Independence as your internal DTD subset in JDOM, even though that would not be well-formed. It's just another thing that JDOM decided they would not check for well-formedness, because checking the internal DTD subset would be too onerous.

DOM solves that problem in a different way, incidentally. DOM makes the DocType declaration read-only, so it can't be changed at all. Therefore, it can't be changed to something that is malformed.
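
For readers who have not met an internal DTD subset, the following sketch parses a small document that carries one and reads it back through the standard org.w3c.dom.DocumentType interface. The class name InternalSubsetDemo and the sample document are assumptions made for illustration, and the exact string returned by getInternalSubset() may differ between parser implementations.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

final class InternalSubsetDemo {
    // Hypothetical document: everything between [ and ] is the internal DTD subset.
    static final String DOC =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE greeting [\n"
        + "  <!ELEMENT greeting (#PCDATA)>\n"
        + "]>\n"
        + "<greeting>Hello</greeting>";

    // Parses the text and returns its DocType declaration node.
    static DocumentType parseDoctype(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDoctype();
        } catch (Exception e) {
            throw new IllegalStateException("sample could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        DocumentType doctype = parseDoctype(DOC);
        System.out.println(doctype.getName());
        System.out.println(doctype.getInternalSubset());
        // DocumentType offers getName(), getPublicId(), getSystemId() and
        // getInternalSubset(), but no setter for the subset, so code cannot
        // replace it with arbitrary text.
    }
}
```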

JDOM Ignores Setter Method Conventions

Bill Venners : How about, "setter methods don't return void"?

Elliotte Rusty Harold : I learned in JavaBeans that one of the ways you recognize a setter method is that it returns void, as in public void setColor(). You know that method sets the color property, because it follows a naming convention. The name begins with the word set. The first letter in Color is capitalized, and so forth. JDOM follows a different pattern, called method invocation chaining, where for example the setName method on the Element class returns that Element object. To me, that just makes no sense. There's no reason for setter methods to return anything.

Bill Venners : The set methods return this?

Elliotte Rusty Harold : You might have an element object e in class X, and you call e.setName(), which returns e. From inside the method, yes, it's returning this. From outside the method, it's returning whatever object you invoked it on. That pattern is used, for example, in the new IO library in Java, where I also don't like it. But the designers of JDOM do like it. To me, it does not seem semantically correct. It does not seem to indicate what the method is doing, as opposed to how the method is being used.
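
The two styles can be put side by side in a short sketch. BeanElement and ChainElement are hypothetical classes written for this comparison; neither is JDOM's Element.

```java
final class SetterStyles {
    // JavaBeans convention: setters return void.
    static final class BeanElement {
        private String name;
        private String text;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
    }

    // Method invocation chaining: setters return the object they were called on.
    static final class ChainElement {
        private String name;
        private String text;

        public String getName() { return name; }
        public ChainElement setName(String name) { this.name = name; return this; }

        public String getText() { return text; }
        public ChainElement setText(String text) { this.text = text; return this; }
    }

    public static void main(String[] args) {
        BeanElement bean = new BeanElement();
        bean.setName("title");           // one statement per property
        bean.setText("JDOM");

        ChainElement chained = new ChainElement()
            .setName("title")            // each call returns the same object
            .setText("JDOM");

        System.out.println(bean.getName() + "=" + bean.getText());
        System.out.println(chained.getName() + "=" + chained.getText());
    }
}
```

The chained form saves a few statements at the call site. The objection raised here is that the return type reflects how the method is used rather than what the method does.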

JDOM Uses Java Collections

Bill Venners : You asked, "Is JDOM too Java centric?"

Elliotte Rusty Harold : When JDOM was designed, Brett and Jason said, we're going to go whole hog. We're not going to invent a separate NodeList class, like DOM does. We're going to use the Java Collections API. We're not going to have a cloneNode method like DOM does. We're going to use the Java clone method. We're going to implement Serializable, because good Java classes implement Serializable. We're going to implement Cloneable. We're going to have equals and hashCode methods: all the nice, normal things Java programmers have learned to love. The problem is, five or six years down the road, we've learned that some of those things aren't so nice. The Cloneable interface is a disaster. Joshua Bloch talks about this in Effective Java, and flat out recommends that people ignore it and implement their own copy constructors instead, just because Cloneable is so poorly designed.

The Serializable interface is useful in some circumstances, but I think in XML the serialization format should be XML, not binary object serialization, so I'm not sure whether that's necessary. And when it comes to the Collections API, that API suffers seriously from two things. One is Java's lack of true generics, i.e., templates to C++ programmers. The other is that Java has primitive data types, and the Collections API can't be used for ints or doubles. I'm not so sure that one's relevant, but the first one is. When you expose the children of an Element as a java.util.List, what you're getting back is a list of Objects. Every time you get something out of that List, you have to cast it back to its type. We don't know what it is, so we have to have a big switch block that says, if (o instanceof Element) { e = (Element) o; }, and then you do the same thing for Text, Comment, and ProcessingInstruction, and it gets really messy. DOM, by contrast, does have a different NodeList interface that contains Nodes. When you get something out of that list, you know it's a Node. And you've got certain operations you can use on a Node, and often that's all you need. Sometimes you need something more. Sometimes you do need to know whether it's an Element node, an Attribute node, or a Text node. But a lot of times, it's enough to know it's a Node. It's not enough to know that it's an Object.
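
The casting problem can be sketched with a few stand-in classes. Node, Element, Text, and Comment below are hypothetical types defined only for this example, and List<Object> stands in for the untyped java.util.List of the pre-generics Collections API.

```java
import java.util.ArrayList;
import java.util.List;

final class ChildListDemo {
    // Hypothetical common supertype, playing the role of DOM's Node.
    interface Node { String getValue(); }

    static final class Element implements Node {
        private final String name;
        Element(String name) { this.name = name; }
        public String getValue() { return "<" + name + ">"; }
    }

    static final class Text implements Node {
        private final String data;
        Text(String data) { this.data = data; }
        public String getValue() { return data; }
    }

    static final class Comment implements Node {
        private final String data;
        Comment(String data) { this.data = data; }
        public String getValue() { return "<!--" + data + "-->"; }
    }

    // A list of Objects: every item must be tested and cast before use.
    static String joinUntyped(List<Object> children) {
        StringBuilder out = new StringBuilder();
        for (Object o : children) {
            if (o instanceof Element) {
                Element e = (Element) o;
                out.append(e.getValue());
            } else if (o instanceof Text) {
                Text t = (Text) o;
                out.append(t.getValue());
            } else if (o instanceof Comment) {
                Comment c = (Comment) o;
                out.append(c.getValue());
            } else {
                throw new IllegalArgumentException("unexpected child: " + o);
            }
        }
        return out.toString();
    }

    // A list of Nodes: the shared operations are available without casting.
    static String joinTyped(List<Node> children) {
        StringBuilder out = new StringBuilder();
        for (Node n : children) {
            out.append(n.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Node> typed = new ArrayList<>();
        typed.add(new Element("title"));
        typed.add(new Text("JDOM"));
        typed.add(new Comment("draft"));
        List<Object> untyped = new ArrayList<>(typed);

        System.out.println(joinUntyped(untyped));
        System.out.println(joinTyped(typed));
    }
}
```

Both methods produce the same string; the difference is how much type-testing code the caller has to write, and the first version needs a new branch each time a node type is added.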

JDOM Uses Too Many Checked Exceptions

Bill Venners : You also suggested in your talk that JDOM had too many checked exceptions.

Elliotte Rusty Harold : JDOM does check many of the things that can make an XML document malformed, not all of them, but many. For example, you can't have an element name that contains white space. Generally speaking, if JDOM detects a problem, then it throws a checked exception, a JDOMException specifically. That means that when you're writing JDOM code, you have a lot of try-catch blocks. Try such and such, catch JDOMException, respond appropriately. As Bruce Eckel has pointed out, a lot of people just write catch JDOMException open close curly brace, and don't actually do anything to handle the failure appropriately.

Perhaps the appropriate response is, instead of throwing a checked exception, to throw RuntimeExceptions. That way it doesn't get in the way of your code. It doesn't make your code any messier. But the signal of the problem is still there if the problem arises. The way Joshua Bloch explains this is that any problem that could possibly be caught in testing should be a RuntimeException, for example, setting the name of an element. That should throw a RuntimeException. Because if you use a bad String for that, you'll catch it in testing, if you have good testing. On the other hand, parsing an external document should not throw a RuntimeException, it should throw a checked exception, because that's going to depend on which document is being passed into your program. Sometimes it is going to be well-formed and sometimes not. There's no way to know that in general, so that's a legitimate checked exception. But I just have come to learn, in a way I didn't understand a few years ago, that many exceptions that are currently checked exceptions should really be RuntimeExceptions.
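
The division of labor described above might look like the sketch below. Every name in it (ExceptionStyles, IllegalNameException, MalformedDocumentException, checkedName, parseRootName) is hypothetical, and the "parser" is a toy that accepts only a single empty-element tag such as <title/>; it exists only to show which kind of exception goes where.

```java
final class ExceptionStyles {
    // Unchecked: a bad name is a programming error that testing should catch.
    static final class IllegalNameException extends IllegalArgumentException {
        IllegalNameException(String message) { super(message); }
    }

    // Checked: the input comes from outside, so the caller must plan for failure.
    static final class MalformedDocumentException extends Exception {
        MalformedDocumentException(String message) { super(message); }
    }

    // Returns the name unchanged, or throws if it is empty or contains white space.
    static String checkedName(String name) {
        if (name == null || name.isEmpty()
                || name.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalNameException("Illegal element name: \"" + name + "\"");
        }
        return name;
    }

    // Toy parser: extracts the element name from text of the form <name/>.
    static String parseRootName(String document) throws MalformedDocumentException {
        String trimmed = document.trim();
        if (!trimmed.startsWith("<") || !trimmed.endsWith("/>")) {
            throw new MalformedDocumentException("Expected a single empty-element tag");
        }
        try {
            return checkedName(trimmed.substring(1, trimmed.length() - 2));
        } catch (IllegalNameException e) {
            // The bad name came from the document, not from the programmer.
            throw new MalformedDocumentException(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // No try block needed: a typo here would surface in the first test run.
        System.out.println(checkedName("title"));

        // The compiler forces a decision about documents that may be broken.
        try {
            System.out.println(parseRootName("<bad name/>"));
        } catch (MalformedDocumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```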

Bill Venners : So you think JDOM goes a bit overboard with the checked exceptions.

Elliotte Rusty Harold : Yes, and that's probably my fault. I was the one who in the very early days of JDOM argued most strongly for putting in lots of exceptions and checking everything. There were others who argued against putting in any exceptions at all. I think what we were missing then was anybody standing in the middle saying, "Hey, guys, RuntimeExceptions would satisfy both of you at the same time." I just didn't know that then. I've learned from Bruce Eckel and Joshua Bloch.

Will JDOM Remain Class-Based?

Bill Venners : In your talk you asked, "Are JDOM committers committed to classes?" What did you mean by that?

Elliotte Rusty Harold : That's a completely separate issue. I had a conversation with Jason Hunter, one of the two or three committers to the CVS tree for JDOM. Jason said that if JDOM used interfaces rather than classes, then it could be used, for example, as the API for a native XML database. And he thought that was an important use case. And on further reflection, I think I agree with him. There is, perhaps, a need for such an API. However, I also think there's a need for a simple, concrete, class-based API. And I'm just not certain at this point going forward that JDOM will always be a class-based API, that it will be a class-based API when it gets to 1.0. So, I think it's useful to have my little XOM API, which I know is going to be a class-based API.
