How to Validate XML using Java

Configure Java APIs (SAX, DOM, dom4j, XOM) using JAXP 1.3 to validate XML Documents with DTD and Schema(s).

Many Java XML APIs provide mechanisms to validate XML documents. JAXP can be used to configure most of these APIs, but subtle configuration differences exist. This article shows five ways to configure different Java APIs (DOM, SAX, dom4j and XOM) using JAXP 1.3 to check and validate XML against a DTD and one or more XML Schemas.

Setup

All examples below can be compiled and executed using Java 5.0 (JAXP 1.3) or higher and make use of the following components and settings.

Error Handler

To report errors, an ErrorHandler must be provided to the underlying implementation. The ErrorHandler used in the examples is a very simple one: it prints each message to System.out, and parsing continues until the XML document has been fully parsed or a fatal error has been reported.

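import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Prints every reported problem to System.out and never throws.
public class SimpleErrorHandler implements ErrorHandler {
    // Called for conditions that are not errors, such as parser warnings.
    public void warning(SAXParseException e) throws SAXException {
        System.out.println(e.getMessage());
    }

    // Called for recoverable errors, including validity errors.
    public void error(SAXParseException e) throws SAXException {
        System.out.println(e.getMessage());
    }

    // Called for non-recoverable errors, such as well-formedness violations.
    public void fatalError(SAXParseException e) throws SAXException {
        System.out.println(e.getMessage());
    }
}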
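
The handler above only prints each message. Where the calling code needs to decide afterwards whether a document passed, a collecting variant may be more convenient. The following is a minimal sketch and not part of the original sample code; the class name CollectingErrorHandler and its getMessages method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical variant of SimpleErrorHandler: records messages instead of printing them.
class CollectingErrorHandler implements ErrorHandler {
    private final List<String> messages = new ArrayList<String>();

    public void warning(SAXParseException e) {
        collect("warning", e);
    }

    public void error(SAXParseException e) {
        collect("error", e);
    }

    public void fatalError(SAXParseException e) throws SAXException {
        collect("fatal", e);
        throw e; // a fatal error means parsing cannot continue, so stop here
    }

    private void collect(String level, SAXParseException e) {
        // getLineNumber() returns -1 when no location is available
        messages.add(level + " (line " + e.getLineNumber() + "): " + e.getMessage());
    }

    public List<String> getMessages() {
        return messages;
    }

    public static void main(String[] args) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        handler.error(new SAXParseException("example problem", null));
        System.out.println(handler.getMessages());
    }
}
```

An instance can be passed to setErrorHandler wherever the fragments below use SimpleErrorHandler; an empty message list after parsing then indicates that nothing was reported.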

Checking Well-formedness

Before a document can be called XML, rather than CSV, plain text or any other format, it must follow the basic rules defined by the XML Recommendation. A document that adheres to these rules is said to be well-formed XML.

Code Fragments: DOM, SAX, dom4j, XOM

DOM

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

DocumentBuilder builder = factory.newDocumentBuilder();

builder.setErrorHandler(new SimpleErrorHandler());

Document document = builder.parse(new InputSource("contacts.xml"));

SAX

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource("contacts.xml"));

dom4j

SAXReader reader = new SAXReader();
reader.setValidation(false);
reader.setErrorHandler(new SimpleErrorHandler());
reader.read("contacts.xml");

XOM

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());

Builder builder = new Builder(reader);
builder.build("contacts.xml");
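
To try the well-formedness check without any files, the DOM fragment can be wrapped in a small helper that parses an in-memory string. This is a sketch under assumptions: the class and method names are hypothetical, and DefaultHandler is used instead of SimpleErrorHandler so that nothing is printed.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true when the given text parses as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setNamespaceAware(true);

            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler ignores warnings and errors and rethrows fatal errors.
            builder.setErrorHandler(new DefaultHandler());

            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // well-formedness violations surface as a SAXException
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<contacts><contact>Ann</contact></contacts>"));
        System.out.println(isWellFormed("<contacts><contact>Ann</contacts>"));
    }
}
```

The second document in main leaves the contact element unclosed, which is the kind of violation this check is meant to catch.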

Validate using internal DTD

Parse the input document and validate it using only the DTD (contacts.dtd) referenced by the DOCTYPE declaration in the input document.

Code Fragments: DOM, SAX, dom4j, XOM

DOM

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

DocumentBuilder builder = factory.newDocumentBuilder();

builder.setErrorHandler(new SimpleErrorHandler());

Document document = builder.parse(new InputSource("contacts.xml"));

SAX

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource("contacts.xml"));

dom4j

SAXReader reader = new SAXReader();
reader.setValidation(true);
reader.setErrorHandler(new SimpleErrorHandler());
reader.read("contacts.xml");

XOM

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());

Builder builder = new Builder(reader);
builder.build("contacts.xml");
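
The DTD configuration above can also be exercised without contacts.xml and contacts.dtd. The sketch below follows the SAX fragment but embeds a DTD as an internal subset of an in-memory document. The DTD content, the element names and the class name are assumptions made for illustration; the real contacts.dtd may look different.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidation {
    // Hypothetical DTD, embedded as an internal subset so the example needs no files.
    static final String DOCTYPE =
          "<!DOCTYPE contacts [\n"
        + "  <!ELEMENT contacts (contact*)>\n"
        + "  <!ELEMENT contact (name)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "]>\n";

    // Returns the problems reported for DOCTYPE + rootXml (empty when valid).
    static List<String> validate(String rootXml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // turns on DTD validation
            factory.setNamespaceAware(true);

            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors are recoverable
                }
            });
            reader.parse(new InputSource(new StringReader(DOCTYPE + rootXml)));
        } catch (SAXException e) {
            errors.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<contacts><contact><name>Ann</name></contact></contacts>"));
        System.out.println(validate("<contacts><contact/></contacts>"));
    }
}
```

Validity errors arrive through error() rather than as exceptions, which is why the handler collects them instead of relying on a catch block.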

Validate using internal XSD

Parse the input document and validate it using only the XML Schema (contacts.xsd) referenced by the noNamespaceSchemaLocation attribute in the input document.

Code Fragments: DOM, SAX, dom4j, XOM

DOM

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);
factory.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage", 
      "http://www.w3.org/2001/XMLSchema");

DocumentBuilder builder = factory.newDocumentBuilder();

builder.setErrorHandler(new SimpleErrorHandler());

Document document = builder.parse(new InputSource("contacts.xml"));

SAX

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();
parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", 
      "http://www.w3.org/2001/XMLSchema");

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource("contacts.xml"));

dom4j

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);

SAXParser parser = factory.newSAXParser();
parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", 
      "http://www.w3.org/2001/XMLSchema");

SAXReader reader = new SAXReader(parser.getXMLReader());
reader.setValidation(true);
reader.setErrorHandler(new SimpleErrorHandler());
reader.read("contacts.xml");

XOM

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();
parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", 
      "http://www.w3.org/2001/XMLSchema");

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());

Builder builder = new Builder(reader);
builder.build("contacts.xml");
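
A self-contained way to try the schemaLanguage configuration is to write a schema to a temporary file and let an in-memory document point at it through noNamespaceSchemaLocation. The sketch below follows the SAX fragment; the schema content, the element names and the class name are assumptions, not the contents of the real contacts.xsd.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.StringReader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class InternalXsdValidation {
    // Hypothetical schema: <contacts> holds one or more <contact> strings.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='contacts'><xs:complexType><xs:sequence>"
        + "<xs:element name='contact' type='xs:string' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Wraps the given content in <contacts> and returns the reported problems.
    static List<String> validate(String contactsContent) {
        final List<String> errors = new ArrayList<String>();
        try {
            // Write the schema to a temporary file so the document can refer to it.
            File xsd = File.createTempFile("contacts", ".xsd");
            xsd.deleteOnExit();
            Writer out = new FileWriter(xsd);
            try {
                out.write(SCHEMA);
            } finally {
                out.close();
            }

            String xml = "<contacts xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                + " xsi:noNamespaceSchemaLocation='" + xsd.toURI() + "'>"
                + contactsContent + "</contacts>";

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.setNamespaceAware(true);

            SAXParser parser = factory.newSAXParser();
            parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                "http://www.w3.org/2001/XMLSchema");

            XMLReader reader = parser.getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            errors.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<contact>Ann</contact>"));
        System.out.println(validate("<person>Ann</person>"));
    }
}
```

An absolute file URI is used for the schema location because a document read from a string has no base location against which a relative name such as contacts.xsd could be resolved.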

Validate using external Schema

Parse the input document and validate it using the schema (contacts.xsd) supplied externally by the source code.

Code Fragments: DOM, SAX, dom4j, XOM

DOM

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

DocumentBuilder builder = factory.newDocumentBuilder();

builder.setErrorHandler(new SimpleErrorHandler());

Document document = builder.parse(new InputSource("contacts.xml"));

SAX

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource("contacts.xml"));

dom4j

SAXParserFactory factory = SAXParserFactory.newInstance();

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();

SAXReader reader = new SAXReader(parser.getXMLReader());
reader.setValidation(false);
reader.setErrorHandler(new SimpleErrorHandler());
reader.read("contacts.xml");

XOM

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();
XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());

Builder builder = new Builder(reader);
builder.build("contacts.xml");
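
Because the schema is supplied by the code rather than by the document, it does not have to live in a file. The following sketch follows the SAX fragment but compiles the schema from a string; the schema content, the element names and the class name are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ExternalSchemaValidation {
    // Hypothetical schema: <contacts> holds one or more <contact> strings.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='contacts'><xs:complexType><xs:sequence>"
        + "<xs:element name='contact' type='xs:string' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns the problems reported while validating xml against SCHEMA.
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                schemaFactory.newSchema(new StreamSource(new StringReader(SCHEMA)));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // no DTD validation, schema only
            factory.setNamespaceAware(true);
            factory.setSchema(schema);

            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            errors.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<contacts><contact>Ann</contact></contacts>"));
        System.out.println(validate("<contacts><person>Ann</person></contacts>"));
    }
}
```

A Schema object is immutable and thread-safe according to the javax.xml.validation documentation, so in a larger program it could be compiled once and shared between parsers.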

Validate using internal DTD and external Schema

Parse the input document and validate it using both the schema (contacts.xsd), supplied externally by the source code, and the DTD (contacts.dtd) referenced by the DOCTYPE in the input document.

Code Fragments: DOM, SAX, dom4j, XOM

DOM

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

DocumentBuilder builder = factory.newDocumentBuilder();

builder.setErrorHandler(new SimpleErrorHandler());

Document document = builder.parse(new InputSource("contacts.xml"));

SAX

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource("contacts.xml"));

dom4j

SAXParserFactory factory = SAXParserFactory.newInstance();

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();

SAXReader reader = new SAXReader(parser.getXMLReader());
reader.setValidation(true);
reader.setErrorHandler(new SimpleErrorHandler());
reader.read("contacts.xml");

XOM

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setNamespaceAware(true);

SchemaFactory schemaFactory = 
    SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
factory.setSchema(schemaFactory.newSchema(
    new Source[] {new StreamSource("contacts.xsd")}));

SAXParser parser = factory.newSAXParser();
XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new SimpleErrorHandler());

Builder builder = new Builder(reader);
builder.build("contacts.xml");
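
To see both checks at work, it helps to use a DTD and a schema that constrain different things. In the sketch below, which follows the SAX fragment, a hypothetical DTD makes the id attribute required while a hypothetical schema requires it to be an integer when present. All names and both grammars are assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DtdAndSchemaValidation {
    // Hypothetical DTD: every <contact> must carry an id attribute.
    static final String DOCTYPE =
          "<!DOCTYPE contacts [\n"
        + "  <!ELEMENT contacts (contact+)>\n"
        + "  <!ELEMENT contact (#PCDATA)>\n"
        + "  <!ATTLIST contact id CDATA #REQUIRED>\n"
        + "]>\n";

    // Hypothetical schema: id is optional but must be an integer when present.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='contacts'><xs:complexType><xs:sequence>"
        + "<xs:element name='contact' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:simpleContent><xs:extension base='xs:string'>"
        + "<xs:attribute name='id' type='xs:int'/>"
        + "</xs:extension></xs:simpleContent></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns the problems reported by either the DTD or the schema check.
    static List<String> validate(String rootXml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // DTD validation
            factory.setNamespaceAware(true);
            factory.setSchema(           // schema validation
                schemaFactory.newSchema(new StreamSource(new StringReader(SCHEMA))));

            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(DOCTYPE + rootXml)));
        } catch (SAXException e) {
            errors.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<contacts><contact id='1'>Ann</contact></contacts>"));
        System.out.println(validate("<contacts><contact id='abc'>Ann</contact></contacts>"));
        System.out.println(validate("<contacts><contact>Ann</contact></contacts>"));
    }
}
```

In main, the second document satisfies the DTD but not the schema, and the third satisfies the schema but not the DTD, so each should produce at least one reported problem.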

Conclusion

Several mechanisms and XML APIs can be used to parse and validate XML. With JAXP 1.3, the mechanism can mostly stay the same across these different APIs.

Sample Code

Download any of the archives to get the full source code for the examples above.

 

The archives contain the input XML file (./contacts.xml), the XML Schema document (./contacts.xsd), the DTD document (./contacts.dtd) and the source code for the fragments above, located in the ./src directory.

The archives also contain a number of XML Hammer validation projects in the ./xmlhammer-projects directory. To execute these projects, the XML Hammer application must be installed. It can be downloaded from:

http://www.xmlhammer.org/downloads.html.
