
dom4j throws an error when parsing special characters

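   Posted: 2007-10-20
When I parse with DocumentHelper.parseText(text) and the text contains certain special characters, such as the control characters 0x07 or 0x13, an exception is thrown. Is there any way to handle this?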
   Posted: 2007-10-21
I thought wrapping the text in a CDATA section would get around it, but it still fails...

org.dom4j.DocumentException: Error on line 24 of document  : An invalid XML character (Unicode: 0x13) was found in the CDATA section. Nested exception: An invalid XML character (Unicode: 0x13) was found in the CDATA section.
	at org.dom4j.io.SAXReader.read(SAXReader.java:482)
	at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
	at book.xml.dom4j.StringToDoc.main(StringToDoc.java:32)
Nested exception: 
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x13) was found in the CDATA section.
	at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
	at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.dom4j.io.SAXReader.read(SAXReader.java:465)
	at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
	at book.xml.dom4j.StringToDoc.main(StringToDoc.java:32)
Nested exception: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x13) was found in the CDATA section.
	at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
	at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.dom4j.io.SAXReader.read(SAXReader.java:465)
	at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
	at book.xml.dom4j.StringToDoc.main(StringToDoc.java:32)
Exception in thread "main" java.lang.NullPointerException
	at book.xml.dom4j.StringToDoc.main(StringToDoc.java:36)
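
The trace points at the underlying rule: XML 1.0 only permits the code points in its Char production (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF), and a CDATA section does not lift that restriction, which is why Xerces rejects 0x13 inside it. Below is a minimal sketch for locating the offending characters before the text reaches the parser. The class and method names (XmlCharCheck, isXml10Char, findInvalidOffsets) are made up for illustration and are not part of dom4j.

```java
import java.util.ArrayList;
import java.util.List;

class XmlCharCheck {
    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char offset of every code point an XML 1.0 parser would reject. */
    static List<Integer> findInvalidOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml10Char(cp)) {
                offsets.add(i);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hypothetical input: a 0x13 control character hidden inside CDATA.
        String text = "<a><![CDATA[x" + (char) 0x13 + "y]]></a>";
        for (int offset : findInvalidOffsets(text)) {
            System.out.printf("invalid character 0x%X at offset %d%n",
                    text.codePointAt(offset), offset);
        }
    }
}
```

Running a check like this on the generated text shows exactly which characters trigger the SAXParseException, which helps decide whether they can simply be dropped.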

   Posted: 2008-10-16
I've run into the same problem and don't know how to solve it!
   Posted: 2008-10-16
Escape the offending characters before parsing.
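
One caveat on this suggestion: in XML 1.0 a numeric character reference such as &#x13; is rejected just like the raw character, so "escaping" in practice means removing or substituting the character before the parser sees it. Below is a minimal sketch of the removal variant using only the JDK. The class and method names (XmlSanitizer, stripInvalid) are hypothetical, and the commented-out dom4j call shows where the cleaned string would go, assuming dom4j is on the classpath.

```java
import java.util.regex.Pattern;

class XmlSanitizer {
    // Matches any code point outside the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern INVALID_XML10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    /** Removes every character that an XML 1.0 parser would reject. */
    static String stripInvalid(String text) {
        return INVALID_XML10.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical input containing the two characters from the question.
        String raw = "<a>bell" + (char) 0x07 + " and dc3" + (char) 0x13 + "</a>";
        String clean = stripInvalid(raw);
        System.out.println(clean);
        // With dom4j available, the cleaned text would then be parsed as before:
        // Document doc = DocumentHelper.parseText(clean);
    }
}
```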
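   Posted: 2008-10-17
But the text to be parsed is generated in a fixed format. If I escape a particular character, the legitimate parts may end up escaped as well, and then the fix doesn't actually solve the problem.
So what is the right way to handle this?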
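
One possible way around this concern is to touch only the code points that XML 1.0 forbids, so that markup, entities such as &amp;, and CDATA delimiters pass through unchanged. Below is a sketch, not a definitive fix: it substitutes each forbidden character with a visible marker instead of dropping it, then parses with the JDK's built-in DOM parser to check that the rest of the document survives. The class name, method names, and the {U+XXXX} marker format are all assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SubstituteDemo {
    /** Replaces only code points outside the XML 1.0 Char production. */
    static String substituteInvalid(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean legal = cp == 0x9 || cp == 0xA || cp == 0xD
                    || (cp >= 0x20 && cp <= 0xD7FF)
                    || (cp >= 0xE000 && cp <= 0xFFFD)
                    || (cp >= 0x10000 && cp <= 0x10FFFF);
            if (legal) {
                out.appendCodePoint(cp);                  // legal text is copied as-is
            } else {
                out.append(String.format("{U+%04X}", cp)); // hypothetical marker format
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Parses with the JDK parser; wraps failures in an unchecked exception. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical generated text: attribute, CDATA, entity, and one bad character.
        String raw = "<msg id=\"1\"><![CDATA[a < b" + (char) 0x13 + "]]>&amp;</msg>";
        Document doc = parse(substituteInvalid(raw));
        System.out.println(doc.getDocumentElement().getAttribute("id"));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Because the filter decides per code point rather than per pattern of text, the legitimate parts of the generated document are never rewritten. Whether a marker, a plain removal, or some other substitution is appropriate depends on what the control characters mean to the consumer of the data.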