Encoding problem: Invalid byte 1 of 1-byte UTF-8 sequence
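The XML declaration says the file is UTF-8, but the content is actually stored in another encoding (GB2312 here), so XML files that contain Chinese characters cannot be read. Changing the declared encoding to "GB2312" lets the file parse normally: <?xml version="1.0" encoding="GB2312"?>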
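The effect of the declaration can be checked with a small sketch. It uses the JDK's built-in parser as a stand-in for dom4j (an assumption, chosen so the example needs no extra library), and the class name DeclaredEncodingDemo and the sample element are made up for illustration. The same GB2312 bytes are parsed twice, once under a UTF-8 declaration and once under a GB2312 declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; uses the JDK parser instead of dom4j.
class DeclaredEncodingDemo {

    /** Encodes the XML text in the given charset and reports whether the parser accepts the bytes. */
    static boolean parses(String xml, String charsetName) {
        try {
            byte[] bytes = xml.getBytes(Charset.forName(charsetName));
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement() != null;
        } catch (Exception e) {
            // Covers the parser's encoding error as well as any setup failure.
            return false;
        }
    }

    public static void main(String[] args) {
        // \u6e56\u5357 is a two-character Chinese sample, escaped so the source file stays ASCII.
        String body = "<city>\u6e56\u5357</city>";
        String declaredUtf8 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + body;
        String declaredGb2312 = "<?xml version=\"1.0\" encoding=\"GB2312\"?>" + body;

        // Bytes are GB2312 in both calls; only the declaration differs.
        System.out.println("declared UTF-8:  " + parses(declaredUtf8, "GB2312"));
        System.out.println("declared GB2312: " + parses(declaredGb2312, "GB2312"));
    }
}
```

Only the second call should succeed, because its declaration matches the bytes. Changing the declaration is the right fix only when the file really is GB2312; if the bytes are in some other encoding, the declaration has to name that one.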
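My own summary:
1. Analysis and fix for the "org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence." exception:
Analysis:
The exception is thrown by the reader.read(file); statement below:
SAXReader reader = new SAXReader();
Document doc = reader.read(file);
The cause is:
The XML file being read is actually encoded in GBK or some other encoding, while the XML content declares utf-8 with <?xml version="1.0" encoding="utf-8"?>, so the parser throws.
Note: according to the online article 《Java/J2EE中文问题终极解决之道》 ("The Ultimate Solution to Chinese-Character Problems in Java/J2EE"), the cause is probably this: the operating system encoding is GBK while the XML declares utf-8, and SAXReader uses the system default encoding GBK, so an encoding conversion is needed and garbled text naturally follows. The remedy is to make the file's encoding match the encoding used by the Java API that operates on the file.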
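Before choosing a fix it helps to confirm that the bytes really are not UTF-8. The following is a minimal sketch using only the JDK; the class name Utf8Check and the sample strings are assumptions for illustration. It runs a strict UTF-8 decoder over the bytes and reports whether decoding succeeds.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for diagnosing the mismatch described above.
class Utf8Check {

    /** Returns true only if the whole byte array decodes as well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u4e2d\u6587 is a two-character Chinese sample, escaped to keep the source ASCII.
        String sample = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>\u4e2d\u6587</a>";
        System.out.println(isValidUtf8(sample.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(sample.getBytes(Charset.forName("GBK"))));
    }
}
```

For a file on disk, the bytes could be obtained with java.nio.file.Files.readAllBytes and passed to the same method. A false result for a file that declares utf-8 points to the mismatch analysed above.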
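Fix:
Case 1: the XML file is generated by dom4j.
Solution: use org.dom4j.io.XMLWriter xmlWriter = new org.dom4j.io.XMLWriter(
new FileOutputStream(fileName));
instead of
xmlWriter = new XMLWriter(new FileWriter(fileName));
so that the XML file is generated with utf-8 as its encoding. FileWriter encodes with the platform default charset (GBK on a Chinese Windows system), whereas an XMLWriter that is given an OutputStream does the encoding itself, using utf-8 by default.
Detailed reference 1:
"Dom4j 编码问题彻底解决" (A thorough fix for Dom4j encoding problems), by lonsen
http://www.5inet.net/Develop/Java/036579,Dom4j_BianMaWenDiCheDeJieJue.aspx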
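The idea behind Case 1, choosing the charset explicitly rather than inheriting the platform default, can be sketched without dom4j. The example below is an illustration under assumptions: it uses a plain OutputStreamWriter from the JDK, and the class name ExplicitEncodingWrite and the temporary file are invented for the demo.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; plain JDK I/O stands in for dom4j's XMLWriter.
class ExplicitEncodingWrite {

    /** Writes the text to a temporary file as UTF-8 and returns the bytes that ended up on disk. */
    static byte[] writeUtf8(String xml) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".xml");
            try (Writer writer = new OutputStreamWriter(
                    new FileOutputStream(tmp.toFile()), StandardCharsets.UTF_8)) {
                // The charset is fixed here, so the platform default plays no part.
                writer.write(xml);
            }
            byte[] onDisk = Files.readAllBytes(tmp);
            Files.delete(tmp);
            return onDisk;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u4e2d\u6587 is a two-character Chinese sample, escaped to keep the source ASCII.
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>\u4e2d\u6587</a>";
        byte[] onDisk = writeUtf8(xml);
        System.out.println(Arrays.equals(onDisk, xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The bytes on disk agree with the utf-8 declaration regardless of the operating system's locale, which is the property the FileOutputStream-based XMLWriter call relies on.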
Case 2: reader.read() throws while parsing an XML description that the user typed into a JSP page.
Solution:
Convert the XML content to utf-8 before calling read (use the method overloads that accept an encoding):
public void validate(FacesContext context, UIComponent component, Object obj)
throws ValidatorException {
String xmldescription = (String) obj;
// getBytes() with no argument encodes with the platform default charset
byte[] bytes = xmldescription.getBytes();
RelationXmlParser.isXmlOK("E:\\jiangcm\\templateXMLSchema.xsd", bytes);
……
}
public static boolean isXmlOK(String xsdFile, byte[] tagetXml) throws SAXException, IOException, DocumentException
{
SAXReader reader = new SAXReader();
……
InputStream in = new ByteArrayInputStream(tagetXml);
// the charset named here must be the one the bytes were produced with
InputStreamReader utf8In = new InputStreamReader(in, "utf-8");
……
}
Method 3: String.getBytes("utf-8")
Call it so that the bytes handed to the parser are utf-8; that is all that is needed.
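Method 3 can be sketched end to end as follows. This is a minimal illustration under assumptions: the JDK's built-in parser replaces dom4j's SAXReader so the example runs without extra libraries, and the class name ParseFromString and the sample document are invented. The string is encoded as UTF-8 explicitly, so the bytes agree with the utf-8 declaration on any platform.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; uses the JDK parser instead of dom4j.
class ParseFromString {

    /** Encodes the XML string as UTF-8, parses it, and returns the root element's text. */
    static String rootText(String xml) {
        try {
            // Explicit charset: never depends on the platform default.
            byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(utf8));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // \u4e2d\u6587 is a two-character Chinese sample, escaped to keep the source ASCII.
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><name>\u4e2d\u6587</name>";
        System.out.println("\u4e2d\u6587".equals(rootText(xml)));
    }
}
```

With dom4j the same byte array could be wrapped in a ByteArrayInputStream and passed to SAXReader.read(InputStream), leaving the parser to pick up the encoding from the XML declaration.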