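The error occurred while calling getEnvelope on the SOAPMessage sent from the client.
The reported cause was: Invalid byte 1 of 1-byte UTF-8 sequence.
That is, the parser hit a first byte that is invalid for a 1-byte UTF-8 sequence.
So what does a 1-byte UTF-8 sequence look like?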
One-byte
codes are used only for the ASCII values 0 through 127. In this case
the UTF-8 code has the same value as the ASCII code. The high-order bit
of these codes is always 0.
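The form is 0xxxxxxx.
In other words, the byte it read had its high-order bit set to 1, so it was judged illegal.
This presumes the parser had already decided the data was UTF-8. Why would it decide that? Perhaps by default (an XML parser assumes UTF-8 when there is no encoding declaration and no byte order mark), or perhaps because it was declared somewhere, for example in the XML document itself.
As for how it could decide that this was a 1-byte UTF-8 sequence, I am not sure. There may be some pre-classification mechanism, or the byte may be invalid as the first byte of a UTF-8 sequence of any length, with the message merely worded this way (which makes it ambiguous).
Conclusion: the XML was not actually encoded in UTF-8; it was probably GB2312, GBK, or similar. Reading it with the real encoding would likely avoid the problem, as would having the sender always encode the XML as UTF-8.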
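The conclusion above can be checked with a small experiment. The following is only a sketch under assumptions: the class name EncodingMismatchDemo and its two helper methods are made up for illustration, and it assumes the JDK in use ships the GBK charset. It encodes one Chinese character as GBK, shows that those bytes are not valid UTF-8, and then transcodes them to UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any SOAP or XML library.
final class EncodingMismatchDemo {

    /** Returns true if the bytes decode cleanly in the given charset. */
    static boolean isValid(byte[] bytes, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting U+FFFD
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes the bytes with their real charset and re-encodes them in the target charset. */
    static byte[] transcode(byte[] bytes, Charset from, Charset to) {
        return new String(bytes, from).getBytes(to);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK
        // \u554A is a Chinese character whose GBK encoding is the two bytes B0 A1.
        byte[] gbkBytes = "<a>\u554A</a>".getBytes(gbk);

        System.out.println("valid as UTF-8 before transcoding: "
                + isValid(gbkBytes, StandardCharsets.UTF_8));

        byte[] utf8Bytes = transcode(gbkBytes, gbk, StandardCharsets.UTF_8);
        System.out.println("valid as UTF-8 after transcoding:  "
                + isValid(utf8Bytes, StandardCharsets.UTF_8));
    }
}
```

The first byte of the GBK form, 0xB0, is 10110000 in binary, so its high-order bit is 1 and it cannot be a 1-byte UTF-8 code. That is the situation the error message describes.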
Addendum:
Telling how long a UTF-8 sequence is turns out to be fairly easy: the bit pattern of the first byte identifies it as the start of an n-byte UTF-8 sequence.
first byte pattern of 1-byte UTF-8 sequence: 0xxxxxxx
first byte pattern of 2-byte UTF-8 sequence: 110xxxxx
first byte pattern of 3-byte UTF-8 sequence: 1110xxxx
first byte pattern of 4-byte UTF-8 sequence: 11110xxx
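The four patterns above translate directly into bit masks. As a minimal sketch (the class and method names are hypothetical, and this is not the code of any particular XML parser), a first byte can be classified like this:

```java
// Hypothetical helper; illustrates the first-byte patterns listed above.
final class Utf8LeadByte {

    /**
     * Returns the sequence length (1 to 4) announced by a first byte,
     * or -1 if the byte cannot start a UTF-8 sequence of any length.
     */
    static int sequenceLength(int b) {
        b &= 0xFF;                          // treat the value as an unsigned byte
        if ((b & 0x80) == 0x00) return 1;   // 0xxxxxxx
        if ((b & 0xE0) == 0xC0) return 2;   // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3;   // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4;   // 11110xxx
        return -1;                          // 10xxxxxx (continuation byte) or 11111xxx
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xC3, 0xE4, 0xF0, 0xB0, 0xFF};
        for (int b : samples) {
            System.out.printf("0x%02X -> %d%n", b, sequenceLength(b));
        }
    }
}
```

A byte such as 0xB0 or 0xFF matches none of the four patterns, which fits the guess that the offending byte is invalid as the first byte of a UTF-8 sequence of any length.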
The same reasoning applies to the following exception messages:
Invalid byte 2 of 2-byte UTF-8 sequence.
Invalid byte 2 of 3-byte UTF-8 sequence.
Invalid byte 2 of 4-byte UTF-8 sequence.
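To see how messages of this shape can arise, here is a simplified validator. It is a sketch under stated assumptions: the class Utf8Checker is hypothetical, it only imitates the wording of the messages, it reports truncated input the same way as a bad byte, and it skips the stricter checks (overlong forms, surrogates) that a real parser performs. Every byte after the first in a multi-byte sequence must have the form 10xxxxxx.

```java
// Hypothetical, simplified validator; not the implementation used by any real XML parser.
final class Utf8Checker {

    /** Returns null if the data passes this simplified check, otherwise the first error message. */
    static String firstError(byte[] data) {
        int i = 0;
        while (i < data.length) {
            int b0 = data[i] & 0xFF;
            int n;                                   // length announced by the first byte
            if ((b0 & 0x80) == 0x00) n = 1;          // 0xxxxxxx
            else if ((b0 & 0xE0) == 0xC0) n = 2;     // 110xxxxx
            else if ((b0 & 0xF0) == 0xE0) n = 3;     // 1110xxxx
            else if ((b0 & 0xF8) == 0xF0) n = 4;     // 11110xxx
            else return message(1, 1);               // not a legal first byte at all

            // Bytes 2..n must all be continuation bytes of the form 10xxxxxx.
            for (int k = 2; k <= n; k++) {
                int idx = i + k - 1;
                if (idx >= data.length || (data[idx] & 0xC0) != 0x80) {
                    return message(k, n);
                }
            }
            i += n;
        }
        return null;
    }

    private static String message(int k, int n) {
        return "Invalid byte " + k + " of " + n + "-byte UTF-8 sequence.";
    }

    public static void main(String[] args) {
        byte[] gbkSample = {(byte) 0xD6, (byte) 0xD0};   // GBK bytes of U+4E2D
        System.out.println(firstError(gbkSample));
    }
}
```

With the GBK bytes D6 D0, the first byte 11010110 looks like the start of a 2-byte sequence, but the second byte 11010000 is not of the form 10xxxxxx, so the check fails on byte 2 of 2.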
http://topic.csdn.net/u/20120513/13/97af0141-df0d-4758-8fab-f91dd9af01db.html?seed=973731074&r=78553396#r_78553396
http://en.wikipedia.org/wiki/UTF-8