1.
Varints are a method of serializing integers using one or more bytes. Smaller
numbers take fewer bytes. Each byte in a varint, except the last byte, has the
most significant bit (msb) set; this indicates that there are further bytes to
come. The lower 7 bits of each byte are used to store the two's complement
representation of the number in groups of 7 bits, least significant group
first. For example, 1 is encoded as 00000001, and 300 is encoded as
10101100 00000010.
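The varint rule above can be sketched in a few lines of C++. This is a minimal illustration, not the protobuf library's own code; the names EncodeVarint and DecodeVarint are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode an unsigned 64-bit value as a varint.
std::vector<std::uint8_t> EncodeVarint(std::uint64_t value) {
  std::vector<std::uint8_t> out;
  while (value >= 0x80) {
    // Emit the low 7 bits with the msb set: more bytes follow.
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  // Last byte: msb clear.
  out.push_back(static_cast<std::uint8_t>(value));
  return out;
}

// Hypothetical helper: decode a varint starting at bytes[pos] and advance
// pos past it. Returns false if the input is truncated or longer than
// ten bytes.
bool DecodeVarint(const std::vector<std::uint8_t>& bytes, std::size_t& pos,
                  std::uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64 && pos < bytes.size(); shift += 7) {
    const std::uint8_t b = bytes[pos++];
    value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) return true;  // msb clear: this was the last byte
  }
  return false;
}
```

With these helpers, EncodeVarint(300) yields the two bytes 0xAC 0x02, which is the 10101100 00000010 pattern shown above.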
2.
The binary version of a message
just uses the field's number as the key – the name and declared type for each
field can only be determined on the decoding end by referencing the message
type's definition. When a message is encoded, the keys and values are
concatenated into a byte stream.
3.
When the message is being
decoded, the parser needs to be able to skip fields that it doesn't recognize.
The "key" for each pair in a wire-format message is actually two values: the
field number from your .proto file, plus a wire type that provides just enough
information to find the length of the following value:
Type | Meaning | Used For
0 | Varint | int32, int64, uint32, uint64, sint32, sint64, bool, enum
1 | 64-bit | fixed64, sfixed64, double
2 | Length-delimited | string, bytes, embedded messages, packed repeated fields
3 | Start group | groups (deprecated)
4 | End group | groups (deprecated)
5 | 32-bit | fixed32, sfixed32, float
Each key in the streamed message is a varint with the value
(field_number << 3) | wire_type; in other words, the last three bits of the
number store the wire type.
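As a small sketch of the key formula, the following C++ packs and unpacks a key value. The enum and function names are illustrative assumptions, not identifiers from the protobuf library.

```cpp
#include <cassert>
#include <cstdint>

// Wire type numbers as listed in the table above (names are hypothetical).
enum WireType : std::uint32_t {
  kVarint = 0,
  kFixed64 = 1,
  kLengthDelimited = 2,
  kStartGroup = 3,
  kEndGroup = 4,
  kFixed32 = 5,
};

// Combine a field number and a wire type into the key value.
// On the wire, this value is itself written as a varint.
constexpr std::uint32_t MakeKey(std::uint32_t field_number,
                                WireType wire_type) {
  return (field_number << 3) | wire_type;
}

// Split a key back into its two parts.
constexpr std::uint32_t KeyFieldNumber(std::uint32_t key) { return key >> 3; }
constexpr WireType KeyWireType(std::uint32_t key) {
  return static_cast<WireType>(key & 0x7);
}
```

For instance, field number 1 with the varint wire type gives the key 0x08, and field number 2 with the length-delimited wire type gives 0x12.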
4.
There is an important difference between the signed int types (sint32 and
sint64) and the "standard" int types (int32 and int64) when it comes to
encoding negative numbers. If you use int32 or int64 as the type for a negative
number, the resulting varint is always ten bytes long; it is, effectively,
treated like a very large unsigned integer. If you use one of the signed types,
the resulting varint uses ZigZag encoding, which is much more efficient.
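To see where the ten bytes come from, here is a rough C++ sketch that counts varint bytes after sign-extending an int32 to 64 bits. VarintSize and Int32FieldSize are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes a value occupies when written as a varint.
int VarintSize(std::uint64_t value) {
  int size = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++size;
  }
  return size;
}

// A negative int32 is sign-extended to 64 bits and then treated as a very
// large unsigned integer, so all 64 bits are significant.
int Int32FieldSize(std::int32_t v) {
  return VarintSize(static_cast<std::uint64_t>(static_cast<std::int64_t>(v)));
}
```

Under these assumptions Int32FieldSize(1) is 1, while Int32FieldSize(-1) is 10, because 64 significant bits need ten 7-bit groups.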
5.
ZigZag encoding maps signed
integers to unsigned integers so that numbers with a small absolute value
(for instance, -1) have a small varint encoded value too. It does this in a way
that "zig-zags" back and forth through the positive and negative
integers, so that -1 is encoded as 1, 1 is encoded as 2, -2 is encoded as 3,
and so on.
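The mapping can be written with a shift and an XOR. The sketch below covers the 32-bit case only; the function names are illustrative, and it assumes two's complement integers.

```cpp
#include <cassert>
#include <cstdint>

// Map a signed 32-bit integer to an unsigned one: 0, -1, 1, -2, 2, ...
// become 0, 1, 2, 3, 4, ...
std::uint32_t ZigZagEncode32(std::int32_t n) {
  // All ones for negative inputs, all zeros otherwise.
  const std::uint32_t sign_mask = n < 0 ? 0xFFFFFFFFu : 0u;
  // Shift in unsigned arithmetic so the top bit can fall off safely.
  return (static_cast<std::uint32_t>(n) << 1) ^ sign_mask;
}

// Inverse mapping: the low bit carries the sign, the rest the magnitude.
std::int32_t ZigZagDecode32(std::uint32_t z) {
  return static_cast<std::int32_t>(z >> 1) ^ -static_cast<std::int32_t>(z & 1);
}
```

The unsigned result is then written as an ordinary varint, so -1 takes a single byte instead of ten.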
6.
Non-varint numeric types are
stored in little-endian byte order.
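As an illustration, a 32-bit fixed-width value could be written and read back like this. The helper names are assumptions made for this sketch; writing the bytes out by shifting keeps the result independent of the host machine's byte order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Write a 32-bit value least significant byte first.
std::array<std::uint8_t, 4> EncodeFixed32(std::uint32_t v) {
  return {{static_cast<std::uint8_t>(v),
           static_cast<std::uint8_t>(v >> 8),
           static_cast<std::uint8_t>(v >> 16),
           static_cast<std::uint8_t>(v >> 24)}};
}

// Reassemble the value from its four little-endian bytes.
std::uint32_t DecodeFixed32(const std::array<std::uint8_t, 4>& b) {
  return static_cast<std::uint32_t>(b[0]) |
         (static_cast<std::uint32_t>(b[1]) << 8) |
         (static_cast<std::uint32_t>(b[2]) << 16) |
         (static_cast<std::uint32_t>(b[3]) << 24);
}
```

Here 0x12345678 is stored as the bytes 78 56 34 12.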
7.
A wire type of 2 (length-delimited) means that the value is a varint-encoded
length followed by the specified number of bytes of data. On the wire, the key
(tag number and wire type) comes first, then the length, then the data itself.
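A length-delimited string field can be sketched as follows. AppendVarint and AppendStringField are hypothetical helpers written for this example, not protobuf library functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append a value to the buffer as a varint.
void AppendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Append a string field: key with wire type 2, varint length, raw bytes.
void AppendStringField(std::vector<std::uint8_t>& out,
                       std::uint32_t field_number, const std::string& s) {
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
  AppendVarint(out, s.size());
  out.insert(out.end(), s.begin(), s.end());
}
```

Encoding the string "testing" in field number 2 this way gives 12 07 74 65 73 74 69 6e 67: the key 0x12, the length 7, and then the seven bytes of the string.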
8.
If your message definition has repeated elements (without the [packed=true]
option), the encoded message has zero or more key-value pairs with the same tag
number. These repeated values do not have to appear consecutively; they may be
interleaved with other fields. The order of the elements with respect to each
other is preserved when parsing, though the ordering with respect to other
fields is lost.
9.
Normally, an encoded message would never have more than one instance of an
optional or required field. However, parsers are expected to handle the case in
which they do. For numeric types and strings, if the same field appears
multiple times, the parser accepts the last value it sees. For embedded message
fields, the parser merges multiple instances of the same field, as if with the
Message.MergeFrom method: that is, all singular scalar fields in the latter
instance replace those in the former, singular embedded messages are merged,
and repeated fields are concatenated. The effect of these rules is that parsing
the concatenation of two encoded messages produces exactly the same result as
if you had parsed the two messages separately and merged the resulting objects:
MyMessage message;
message.ParseFromString(str1 + str2);
is equivalent to this:
MyMessage message, message2;
message.ParseFromString(str1);
message2.ParseFromString(str2);
message.MergeFrom(message2);
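The last-value-wins rule for scalar fields can be demonstrated with a toy parser. This is only a sketch: ParseVarintFields is a hypothetical function that handles nothing but varint fields (wire type 0), and it stands in for the real parser purely to show the effect of concatenating two encoded messages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Parse a buffer holding only varint-typed fields into a
// field number -> value map. A later occurrence of a field overwrites an
// earlier one. Returns false on malformed input or any other wire type.
bool ParseVarintFields(const std::vector<std::uint8_t>& buf,
                       std::map<std::uint32_t, std::uint64_t>& fields) {
  std::size_t pos = 0;
  auto read_varint = [&](std::uint64_t& v) -> bool {
    v = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
      const std::uint8_t b = buf[pos++];
      v |= static_cast<std::uint64_t>(b & 0x7F) << shift;
      if ((b & 0x80) == 0) return true;
    }
    return false;
  };
  while (pos < buf.size()) {
    std::uint64_t key = 0;
    std::uint64_t value = 0;
    if (!read_varint(key) || (key & 0x7) != 0) return false;
    if (!read_varint(value)) return false;
    fields[static_cast<std::uint32_t>(key >> 3)] = value;  // last one wins
  }
  return true;
}
```

If one buffer encodes field 1 = 1 and field 2 = 5 (08 01 10 05) and a second encodes field 1 = 2 (08 02), parsing their concatenation leaves field 1 = 2 and field 2 = 5, which is what merging the second message into the first would give.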
10.
A packed repeated field
containing zero elements does not appear in the encoded message. Otherwise, all
of the elements of the field are packed into a single key-value pair with wire
type 2 (length-delimited). Each element is encoded the same way it would be
normally, except without a tag preceding it. Only repeated fields of primitive
numeric types (types which use the varint, 32-bit, or 64-bit wire types) can be
declared "packed".
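As a sketch of the packed layout for varint elements, the following C++ builds the single key-value pair. The helper names are assumptions for this example, and only varint-typed elements are handled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a value to the buffer as a varint.
void AppendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Encode a packed repeated field whose elements are varints.
std::vector<std::uint8_t> EncodePackedVarints(
    std::uint32_t field_number, const std::vector<std::uint64_t>& values) {
  std::vector<std::uint8_t> out;
  if (values.empty()) return out;  // zero elements: field does not appear

  // Elements are written back to back, with no key in front of each one.
  std::vector<std::uint8_t> payload;
  for (const std::uint64_t v : values) AppendVarint(payload, v);

  // One key with wire type 2, then the payload length, then the payload.
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
  AppendVarint(out, payload.size());
  out.insert(out.end(), payload.begin(), payload.end());
  return out;
}
```

For field number 4 holding the values 3, 270 and 86942, this produces 22 06 03 8E 02 9E A7 05: the key 0x22, a payload length of 6, and then the three varints.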
11.
While you can use field numbers in any order in a .proto, when a message is
serialized its known fields should be written
sequentially by field number. This allows parsing code to use optimizations
that rely on field numbers being in sequence. However, protocol buffer parsers
must be able to parse fields in any order, as not all messages are created by
simply serializing an object – for instance, it's sometimes useful to merge two
messages by simply concatenating them.
12.
If a message has unknown fields, the current Java implementations write them in
arbitrary order after the sequentially-ordered known fields.