ld_hust

Techniques of Protocol Buffers Developer Guide

Techniques

This page describes some commonly-used design patterns for dealing with Protocol Buffers. You can also send design and usage questions to the Protocol Buffers discussion group.

Streaming Multiple Messages

If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read the bytes into a separate buffer, then parse from that buffer. If you want to avoid copying bytes to a separate buffer, check out the CodedInputStream class (in both C++ and Java), which can be told to limit reads to a certain number of bytes.
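The size-prefix approach can be sketched in Python as follows. This is a minimal illustration, not part of the Protocol Buffers API: the helper names write_delimited and read_delimited and the fixed 4-byte big-endian prefix are assumptions, and the plain byte strings stand in for serialized messages.

```python
import io
import struct

# Assumption: a fixed 4-byte big-endian length prefix. Any unambiguous
# size encoding works, as long as writer and reader agree on it.
_LEN = struct.Struct(">I")


def write_delimited(stream, payload: bytes) -> None:
    """Write the size of one serialized message, then the message itself."""
    stream.write(_LEN.pack(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Yield each payload in turn until the stream is exhausted."""
    while True:
        header = stream.read(_LEN.size)
        if not header:
            return  # clean end of stream
        if len(header) < _LEN.size:
            raise EOFError("truncated length prefix")
        (size,) = _LEN.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated message body")
        yield payload


# Byte strings standing in for serialized messages.
buf = io.BytesIO()
for msg in (b"first", b"", b"third message"):
    write_delimited(buf, msg)

buf.seek(0)
messages = list(read_delimited(buf))
```

Each yielded payload would then be handed to the message's parser. Note that the sketch assumes a stream whose read(n) returns n bytes unless the data ends; a raw socket may return fewer and would need a read loop.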

Large Data Sets

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

That said, Protocol Buffers are great for handling individual messages within a large data set. Usually, large data sets are really just a collection of small pieces, where each small piece may be a structured piece of data. Even though Protocol Buffers cannot handle the entire set at once, using Protocol Buffers to encode each piece greatly simplifies your problem: now all you need is to handle a set of byte strings rather than a set of structures.

Protocol Buffers do not include any built-in support for large data sets because different situations call for different solutions. Sometimes a simple list of records will do while other times you may want something more like a database. Each solution should be developed as a separate library, so that only those who need it need to pay the costs.

Union Types

You may sometimes want to send a message that could be one of several different types. However, protocol buffer parsers cannot necessarily determine the type of a message based on the contents alone. So how do you make sure that the recipient application knows how to decode your message? One solution is to create a wrapper message that has one optional field for each possible message type.

For example, if you have message types Foo, Bar, and Baz, you can combine them with a type like:

message OneMessage {
  // One of the following will be filled in.
  optional Foo foo = 1;
  optional Bar bar = 2;
  optional Baz baz = 3;
}

You may also want to have an enum field that identifies which message is filled in, so that you can switch on it:

message OneMessage {
  enum Type { FOO = 1; BAR = 2; BAZ = 3; }

  // Identifies which field is filled in.
  required Type type = 1;

  // One of the following will be filled in.
  optional Foo foo = 2;
  optional Bar bar = 3;
  optional Baz baz = 4;
}
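On the receiving side, switching on the type field might look like the Python sketch below. The dataclasses are hypothetical stand-ins for the classes protoc would generate, and the fields text, count, and flag are invented purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Type(Enum):
    FOO = 1
    BAR = 2
    BAZ = 3


# Hypothetical stand-ins for the generated Foo, Bar, and Baz classes.
@dataclass
class Foo:
    text: str


@dataclass
class Bar:
    count: int


@dataclass
class Baz:
    flag: bool


@dataclass
class OneMessage:
    type: Type
    foo: Optional[Foo] = None
    bar: Optional[Bar] = None
    baz: Optional[Baz] = None


def describe(msg: OneMessage) -> str:
    """Switch on the type tag and read only the matching field."""
    if msg.type is Type.FOO:
        return f"foo: {msg.foo.text}"
    if msg.type is Type.BAR:
        return f"bar: {msg.bar.count}"
    if msg.type is Type.BAZ:
        return f"baz: {msg.baz.flag}"
    raise ValueError(f"unknown type {msg.type!r}")
```

For example, describe(OneMessage(Type.BAR, bar=Bar(3))) reads only the bar field and ignores the others.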

If you have a very large number of possible types, listing every one of them in your container type may be unwieldy. Instead, you should consider using extensions:

message OneMessage {
  extensions 100 to max;
}

// Elsewhere...
extend OneMessage {
  optional Foo foo_ext = 100;
  optional Bar bar_ext = 101;
  optional Baz baz_ext = 102;
}

Note that you can use the ListFields reflection method (in C++, Java, and Python) to get a list of all fields present in the message, including extensions. You might use this as part of a scheme for registering handlers for diverse message types.
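One possible shape for such a handler-registration scheme is sketched below in Python. Everything here is an assumption for illustration: FakeMessage is a hypothetical stand-in for a parsed OneMessage, and its list_fields method returns (field name, value) pairs, whereas the real ListFields in Python returns (FieldDescriptor, value) pairs that you would key on the descriptor's name.

```python
from typing import Callable, Dict, List, Tuple

# Registry mapping a field name to the function that handles its value.
_HANDLERS: Dict[str, Callable[[object], str]] = {}


def handles(field_name: str):
    """Decorator that registers a handler for one field name."""
    def register(func):
        _HANDLERS[field_name] = func
        return func
    return register


class FakeMessage:
    """Hypothetical stand-in for a parsed OneMessage.

    The real ListFields() returns (FieldDescriptor, value) pairs; this
    stand-in returns (field name, value) pairs to stay self-contained.
    """

    def __init__(self, **present_fields):
        self._fields = present_fields

    def list_fields(self) -> List[Tuple[str, object]]:
        return list(self._fields.items())


@handles("foo_ext")
def handle_foo(value) -> str:
    return f"handled Foo: {value}"


@handles("bar_ext")
def handle_bar(value) -> str:
    return f"handled Bar: {value}"


def dispatch(msg: FakeMessage) -> List[str]:
    """Call the registered handler for every field present in msg."""
    results = []
    for name, value in msg.list_fields():
        handler = _HANDLERS.get(name)
        if handler is None:
            raise KeyError(f"no handler registered for {name!r}")
        results.append(handler(value))
    return results
```

With this layout, supporting a new extension means adding one decorated function; the dispatch loop itself does not change.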

Self-describing Messages

Protocol Buffers do not contain descriptions of their own types. Thus, given only a raw message without the corresponding .proto file defining its type, it is difficult to extract any useful data.

However, note that the contents of a .proto file can themselves be represented using protocol buffers. The file src/google/protobuf/descriptor.proto in the source code package defines the message types involved. protoc can output a FileDescriptorSet, which represents a set of .proto files, using the --descriptor_set_out option. With this, you could define a self-describing protocol message like so:

message SelfDescribingMessage {
  // Set of .proto files which define the type.
  required FileDescriptorSet proto_files = 1;

  // Name of the message type.  Must be defined by one of the files in
  // proto_files.
  required string type_name = 2;

  // The message data.
  required bytes message_data = 3;
}

By using classes like DynamicMessage (available in C++ and Java), you can then write tools which can manipulate SelfDescribingMessages.

All that said, this functionality is not included in the Protocol Buffer library because we have never had a use for it inside Google.


2010-12-07 10:39