Maximum serialized Protobuf message size


Question

Is there a way to get the maximum size of a given protobuf message once it is serialized?

I'm referring to messages that don't contain "repeated" elements.

Note that I'm not referring to the size of a protobuf message with specific content, but to the maximum possible size it can reach (in the worst case).

Recommended answer

In general, any Protobuf message can be any length due to the possibility of unknown fields.

If you are receiving a message, you cannot make any assumptions about the length.

If you are sending a message that you built yourself, then you can perhaps assume that it only contains fields you know about, but then again, in that case you can also easily compute the exact message size.

Thus it's usually not useful to ask what the maximum size is.

With that said, you could write code that uses the Descriptor interfaces to iterate over the FieldDescriptors for a message type (MyMessageType::descriptor()).

See: https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor

Similar interfaces exist in Java, Python, and probably others.

Here are the rules to implement:

Each field is composed of a tag followed by some data.

For the tag:

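  • Field numbers 1-15 have a 1-byte tag.
  • Field numbers 16-2047 have a 2-byte tag. (The tag is itself a varint, so larger field numbers need more bytes.)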
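
The tag rule can be sketched in a few lines. This is a minimal illustration, not code from the answer: it relies on the wire-format fact that a tag is the varint encoding of (field_number << 3) | wire_type, and the helper names varint_size and tag_size are made up for this example.

```python
def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:  # each byte carries 7 bits of payload
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int) -> int:
    """Size in bytes of a field's tag (hypothetical helper).

    The wire type sits in the low 3 bits and never changes the byte count,
    so it is left as 0 here.
    """
    return varint_size(field_number << 3)


for number in (1, 15, 16, 2047, 2048):
    print(number, tag_size(number))
```

Field numbers up to 15 fit in one byte and up to 2047 in two, since the three wire-type bits leave 4 and 11 bits for the field number respectively.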

For the data:

  • bool is always one byte.
  • int32, int64, uint64, and sint64 have a maximum data length of 10 bytes (yes, unfortunately, an int32 takes 10 bytes if it is negative).
  • sint32 and uint32 have a maximum data length of 5 bytes.
  • fixed32, sfixed32, and float are always exactly 4 bytes.
  • fixed64, sfixed64, and double are always exactly 8 bytes.
  • Enum-typed fields' maximum length depends on the maximum enum value:
    • 0-127: 1 byte
    • 128-16383: 2 bytes
    • ... it's 7 bits per byte, but hopefully your enum isn't THAT big!
    • Also note that negative values will be encoded as 10 bytes, but hopefully there aren't any.
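
As a rough sketch, the data rules above can be written down as a lookup table plus a small function for enums. The names MAX_SCALAR_DATA_SIZE, varint_size, and max_enum_data_size are invented for this illustration and are not part of any protobuf library.

```python
# Worst-case data size in bytes (excluding the tag) for each scalar type,
# transcribed from the rules above.
MAX_SCALAR_DATA_SIZE = {
    "bool": 1,
    "int32": 10, "int64": 10, "uint64": 10, "sint64": 10,
    "sint32": 5, "uint32": 5,
    "fixed32": 4, "sfixed32": 4, "float": 4,
    "fixed64": 8, "sfixed64": 8, "double": 8,
}


def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:  # 7 payload bits per byte
        value >>= 7
        size += 1
    return size


def max_enum_data_size(enum_values) -> int:
    """Worst-case data size for an enum field, given all of its declared values."""
    if any(v < 0 for v in enum_values):
        return 10  # negative values are sign-extended to 64 bits
    return varint_size(max(enum_values))


print(max_enum_data_size([0, 1, 2]))  # small enum: 1 byte
print(max_enum_data_size([0, 300]))   # 2 bytes
print(max_enum_data_size([-1, 0]))    # 10 bytes
```

The 10-byte worst case for a negative int32 comes from the same varint rule: the value is sign-extended to 64 bits, and 64 bits at 7 bits per byte needs 10 bytes.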

If your message contains any of the following, then its maximum length is unbounded:

  • Any field of type string or bytes. (Unless you know their max length, in which case, it's that max length plus a length prefix, like with sub-messages.)
  • Any repeated field. (Unless you know its max length, in which case, each element of the list has a max length as if it were a free-standing field, including tag. There is NO overall length prefix here. Unless you are using [packed=true], in which case you'll have to look up the details.)
  • Extensions.
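
When an application-level limit is known, the two bounded cases above can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the length prefix is taken to be a varint (7 bits per byte, like the other varints here), and packed repeated fields are deliberately left out.

```python
def varint_size(value: int) -> int:
    """Bytes needed to encode a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:  # 7 payload bits per byte
        value >>= 7
        size += 1
    return size


def tag_size(field_number: int) -> int:
    """Tag size; the 3 wire-type bits do not affect the byte count."""
    return varint_size(field_number << 3)


def max_length_delimited_field_size(field_number: int, max_payload_len: int) -> int:
    """Tag + length prefix + payload, for string, bytes, or sub-message fields."""
    return tag_size(field_number) + varint_size(max_payload_len) + max_payload_len


def max_unpacked_repeated_field_size(field_number: int,
                                     max_element_data_size: int,
                                     max_count: int) -> int:
    """Non-packed repeated field: every element carries its own tag,
    and there is no overall length prefix."""
    return max_count * (tag_size(field_number) + max_element_data_size)


# A string in field 1 capped at 200 bytes: 1 (tag) + 2 (length) + 200.
print(max_length_delimited_field_size(1, 200))
# Up to 3 int32 values in field 16: 3 * (2 (tag) + 10 (data)).
print(max_unpacked_repeated_field_size(16, 10, 3))
```

Summing such per-field bounds over every field of a message gives a worst-case size only if the message carries no unknown fields or extensions.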
