Python example for reading multiple protobuf messages from a stream

Question

I'm working with data from spinn3r, which consists of multiple different protobuf messages serialized into a byte stream:

http://code.google.com/p/spinn3r-client/wiki/Protostream

"A protostream is a stream of protocol buffer messages, encoded on the wire as length prefixed varints according to the Google protocol buffer specification. The stream has three parts: a header, the payload, and a tail marker."

This seems like a pretty standard use case for protobufs. In fact, the protobuf core distribution provides CodedInputStream for both C++ and Java. However, protobuf does not appear to provide such a tool for Python; the 'internal' tools are not set up for this kind of external use:

https://groups.google.com/forum/?fromgroups#!topic/protobuf/xgmUqXVsK-o

So... before I go and cobble together a python varint parser and tools for parsing a stream of different message types: does anyone know of any tools for this?

Why is it missing from protobuf? (Or am I just failing to find it?)

This seems like a big gap for protobuf, especially when compared to thrift's equivalent tools for both 'transport' and 'protocol'. Am I viewing that correctly?

Recommended answer

It looks like the code in the other answer may have been lifted from here. Check the licence before using this file, but I managed to get it to read varint32s using code such as this:

import myprotocol_pb2 as proto  # your generated protobuf module
import varint                   # (this is the varint.py file)

# Read the whole file into memory as bytes.
with open("filename.bin", "rb") as f:
    data = f.read()

decoder = varint.decodeVarint32  # get a varint32 decoder
                                 # others are available in varint.py

pos = 0
while pos < len(data):
    msg = proto.Msg()                  # your message type
    # The decoder returns (decoded value, position just after the varint);
    # the decoded value is the size of the next message in bytes.
    msg_len, pos = decoder(data, pos)
    msg.ParseFromString(data[pos:pos + msg_len])

    # use parsed message

    pos += msg_len                     # advance to the next length prefix
print("done!")

This is very simple code designed to load messages of a single type delimited by varint32s which describe the next message's size.
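
If neither varint.py nor the protobuf internals are convenient, the length-prefix framing itself is small enough to sketch in plain Python. The following is a minimal sketch, not the library's implementation: the names decode_varint and iter_delimited are hypothetical, and it assumes every message is preceded by a base-128 varint giving its size, as described above. It only splits the buffer into raw payloads; parsing each payload is still done with your message class.

```python
def decode_varint(buf, pos):
    """Decode one base-128 varint from buf starting at pos.

    Returns (value, new_pos), the same shape as the decoder used above.
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not (byte & 0x80):             # high bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def iter_delimited(buf):
    """Yield each length-prefixed payload in buf as bytes."""
    pos = 0
    while pos < len(buf):
        size, pos = decode_varint(buf, pos)
        end = pos + size
        if end > len(buf):
            raise ValueError("truncated message")
        yield buf[pos:end]
        pos = end


# Two hand-built frames: a 3-byte payload, then a 300-byte payload.
# 300 is encoded as the two varint bytes 0xAC 0x02.
sample = b"\x03abc" + b"\xac\x02" + b"x" * 300
payloads = list(iter_delimited(sample))
```

Each yielded payload could then be handed to msg.ParseFromString(payload) exactly as in the loop above.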

Update: It may also be possible to include this file directly from the protobuf library by using:

from google.protobuf.internal.decoder import _DecodeVarint32
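
The examples so far load the whole file before decoding. For a large file it may be preferable to read one message at a time. The sketch below shows one way to do that with only the standard library; read_varint and read_delimited are hypothetical helper names, and the code assumes a blocking binary file-like object whose read(n) returns fewer than n bytes only at end of stream (true for regular files and io.BytesIO, not necessarily for sockets).

```python
import io


def read_varint(stream):
    """Read one varint from a binary file-like object.

    Returns None on a clean end of stream (no bytes left before the varint).
    """
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def read_delimited(stream):
    """Yield each length-prefixed payload from the stream as bytes."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a message")
        yield payload


# Three frames: "hi", an empty payload, and "hello".
stream = io.BytesIO(b"\x02hi\x00\x05hello")
frames = list(read_delimited(stream))
```

With a real file, the stream would come from open("filename.bin", "rb"), and each payload would be passed to ParseFromString on a fresh message object.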
