Serialization problem using protobuf between Python and Node.js

Problem description

I ran into a compatibility problem when serializing a protobuf message between Python and Node.js. I have a protobuf message like the one below:

message User {
  reserved 2,3;
  string user_id = 1;
  int32 coin = 4;
  int32 exp = 5;
  int32 gem = 6;
  int32 level = 7;
}

I would like to serialize a message instance like:

"userId": "3562957934"
"coin": 350
"exp": 1
"gem": 30
"level": 1

When I call SerializeToString() on the user_pb2.User instance in Python, I get:

\x0a\x0a\x33\x35\x36\x32\x39\x35\x37\x39\x33\x34\x20\xde\x02\x28\x01\x30\x1e\x38\x01

or in binary (showing the bytes from 0xde onward):

1101 1110 0000 0010 0010 1000 0000 0001 0011 0000 0001 1110 0011 1000 0000 0001


When I try to deserialize this message in Node.js, I get:

"userId": "3562957934"
"coin": 381
"exp": 1
"gem": 30
"level": 1

which has the wrong "coin" value.

Then I tried creating a message instance (with coin = 350) and serializing it in Node.js. I got different bytes:

\x5c\x0a\x5c\x0a\x33\x35\x36\x32\x39\x35\x37\x39\x33\x34\x20\xc3\x9e\x02\x28\x01\x30\x1e\x38\x01

or in binary:

1100 0011 1001 1110 0000 0010 0010 1000 0000 0001 0011 0000 0001 1110 0011 1000 0000 0001

I found that, besides the strange leading bytes (\x0a\x0a vs. \x5c\x0a\x5c\x0a), the main difference between the Python and Node.js serializations is the byte 1101 1110 (Python) vs. the bytes 1100 0011 1001 1110 (Node.js), or in string form 3562957934 �(08 (Python) vs. 3562957934 Þ(08 (Node.js).

My protoc commands are:

Python: /usr/local/bin/protoc -I=protos user.proto --python_out=pb
Node.js: /usr/local/bin/protoc --js_out=import_style=commonjs,binary:protos user.proto -I=protos

I assumed that, for the same message, Python and Node.js should produce identical serialized bytes. Is that not the case? I searched the official Google protobuf documentation but still could not find a solution. Has anyone come across the same problem?

Solution

It looks like you have some sort of UTF-8 encoding problem when passing around the serialized blobs. The original serialized bytes (from Python) have a byte 0xDE in them, but the node.js version you quote has 0xC3 0x9E instead, which is the UTF-8 encoding of the Unicode code point U+00DE.
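To see the effect concretely, here is a minimal Python sketch (an illustration, not code from the question). It takes the bytes quoted in the question, simulates two ways a binary blob can be damaged by passing through text, and decodes each result with a tiny hand-written wire-format reader. The helper names `read_varint` and `parse_fields` are hypothetical, and the reader handles only the two wire types this message uses. The second corruption path, where an invalid byte becomes U+FFFD and is then truncated to its low byte, is only an assumption about what might have happened on the Node.js reading side; it happens to reproduce the 381 from the question.

```python
from typing import Dict, Tuple, Union

# Bytes quoted in the question (Python's SerializeToString() output).
RAW = bytes.fromhex("0a0a 33353632393537393334 20de02 2801 301e 3801")


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:              # high bit clear: last byte of the varint
            return value, pos
        shift += 7


def parse_fields(buf: bytes) -> Dict[int, Union[int, bytes]]:
    """Hypothetical mini-reader: supports only varint (0) and length-delimited (2) fields."""
    fields: Dict[int, Union[int, bytes]] = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            fields[number], pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            fields[number] = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return fields


# Path 1: each byte is treated as a Latin-1 character and the text is then
# written out as UTF-8, so 0xDE becomes 0xC3 0x9E.
as_utf8 = RAW.decode("latin-1").encode("utf-8")

# Path 2 (assumption): the bytes are decoded as UTF-8, the invalid 0xDE becomes
# U+FFFD, and each character is later truncated to its low byte (0xFD).
lossy = bytes(ord(ch) & 0xFF for ch in RAW.decode("utf-8", errors="replace"))

print(as_utf8.hex())
print(parse_fields(RAW)[4])      # 350
print(parse_fields(as_utf8)[4])  # 36675
print(parse_fields(lossy)[4])    # 381
```

The reader also shows that the leading \x0a\x0a is not noise: the first 0x0a is the tag for field 1 (length-delimited) and the second is the string length, 10.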

I suggest you use an ASCII-safe encoding such as base64 to pass around the blobs for debugging purposes, just to be on the safe side. Once that works, you can make sure that you open all the relevant files and streams in binary mode.
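A rough sketch of both suggestions on the Python side, using only the standard library. The function names `to_transport` and `from_transport` and the file name `user.bin` are made up for illustration, and the byte string is the one quoted in the question rather than the output of a real `SerializeToString()` call.

```python
import base64
import os
import tempfile

# Stand-in for the output of SerializeToString(), copied from the question.
RAW = bytes.fromhex("0a0a 33353632393537393334 20de02 2801 301e 3801")


def to_transport(blob: bytes) -> str:
    """Encode a binary blob as ASCII-only base64 text."""
    return base64.b64encode(blob).decode("ascii")


def from_transport(text: str) -> bytes:
    """Decode base64 text back to the original bytes, rejecting stray characters."""
    return base64.b64decode(text, validate=True)


encoded = to_transport(RAW)
print(encoded)                        # safe to paste into logs, JSON, or a terminal
print(from_transport(encoded) == RAW)

# When writing the blob to disk, open the file in binary mode ("wb" / "rb")
# so no text encoding or newline translation is applied.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "user.bin")
    with open(path, "wb") as fh:
        fh.write(RAW)
    with open(path, "rb") as fh:
        restored = fh.read()

print(restored == RAW)
```

On the Node.js side, the matching step would be to turn the base64 text back into a Buffer (for example with `Buffer.from(text, "base64")`) before handing it to the generated deserializer, instead of passing a string.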
