Protobuf InvalidProtocolBufferException with some strings


Question

We are using protobuf v3 to transfer messages from a C# client to a Java server over HTTP.

The message definition looks like this:

message CLIENT_MESSAGE {
    string message = 1;
}

Both the client and the server use UTF-8 character encoding for strings.

Everything is fine when we use short string values like "abc", but when we try to transfer a string with 198 characters in it, we catch an exception:

   com.google.protobuf.InvalidProtocolBufferException: 
    While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.

We even tried comparing the byte arrays containing the protobuf data, but didn't find a solution. For the string "aaa", the byte array starts with these bytes:

10 3 97 97 97

Where 10 is the protobuf field number, 3 is the string length, and 97 97 97 is "aaa".

For the string

"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

which contains 198 characters, the byte array starts with this:

10 198 1 97 97 97....

Where 10 is the protobuf field number, 198 is the string length, and 1 seems to be some kind of string identifier, or what?

And why can't protobuf parse this message?

We have already spent almost a day looking for a solution to this problem; any help is appreciated.

UPDATE:

We made dumps on both the client and the server, and what is weird is that the dumps are different!

Protobuf dump from the client, before sending to the server:

00000000   0A C6 01 61 61 61 61 61  61 61 61 61 61 61 61 61   ·Æ·aaaaaaaaaaaaa
00000010   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000020   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000030   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000040   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000050   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000060   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000070   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000080   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00000090   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
000000A0   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
000000B0   61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
000000C0   61 61 61 61 61 61 61 61  61                        aaaaaaaaa  

Protobuf dump that the server receives:

0000: 0A EF BF BD 01 61 61 61 61 61 61 61 61 61 61 61   .....aaaaaaaaaaa
0010: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0020: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0030: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0040: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0050: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0060: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0070: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0080: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
0090: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00A0: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00B0: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61   aaaaaaaaaaaaaaaa
00C0: 61 61 61 61 61 61 61 61 61 61 61                   aaaaaaaaaaa

As you can see, the protobuf data headers are different... That is totally blowing my mind; how could that happen?

UPDATE2: We did some research and found that this problem happens only with strings longer than 128 symbols. If the string consists of 128 symbols or fewer, there is no problem.

Recommended answer

Where 10 is protobuf field number,

Yes; 10 (0x0A) is the tag byte for field 1, length-prefixed (wire type 2).
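As a small illustration of how that leading byte breaks down, here is a minimal sketch in Python. The helper name `split_tag` is made up for this example and is not part of any protobuf library; it relies only on the wire-format rule that a tag is the field number shifted left by three bits, combined with the wire type in the low three bits.

```python
def split_tag(tag):
    """Split a protobuf tag value into (field_number, wire_type).

    Hypothetical helper for illustration only; not a protobuf library call.
    """
    field_number = tag >> 3   # upper bits hold the field number
    wire_type = tag & 0x07    # lower three bits hold the wire type
    return field_number, wire_type


field_number, wire_type = split_tag(10)
print(field_number, wire_type)  # 1 2  (field 1, wire type 2 = length-prefixed)
```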

and 198 is string length, and 1 seems to be like string identifier, or what?

The bytes 198 1 together are the string length, encoded as a "varint"; this decodes to the integer 198, but takes two bytes to encode.
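To make the varint arithmetic concrete, here is a minimal sketch of a decoder in Python. The function `decode_varint` is a hand-written illustration, not a protobuf library API: each byte contributes its low seven bits, least significant group first, and a set high bit means another byte follows.

```python
def decode_varint(data, pos=0):
    """Decode one base-128 varint from data starting at pos.

    Illustrative helper (not a protobuf library call).
    Returns (value, position just past the varint).
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not (byte & 0x80):             # high bit clear: last byte
            return result, pos
        shift += 7


# 198 = 0xC6: payload 70 with the continuation bit set; 1 adds 1 << 7 = 128.
print(decode_varint(bytes([198, 1])))  # (198, 2)
# A short length such as 3 fits in a single byte.
print(decode_varint(bytes([3])))       # (3, 1)
```

Any length of 128 or more needs at least two bytes, and the first of them always has its high bit set.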

And why can't protobuf parse this message?

We'd need to see the rest of the bytes; the library could well be correct if you don't have all the bytes. Do you have all the bytes for the failing case, perhaps as hex or base-64?
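Given the two dumps in the update, one plausible reading is that the payload is being treated as UTF-8 text somewhere between client and server. That is an assumption; the question does not show the HTTP code. The sketch below, in Python with made-up variable names, shows that decoding the client bytes as UTF-8 text and re-encoding them turns the length byte C6 into EF BF BD (the UTF-8 form of the U+FFFD replacement character), which matches the server dump and leaves the parser with a wildly wrong length.

```python
# Client payload from the dump: tag 0A, varint length C6 01, then 198 x 'a'.
client_bytes = bytes([0x0A, 0xC6, 0x01]) + b"a" * 198

# Assumption: some layer converts the binary body to text and back.
# 0xC6 is not valid UTF-8 here, so it becomes U+FFFD (bytes EF BF BD).
as_text = client_bytes.decode("utf-8", errors="replace")
server_bytes = as_text.encode("utf-8")

header = " ".join("{:02x}".format(b) for b in server_bytes[:5])
print(header)                                # 0a ef bf bd 01
print(len(client_bytes), len(server_bytes))  # 201 203

# Decode the corrupted length varint (EF BF BD 01) by hand.
declared_length = 0
shift = 0
for byte in server_bytes[1:]:
    declared_length |= (byte & 0x7F) << shift
    shift += 7
    if not (byte & 0x80):
        break
print(declared_length)  # 3104751, far more than the 198 bytes that follow
```

If that is what is happening, sending the serialized message as a raw binary body (or Base64-encoding it when it has to travel as text) would keep bytes with the high bit set intact.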
