Can a Google Protocol Buffer serialized string contain embedded NULL characters?
I am using Google Protocol Buffers for message serialization. This is the content of my sample .proto file:
package MessageParam;

message Sample
{
    message WordRec
    {
        optional uint64 id = 1;
        optional string word = 2;
        optional double value = 3;
    }
    message WordSequence
    {
        repeated WordRec WordSeq = 1;
    }
}
I am trying to serialize the message in C++ as follows:
MessageParam::Sample::WordSequence wordseq;
for (int i = 0; i < 10; i++)
{
    AddRecords(wordseq.add_wordseq());
}
std::string str = wordseq.SerializeAsString();
After executing the above statement, the size of str is 430 and it contains embedded null characters. When I try to assign this str to a std::wstring, the std::wstring is cut off at the first null character.
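void AddRecords(MessageParam::Sample::WordRec* wordrec)
{
    int id;
    cin >> id;
    wordrec->set_id(id);
    // Skip the newline left behind by cin >> id so getline reads the word.
    getline(cin >> ws, *wordrec->mutable_word());
    double value;  // matches the proto field type (optional double value)
    cin >> value;
    wordrec->set_value(value);
}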
Value of wordseq.DebugString() is:
WordSeq {
  id: 4
  word: "software"
  value: 1
}
WordSeq {
  id: 19
  word: "technical"
  value: 0.70992374420166016
}
WordSeq {
  id: 51
  word: "hardware"
  value: 0.626017153263092
}
How can I serialize wordseq as a string that contains embedded NULL characters?
You should not try to store a Protobuf in a wstring. wstring is for storing Unicode text, but a serialized protobuf is not Unicode text, nor any other kind of text; it is raw bytes. You should keep it in byte form. If you really need to store a Protobuf in a textual context, you should base64-encode it first.
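As a rough illustration of the base64 suggestion, here is a minimal sketch of an encoder using the standard RFC 4648 alphabet with '=' padding. Base64Encode and ToWide are hypothetical helper names, not part of the protobuf API; in real code the input would be the std::string returned by SerializeAsString(), and a well-tested library encoder is probably preferable to a hand-written one.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of protobuf): encodes raw bytes, including
// embedded '\0' bytes, as standard base64 text with '=' padding.
std::string Base64Encode(const std::string& bytes) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((bytes.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Encode each full 3-byte group as 4 output characters.
    while (i + 2 < bytes.size()) {
        unsigned int n = (static_cast<unsigned char>(bytes[i]) << 16) |
                         (static_cast<unsigned char>(bytes[i + 1]) << 8) |
                         static_cast<unsigned char>(bytes[i + 2]);
        out.push_back(kAlphabet[(n >> 18) & 63]);
        out.push_back(kAlphabet[(n >> 12) & 63]);
        out.push_back(kAlphabet[(n >> 6) & 63]);
        out.push_back(kAlphabet[n & 63]);
        i += 3;
    }
    // Encode the 1 or 2 leftover bytes and pad to a multiple of 4.
    const std::size_t rest = bytes.size() - i;
    if (rest == 1) {
        unsigned int n = static_cast<unsigned char>(bytes[i]) << 16;
        out.push_back(kAlphabet[(n >> 18) & 63]);
        out.push_back(kAlphabet[(n >> 12) & 63]);
        out += "==";
    } else if (rest == 2) {
        unsigned int n = (static_cast<unsigned char>(bytes[i]) << 16) |
                         (static_cast<unsigned char>(bytes[i + 1]) << 8);
        out.push_back(kAlphabet[(n >> 18) & 63]);
        out.push_back(kAlphabet[(n >> 12) & 63]);
        out.push_back(kAlphabet[(n >> 6) & 63]);
        out.push_back('=');
    }
    return out;
}

// Hypothetical helper: base64 output is plain ASCII, so widening it
// character by character into a std::wstring loses nothing.
std::wstring ToWide(const std::string& ascii) {
    return std::wstring(ascii.begin(), ascii.end());
}
```

With helpers like these, ToWide(Base64Encode(str)) would give a std::wstring that contains no null characters at all; the receiving side has to base64-decode it before parsing the message.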
Arguably Protobufs' use of std::string to store bytes (rather than text) is confusing. Perhaps it should have used std::vector<unsigned char> all along. You should treat protobufs' std::strings like you would std::vector<unsigned char>.
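To make the "treat it as bytes" advice concrete, the sketch below copies a std::string to and from a std::vector<unsigned char> using its full length, and contrasts that with going through a C string. The function names are hypothetical, and the C-string route is only one plausible cause of the truncation described in the question, since the question does not show the conversion code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies every byte, including embedded '\0' bytes,
// because the iterator range covers the string's full size().
std::vector<unsigned char> ToBytes(const std::string& serialized) {
    return std::vector<unsigned char>(serialized.begin(), serialized.end());
}

// Hypothetical helper: rebuilds the std::string from the byte vector,
// again using the explicit range, so nothing is lost.
std::string FromBytes(const std::vector<unsigned char>& bytes) {
    return std::string(bytes.begin(), bytes.end());
}

// Shows the pitfall: constructing from a const char* treats the data as a
// null-terminated C string and stops at the first '\0'.
std::string ViaCString(const std::string& serialized) {
    return std::string(serialized.c_str());
}
```

For a five-byte input such as std::string("ab\0cd", 5), ToBytes keeps all five bytes, while ViaCString keeps only the two bytes before the null. A string rebuilt with FromBytes can then be handed to ParseFromString on the receiving side.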