Can a Google Protocol Buffer serialized string contain embedded NULL characters?

Problem description

I am using Google Protocol Buffer for message serialization. This is my sample proto file content.

package MessageParam;

message Sample
{
    message WordRec
    {
        optional uint64 id = 1; 
        optional string word = 2;
        optional double value = 3;
    }
    message WordSequence
    {
        repeated WordRec WordSeq = 1;
    }
}

I am trying to serialize the message in C++ as follows:

MessageParam::Sample::WordSequence wordseq;
for(int i =0;i<10;i++)
{
    AddRecords(wordseq.add_wordseq());
}
std::string str = wordseq.SerializeAsString();

After executing the above statement, the size of str is 430, and it contains embedded null characters. When I try to assign this str to a std::wstring, the std::wstring is cut off at the first null character.

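void AddRecords(MessageParam::Sample::WordRec* wordrec)
{
    int id;
    cin>>id;
    wordrec->set_id(id);
    // Skip the whitespace/newline left behind by cin>>id so getline reads the word.
    cin>>std::ws;
    getline(cin, *wordrec->mutable_word());
    // The proto field is a double, so read a double rather than a long.
    double value;
    cin>>value;
    wordrec->set_value(value);
}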

The value of wordseq.DebugString() is:

WordSeq {
  id: 4
  word: "software"
  value: 1
}
WordSeq {
  id: 19
  word: "technical"
  value: 0.70992374420166016
}
WordSeq {
  id: 51
  word: "hardware"
  value: 0.626017153263092
}

How can I serialize "wordseq" as a string that contains embedded NULL characters?

Solution

You should not try to store a Protobuf in a wstring. A wstring is for storing Unicode text, but a serialized protobuf is not Unicode text or any other kind of text; it is raw bytes. You should keep it in byte form. If you really need to store a Protobuf in a textual context, you should base64-encode it first.
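To make the base64 suggestion concrete, here is a minimal sketch of an encoder. Base64Encode is a hypothetical helper written for this illustration, not a protobuf API; it assumes the standard base64 alphabet with '=' padding. If your project already ships a base64 routine, that is probably the better choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of protobuf): encodes arbitrary bytes,
// including embedded '\0' bytes, as standard base64 with '=' padding.
std::string Base64Encode(const std::string& bytes) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);
    std::size_t i = 0;
    // Encode each full 3-byte group as 4 output characters.
    for (; i + 2 < bytes.size(); i += 3) {
        std::uint32_t n = (static_cast<unsigned char>(bytes[i]) << 16) |
                          (static_cast<unsigned char>(bytes[i + 1]) << 8) |
                          static_cast<unsigned char>(bytes[i + 2]);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
    }
    // Encode the 1 or 2 leftover bytes and pad the group with '='.
    std::size_t rest = bytes.size() - i;
    if (rest == 1) {
        std::uint32_t n = static_cast<unsigned char>(bytes[i]) << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        std::uint32_t n = (static_cast<unsigned char>(bytes[i]) << 16) |
                          (static_cast<unsigned char>(bytes[i + 1]) << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    // Every 3 input bytes (rounded up) become exactly 4 output characters.
    assert(out.size() == (bytes.size() + 2) / 3 * 4);
    return out;
}
```

With the question's message this would be called as Base64Encode(wordseq.SerializeAsString()). The result contains only ASCII letters, digits, '+', '/' and '=', so it has no embedded null characters and can be copied into a std::wstring character by character. The receiving side has to base64-decode it before calling ParseFromString().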

Arguably Protobufs' use of std::string to store bytes (rather than text) is confusing. Perhaps it should have used std::vector<unsigned char> all along. You should treat protobufs' std::strings like you would std::vector<unsigned char>.
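As a rough illustration of treating the serialized std::string as a byte buffer, the sketch below copies it with its full length instead of going through a null-terminated pointer. ToBytes, FromBytes and LengthSeenThroughCStr are hypothetical names chosen for this example. The last one shows one likely way truncation creeps in; the question does not show the actual conversion that was used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Copies every byte, including embedded '\0' bytes, because it uses the
// string's iterators rather than a null-terminated pointer.
std::vector<unsigned char> ToBytes(const std::string& serialized) {
    std::vector<unsigned char> bytes(serialized.begin(), serialized.end());
    assert(bytes.size() == serialized.size());  // nothing was cut off
    return bytes;
}

// Rebuilds a std::string from raw bytes, e.g. to pass to ParseFromString().
std::string FromBytes(const std::vector<unsigned char>& bytes) {
    return std::string(bytes.begin(), bytes.end());
}

// Illustrates the pitfall: constructing from c_str() without a length
// stops at the first '\0', so the reported size shrinks.
std::size_t LengthSeenThroughCStr(const std::string& serialized) {
    return std::string(serialized.c_str()).size();
}
```

For a five-byte buffer such as std::string("ab\0cd", 5), ToBytes keeps all five bytes, while LengthSeenThroughCStr reports only two. The same rule applies when writing the buffer to a file or socket: pass data() together with size() rather than relying on a terminator.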
