Does protobuf-net have built-in compression for serialization?

Question

I was doing some comparison between BinaryFormatter and protobuf-net serializer and was quite pleased with what I found, but what was strange is that protobuf-net managed to serialize the objects into a smaller byte array than what I would get if I just wrote the value of every property into an array of bytes without any metadata.

I know protobuf-net supports string interning if you set AsReference to true, but I'm not doing that in this case, so does protobuf-net provide some compression by default?

Here's some code you can run to see for yourself:

var simpleObject = new SimpleObject
                       {
                           Id = 10,
                           Name = "Yan",
                           Address = "Planet Earth",
                           Scores = Enumerable.Range(1, 10).ToList()
                       };

using (var memStream = new MemoryStream())
{
    var binaryWriter = new BinaryWriter(memStream);
    // 4 bytes for the int
    binaryWriter.Write(simpleObject.Id);
    // 3 bytes + 1 byte for the length prefix
    // (BinaryWriter length-prefixes strings; it does not null-terminate them)
    binaryWriter.Write(simpleObject.Name);
    // 12 bytes + 1 byte for the length prefix
    binaryWriter.Write(simpleObject.Address);
    // 40 bytes for 10 ints
    simpleObject.Scores.ForEach(binaryWriter.Write);
    binaryWriter.Flush();

    // 61 bytes, which is what I expect
    Console.WriteLine("BinaryWriter wrote [{0}] bytes", memStream.Length);
}

using (var memStream = new MemoryStream())
{
    ProtoBuf.Serializer.Serialize(memStream, simpleObject);

    // 41 bytes!
    Console.WriteLine("Protobuf serialize wrote [{0}] bytes", memStream.Length);
}

Forgot to add: the SimpleObject class looks like this:

[Serializable]
[DataContract]
public class SimpleObject
{
    [DataMember(Order = 1)]
    public int Id { get; set; }

    [DataMember(Order = 2)]
    public string Name { get; set; }

    [DataMember(Order = 3)]
    public string Address { get; set; }

    [DataMember(Order = 4)]
    public List<int> Scores { get; set; }
}

Recommended answer

No, it does not; there is no "compression" as such specified in the protobuf spec. However, it does (by default) use "varint encoding": a variable-length encoding for integer data in which small values use less space, so 0-127 take 1 byte plus the header. Note that varint by itself goes pretty loopy for negative numbers (a negative int is sign-extended and always takes 10 bytes), so "zigzag" encoding is also supported, which allows small-magnitude numbers to stay small (basically, it interleaves positive and negative values: 0, -1, 1, -2, 2, ...).
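
To make the varint and zigzag behaviour concrete, here is a minimal hand-rolled sketch. It is not protobuf-net's own code; the class and method names (VarintDemo, EncodeVarint, ZigZag) are made up for illustration, and it only assumes the base-128 varint layout described in the protobuf wire format.

```csharp
using System;
using System.Collections.Generic;

public static class VarintDemo
{
    // Base-128 varint: 7 payload bits per byte, lowest group first.
    // The high bit of each byte says "more bytes follow".
    public static byte[] EncodeVarint(ulong value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // ZigZag maps 0, -1, 1, -2, 2, ... onto 0, 1, 2, 3, 4, ...
    public static uint ZigZag(int value)
    {
        return unchecked((uint)((value << 1) ^ (value >> 31)));
    }

    public static void Main()
    {
        int negative = -1;
        // A plain varint sees a negative int as a sign-extended 64-bit value.
        ulong signExtended = unchecked((ulong)(long)negative);

        Console.WriteLine(EncodeVarint(10).Length);               // 1
        Console.WriteLine(EncodeVarint(127).Length);              // 1
        Console.WriteLine(EncodeVarint(128).Length);              // 2
        Console.WriteLine(EncodeVarint(signExtended).Length);     // 10
        Console.WriteLine(EncodeVarint(ZigZag(negative)).Length); // 1
    }
}
```

This is why the ten Scores values (1 to 10) cost one payload byte each rather than the four bytes BinaryWriter spends per int.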

Actually, in your case for Scores you should also look at "packed" encoding, which requires either [ProtoMember(4, IsPacked = true)] or the equivalent via TypeModel in v2 (v2 supports either approach). This avoids the overhead of a header per value, by writing a single header and the combined length. "Packed" can be used with varint/zigzag. There are also fixed-length encodings for scenarios where you know the values are likely large and unpredictable.
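
As a rough illustration of what "packed" saves, the sketch below estimates the encoded size of the Scores field both ways. It is a back-of-the-envelope calculation, not protobuf-net code: the helper names (PackedSizeDemo, VarintSize, UnpackedSize, PackedSize) are hypothetical, and it assumes non-negative values written as plain varints.

```csharp
using System;
using System.Linq;

public static class PackedSizeDemo
{
    // Number of bytes a non-negative value occupies as a varint.
    public static int VarintSize(ulong value)
    {
        int size = 1;
        while (value >= 0x80) { value >>= 7; size++; }
        return size;
    }

    // Unpacked repeated field: every element carries its own header.
    public static int UnpackedSize(int fieldNumber, int[] values)
    {
        int header = VarintSize((ulong)(fieldNumber << 3)); // wire type 0 (varint)
        return values.Sum(v => header + VarintSize((ulong)v));
    }

    // Packed: one header, one combined length, then the values back to back.
    public static int PackedSize(int fieldNumber, int[] values)
    {
        int header = VarintSize((ulong)((fieldNumber << 3) | 2)); // wire type 2 (length-delimited)
        int payload = values.Sum(v => VarintSize((ulong)v));
        return header + VarintSize((ulong)payload) + payload;
    }

    public static void Main()
    {
        int[] scores = Enumerable.Range(1, 10).ToArray();
        Console.WriteLine(UnpackedSize(4, scores)); // 20
        Console.WriteLine(PackedSize(4, scores));   // 12
    }
}
```

Under these assumptions the Scores field shrinks from 20 bytes to 12, so the 41-byte message in the question would come out at roughly 33 bytes with packing enabled.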

Note also: if your data has lots of text, you may benefit from additionally running it through gzip or deflate; if it doesn't, then both gzip and deflate could cause it to get bigger.
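
A quick way to check which side of that trade-off your data falls on is to gzip the serialized bytes and compare lengths. The sketch below uses only System.IO.Compression from the base class library; the GzipDemo class and the two sample payloads are made up for illustration and stand in for real protobuf-net output.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

public static class GzipDemo
{
    public static byte[] Gzip(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the result so the trailer is written.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // A tiny payload with no repetition, about the size of the message in the question.
        byte[] small = Enumerable.Range(1, 41).Select(i => (byte)i).ToArray();
        // A text-heavy payload with lots of repetition.
        byte[] texty = Encoding.UTF8.GetBytes(
            string.Concat(Enumerable.Repeat("Planet Earth ", 100)));

        Console.WriteLine("small: {0} -> {1} bytes", small.Length, Gzip(small).Length);
        Console.WriteLine("texty: {0} -> {1} bytes", texty.Length, Gzip(texty).Length);
    }
}
```

The small payload grows because the gzip header and trailer alone add 18 bytes, while the repetitive text shrinks considerably; measuring on representative data is the only reliable guide.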

An overview of the wire format is here; it isn't very tricky to understand, and may help you plan how best to further optimize.