Determining the serialized size of a .NET type and unmanaged memory efficiency


Question

My question is whether it is possible to determine the serialized size (in bytes) of a reference type.

Here is the situation:

I am using the BinaryFormatter class to serialize basic .NET types, for instance:

[Serializable]
public class Foo
{
    public string Foo1 { get; set; }
    public string Foo2 { get; set; } 
}

I serialize each item to a byte[], append that segment to the end of an existing byte[], and then add a carriage return after each segment to delimit the objects.

To deserialize, I use Marshal.ReadByte() as follows:

List<byte> buffer = new List<byte>();

for (int i = 0; i < MapSize; i++)
{
    byte b = Marshal.ReadByte(readPtr, i);

    if (b != delim)  // read until a carriage return is encountered
        buffer.Add(b);
    else
        break;
}

readPtr = readPtr + buffer.Count + 1; // incrementing the pointer for the next object

return buffer.ToArray(); 

I believe that using Marshal.Copy() would be more efficient, but I need to know the length of the serialized byte segment in advance. Is there a way I can reliably compute this from the type that's being serialized, or is there an overall more efficient method I can use?

Also, using a carriage return as a delimiter won't be reliable in the long run. So I am wondering whether there is a more standard way to delimit the objects, either by customizing my BinaryFormatter or by using some other standardized best practice. For instance, is there a specific way that the BinaryFormatter delimits objects if it's serializing, say, a generic List<>?

Recommended answer

There isn't a particularly good way to determine the serialized length beforehand. The specification for the BinaryFormatter protocol is available here: http://msdn.microsoft.com/en-us/library/cc236844(v=prot.10).aspx

I'll save you the trouble of reading it. For your purposes:

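  1. It is built to be an extensible format. This lets you add fields later and still retain some compatibility with earlier implementations. For your purposes, this means that the length of the serialized form is not fixed over time.
  2. It is extremely fragile. The binary format actually encodes the names of the fields within it. If you ever rename a field, the length of the serialized form will change.
  3. The binary format allows a many-to-one relationship between serialized encodings and object data. The same object can be encoded in a number of different ways, producing outputs of many different byte sizes (I don't know why it was designed this way).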

If you want an easy way to do things, just create an array that contains all the objects and serialize that single array. This solves most of your problems. All the issues of delimiting the different objects are handled by the BinaryFormatter. You won't have excessive memory copying. The final output will be more compact because the BinaryFormatter only has to specify the field names once per invocation.
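The single-array approach can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `FooArraySerializer` is a hypothetical helper name, and the sketch assumes .NET Framework 4.x, since BinaryFormatter is marked obsolete in .NET 5 and later and is no longer supported in current .NET releases.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Foo
{
    public string Foo1 { get; set; }
    public string Foo2 { get; set; }
}

// Hypothetical helper, named here for illustration only.
public static class FooArraySerializer
{
    // Serialize the whole array in one call. The formatter tracks the
    // boundaries of each element itself, so no delimiter byte is needed.
    public static byte[] Serialize(Foo[] items)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, items);
            return stream.ToArray();
        }
    }

    // Rebuild the array from the bytes produced by Serialize.
    public static Foo[] Deserialize(byte[] data)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream(data))
        {
            return (Foo[])formatter.Deserialize(stream);
        }
    }
}
```

Because the length of the resulting byte[] is known once serialization finishes, the whole buffer can be moved to or from unmanaged memory with a single Marshal.Copy() call instead of a byte-by-byte loop.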

Finally, I can tell you that the extra memory copy is not the main source of inefficiency in your current implementation. You lose far more to the BinaryFormatter's use of reflection and to the fact that it encodes the field names in the serialized output.

If efficiency is paramount, then I would suggest writing some custom code that encodes the contents of your structures in "plain old data" format. Then you'll have control over how much gets written and how.
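One possible shape for such custom code is sketched below. It is an assumption-laden illustration rather than the answer's own method: `FooCodec` is a hypothetical name, and the layout (a 4-byte length prefix followed by the two strings as written by BinaryWriter) is just one choice among many. The length prefix replaces the carriage-return delimiter, so each record can be read with Marshal.Copy().

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

public class Foo
{
    public string Foo1 { get; set; }
    public string Foo2 { get; set; }
}

// Hypothetical codec, named here for illustration only.
public static class FooCodec
{
    // Encodes one Foo as [4-byte payload length][payload].
    // Null strings are written as empty strings in this sketch.
    public static byte[] Encode(Foo foo)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.UTF8))
        {
            writer.Write(0); // placeholder for the payload length
            writer.Write(foo.Foo1 ?? string.Empty);
            writer.Write(foo.Foo2 ?? string.Empty);
            writer.Flush();

            // Go back and overwrite the placeholder with the real length.
            int payloadLength = (int)stream.Length - sizeof(int);
            stream.Position = 0;
            writer.Write(payloadLength);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads one record from unmanaged memory and advances readPtr past it.
    // Assumes a little-endian platform, matching BinaryWriter's int layout.
    public static Foo Decode(ref IntPtr readPtr)
    {
        int payloadLength = Marshal.ReadInt32(readPtr);
        byte[] payload = new byte[payloadLength];
        Marshal.Copy(readPtr + sizeof(int), payload, 0, payloadLength);
        readPtr += sizeof(int) + payloadLength;

        using (var reader = new BinaryReader(new MemoryStream(payload), Encoding.UTF8))
        {
            return new Foo { Foo1 = reader.ReadString(), Foo2 = reader.ReadString() };
        }
    }
}
```

With this layout, the payload may contain any byte value, including a carriage return, and the field names never appear in the output.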
