C# - Optimising binary serialization for multi-dimensional generic arrays
Question
I have a class that I need to binary serialize. The class contains one field as below:
private T[,] m_data;
These multi-dimensional arrays can be fairly large (hundreds of thousands of elements) and of any primitive type. When I tried standard .NET serialization on an object, the file written to disk was large. I suspect .NET stores a lot of repeated data about element types, and possibly less efficiently than it could.
I have looked around for custom serializers but have not seen any that deal with multi-dimensional generic arrays. I have also experimented with the built-in .NET compression on the byte array of a memory stream after serializing, with some success, but the result was not as quick or as compressed as I had hoped.
My question is: should I try to write a custom serializer that serializes this array optimally for the appropriate type (this seems a little daunting), or should I use standard .NET serialization and add compression?
Any advice on the best approach would be most appreciated, as would links to resources showing how to tackle serialization of a multi-dimensional generic array. As mentioned, the existing examples I have found do not support such structures.
Recommended answer
Here's what I came up with. The code below builds a jagged int[10000][1000] array (10,000 rows of 1,000 ints each) and writes it out using the BinaryFormatter to 2 files - one zipped and one not.
The zipped file is 1.19 MB (1,255,339 bytes); unzipped it is 38.2 MB (40,150,034 bytes).
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;

int width = 1000;
int height = 10000;
// Build 10,000 rows, each holding the values 0..999.
List<int[]> list = new List<int[]>();
for (int i = 0; i < height; i++)
{
    list.Add(Enumerable.Range(0, width).ToArray());
}
int[][] bazillionInts = list.ToArray();
using (FileStream fsZ = new FileStream("c:\\temp_zipped.txt", FileMode.Create))
using (FileStream fs = new FileStream("c:\\temp_notZipped.txt", FileMode.Create))
using (GZipStream gz = new GZipStream(fsZ, CompressionMode.Compress))
{
    BinaryFormatter f = new BinaryFormatter();
    f.Serialize(gz, bazillionInts); // compressed copy
    f.Serialize(fs, bazillionInts); // uncompressed copy
}
I can't think of a better/easy way to do this. The zipped version is pretty damn tight.
I'd go with the BinaryFormatter + GZipStream. Making something custom would not be fun at all.
[edit by MG] I hope you won't be offended by an edit, but the uniform repeated Range(0,width) is skewing things vastly; change to:
int width = 1000;
int height = 10000;
Random rand = new Random(123456); // fixed seed so the run is repeatable
int[,] bazillionInts = new int[width, height];
for (int i = 0; i < width; i++)
    for (int j = 0; j < height; j++)
    {
        bazillionInts[i, j] = rand.Next(50000);
    }
And try it; you'll see temp_notZipped.txt at 40MB, temp_zipped.txt at 62MB. Not so appealing...
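For comparison, a hand-rolled writer for primitive element types need not be long. The following is only a minimal sketch and not part of the original answer: PrimitiveGrid, Write and Read are hypothetical names. It assumes T is a primitive type and that the data is read back on a machine with the same byte order. It writes the two dimensions followed by the raw element bytes, so an int[1000, 10000] takes 8 + 40,000,000 bytes before any compression.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the original post): stores a primitive T[,]
// as two Int32 dimensions followed by the raw element bytes.
public static class PrimitiveGrid
{
    public static void Write<T>(Stream output, T[,] data) where T : struct
    {
        // Buffer.BlockCopy only supports arrays of primitive types.
        if (!typeof(T).IsPrimitive)
            throw new ArgumentException("T must be a primitive type.");

        // Copy the whole array into one byte buffer (host byte order).
        byte[] raw = new byte[Buffer.ByteLength(data)];
        Buffer.BlockCopy(data, 0, raw, 0, raw.Length);

        // The writer is deliberately not disposed so the caller keeps
        // ownership of the stream.
        var writer = new BinaryWriter(output);
        writer.Write(data.GetLength(0));
        writer.Write(data.GetLength(1));
        writer.Write(raw);
        writer.Flush();
    }

    public static T[,] Read<T>(Stream input) where T : struct
    {
        if (!typeof(T).IsPrimitive)
            throw new ArgumentException("T must be a primitive type.");

        var reader = new BinaryReader(input);
        int rows = reader.ReadInt32();
        int cols = reader.ReadInt32();

        var data = new T[rows, cols];
        int byteCount = Buffer.ByteLength(data);

        // ReadBytes returns fewer bytes only if the stream ends early.
        byte[] raw = reader.ReadBytes(byteCount);
        if (raw.Length != byteCount)
            throw new EndOfStreamException("Array data is truncated.");

        Buffer.BlockCopy(raw, 0, data, 0, byteCount);
        return data;
    }
}
```

Because both methods take a plain Stream, the same calls work against a FileStream directly or against a GZipStream wrapped around one, so compression stays optional. The sketch stores no element type, so the reader must already know T.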