Fast serialization/deserialization of structs

Question

I have a huge amount of geographic data represented in a simple object structure consisting only of structs. All of my fields are of value type.

public struct Child
{
   readonly float X;
   readonly float Y;
   readonly int myField;
}

public struct Parent
{
   readonly int id;
   readonly int field1;
   readonly int field2;
   readonly Child[] children;
}



The data is chunked up nicely into small Parent[] arrays. Each array contains a few thousand Parent instances. I have far too much data to keep it all in memory, so I need to swap these chunks to disk and back. (One file would be approximately 200-300 KB.)

What would be the most efficient way of serializing/deserializing a Parent[] to a byte[] for dumping to disk and reading back? Concerning speed, I am particularly interested in fast deserialization; write speed is not that critical.

Would a simple BinarySerializer be good enough? Or should I hack around with StructLayout (see accepted answer)? I am not sure whether that would work with the array field Parent.children.

UPDATE: In response to the comments: yes, the objects are immutable (code updated), and indeed the children field is not a value type. 300 KB does not sound like much, but I have zillions of files like that, so speed does matter.

Recommended answer

BinarySerializer is a very general serializer. It will not perform as well as a custom implementation.

Fortunately for you, your data consists of structs only. This means that you will be able to fix a StructLayout for Child and just bit-copy the children array, using unsafe code, from a byte[] you have read from disk.
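As a rough illustration of fixing the layout and bit-copying, here is a minimal sketch. It swaps the unsafe pointer code for `MemoryMarshal` (available in .NET Core 2.1 and later, or through the System.Memory package), which performs the same reinterpretation in safe code. The `ChildCodec` class, the public fields, and the constructor are assumptions made for the example and are not part of the original question.

```csharp
using System;
using System.Runtime.InteropServices;

// Sequential layout with Pack = 1 pins the on-disk layout:
// two floats and one int, 12 bytes per Child, no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public readonly struct Child
{
    public readonly float X;
    public readonly float Y;
    public readonly int MyField;

    public Child(float x, float y, int myField) { X = x; Y = y; MyField = myField; }
}

// Hypothetical helper, named for this example only.
public static class ChildCodec
{
    // Views the array's memory as bytes and copies it into a new byte[].
    public static byte[] ToBytes(Child[] children) =>
        MemoryMarshal.AsBytes(children.AsSpan()).ToArray();

    // Views the bytes as Child values and copies them into a new Child[].
    public static Child[] FromBytes(ReadOnlySpan<byte> bytes) =>
        MemoryMarshal.Cast<byte, Child>(bytes).ToArray();
}
```

The bytes are written in the machine's native byte order, so this only round-trips between machines of the same endianness. That is usually acceptable for a local swap file.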

For the parents it is not that easy, because you need to treat the children separately. I recommend you use unsafe code to copy the bit-copyable fields from the byte[] you read, and deserialize the children separately.
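One possible way to handle the parents is a length-prefixed chunk format: the parent count, then for each parent its three int fields, a child count, and the raw child bytes. The sketch below assumes that format; it is not something the question specifies. `ParentCodec`, the public mutable fields, and the use of `MemoryMarshal` and `BinaryPrimitives` instead of pointers are all choices made for the example.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Child { public float X; public float Y; public int MyField; }

public struct Parent
{
    public int Id, Field1, Field2;
    public Child[] Children;   // reference field: cannot be bit-copied with the rest
}

// Hypothetical helper; the chunk format is an assumption of this sketch.
public static class ParentCodec
{
    public static byte[] Write(Parent[] parents)
    {
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(parents.Length);
            foreach (var p in parents)
            {
                w.Write(p.Id);
                w.Write(p.Field1);
                w.Write(p.Field2);
                w.Write(p.Children.Length);
                // Children are blittable, so their memory is written as-is.
                ReadOnlySpan<byte> raw = MemoryMarshal.AsBytes(p.Children.AsSpan());
                w.Write(raw);
            }
            w.Flush();
            return ms.ToArray();
        }
    }

    public static Parent[] Read(byte[] data)
    {
        ReadOnlySpan<byte> span = data;
        int pos = 0;
        var parents = new Parent[ReadInt(span, ref pos)];
        int childSize = Unsafe.SizeOf<Child>();
        for (int i = 0; i < parents.Length; i++)
        {
            parents[i].Id = ReadInt(span, ref pos);
            parents[i].Field1 = ReadInt(span, ref pos);
            parents[i].Field2 = ReadInt(span, ref pos);
            int byteCount = ReadInt(span, ref pos) * childSize;
            // One block copy per parent instead of field-by-field reads.
            parents[i].Children =
                MemoryMarshal.Cast<byte, Child>(span.Slice(pos, byteCount)).ToArray();
            pos += byteCount;
        }
        return parents;
    }

    // BinaryWriter always writes little-endian integers, so read them the same way.
    private static int ReadInt(ReadOnlySpan<byte> span, ref int pos)
    {
        int value = BinaryPrimitives.ReadInt32LittleEndian(span.Slice(pos, 4));
        pos += 4;
        return value;
    }
}
```

Reading costs one small allocation per parent for its Children array. If that becomes the bottleneck, storing all children of a chunk in one contiguous block and giving each parent an offset and count would avoid it.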

Did you consider mapping all the children into memory using memory-mapped files? You could then reuse the operating system's cache facility and not deal with reading and writing at all.
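A minimal sketch of the memory-mapped approach, assuming a file that contains nothing but tightly packed Child records. The `MappedChildren` class and its parameters are hypothetical names for the example; `MemoryMappedFile.CreateFromFile`, `CreateViewAccessor`, and `ReadArray` are the standard System.IO.MemoryMappedFiles calls.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Child { public float X; public float Y; public int MyField; }

// Hypothetical helper, named for this example only.
public static class MappedChildren
{
    // Reads `count` records starting at record index `firstIndex`.
    // Assumes the file holds only packed Child records.
    public static Child[] Read(string path, long firstIndex, int count)
    {
        int size = Marshal.SizeOf<Child>();
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewAccessor(
                   firstIndex * size, (long)count * size, MemoryMappedFileAccess.Read))
        {
            var result = new Child[count];
            // Copies straight from the mapped pages into the array.
            view.ReadArray(0, result, 0, count);
            return result;
        }
    }
}
```

In a long-running process you would probably keep the mapping open and create views on demand, so that the operating system's page cache does the swapping, rather than opening the file on every call as this sketch does.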

Zero-copy deserialization of a Child[] looks like this:

// Must run inside an unsafe context (compile with /unsafe).
byte[] bytes = GetFromDisk();
fixed (byte* bytePtr = bytes) {
 Child* childPtr = (Child*)bytePtr;
 // Now treat childPtr as an array:
 var x123 = childPtr[123].X;

 // If we need a real array that can be passed around, we need to copy:
 int length = GetLengthOfDeserializedData();
 var childArray = new Child[length];
 for (int i = 0; i < length; i++) {
  childArray[i] = childPtr[i];
 }
}
