Serialize/Deserialize Large DataSet

Problem description

I have a reporting tool that sends query requests to a server. Once the server has run the query, it sends the result back to the requesting reporting tool. The communication is done using WCF.

The queried data is stored in a DataSet object and is very large, usually around 100 MB.

To speed up the transmission, I serialize the DataSet with BinaryFormatter and compress it. The object transmitted between the server and the reporting tool is a byte array.

However, after a few requests the reporting tool throws an OutOfMemoryException when it tries to deserialize the DataSet. The exception is thrown when I call:

dataSet = (DataSet) formatter.Deserialize(dstream);

dstream is the DeflateStream used to decompress the transmitted compressed byte array.

The exception occurs in a sub-call of formatter.Deserialize, when the byte array is created from the stream.

Is there another way of doing binary serialization that has a better mechanism for preventing this exception?

Implementation:

The method that serializes and compresses the DataSet (used by the server):

public static byte[] Compress(DataSet dataSet)
{
    using (var input = new MemoryStream())
    {
        var binaryFormatter = new BinaryFormatter();
        binaryFormatter.Serialize(input, dataSet);

        using (var output = new MemoryStream())
        {
            using (var compressor = new DeflateStream(output, CompressionLevel.Optimal))
            {
                input.Position = 0;

                var buffer = new byte[1024];

                int read;

                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                    compressor.Write(buffer, 0, read);

                compressor.Close();

                return output.ToArray();
            }
        }
    }
}

The method that decompresses and deserializes the DataSet (used by the reporting tool):

public static DataSet Decompress(byte[] data)
{
    DataSet dataSet;

    using (var input = new MemoryStream(data))
    {
        using (var dstream = new DeflateStream(input, CompressionMode.Decompress))
        {
            var formatter = new BinaryFormatter();
            dataSet = (DataSet) formatter.Deserialize(dstream);
        }
    }

    return dataSet;
}

Stack trace:

at System.Array.InternalCreate(Void* elementType, Int32 rank, Int32* pLengths, Int32* pLowerBounds)
at System.Array.CreateInstance(Type elementType, Int32 length)
at System.Array.UnsafeCreateInstance(Type elementType, Int32 length)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ParseArray(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ParseObject(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Parse(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadArray(BinaryHeaderEnum binaryHeaderEnum)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream)
at DRX.PTClientMonitoring.Infrastructure.Helper.DataSetCompressor.Decompress(Byte[] data) in c:\_develop\PTClientMonitoringTool\PTClientMonitoringTool\Source\DRX.PTClientMonitoring.Infrastructure\Helper\DataSetCompressor.cs:line 51
at DRX.PTClientMonitoring.Reporting.ViewModels.ShellViewModel.<>c__DisplayClassf.<ExecudeDefinedQuery>b__4() in c:\_develop\PTClientMonitoringTool\PTClientMonitoringTool\Source\DRX.PTClientMonitoring.Reporting\ViewModels\ShellViewModel.cs:line 347

Recommended answer

Before serializing, set:

yourDataSet.RemotingFormat = SerializationFormat.Binary;

That should help a lot. By default a DataSet serializes its contents as XML, even when you use BinaryFormatter.
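To see what the setting changes, here is a minimal sketch that serializes the same DataSet twice and prints the payload size for each format. It assumes .NET Framework 4.5 or later, where BinaryFormatter is still available (it is disabled or removed in recent .NET versions). The `RemotingFormatDemo` class, the `BuildSample` helper, and the sample table are made up purely for illustration.

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public static class RemotingFormatDemo
{
    // Serializes the DataSet with BinaryFormatter and returns the payload size in bytes.
    public static long SerializedSize(DataSet dataSet, SerializationFormat format)
    {
        dataSet.RemotingFormat = format; // must be set before Serialize is called
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, dataSet);
            return stream.Length;
        }
    }

    // Builds a small hypothetical table; real report data will look different.
    public static DataSet BuildSample(int rows)
    {
        var table = new DataTable("Samples");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Value", typeof(double));
        for (int i = 0; i < rows; i++)
            table.Rows.Add(i, i * 0.5);

        var dataSet = new DataSet("Report");
        dataSet.Tables.Add(table);
        return dataSet;
    }

    public static void Main()
    {
        DataSet sample = BuildSample(10000);
        Console.WriteLine("Xml:    " + SerializedSize(sample, SerializationFormat.Xml));
        Console.WriteLine("Binary: " + SerializedSize(sample, SerializationFormat.Binary));
    }
}
```

The exact numbers depend on the data, so it is worth measuring with a real report before and after changing the setting.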

Note, however, that DataSet and DataTable are inherently not great candidates for optimization. There are a lot of great serialization tools that will do a much better job of packing your data, but they invariably require strong type models, i.e. List<SomeSpecificType> where SomeSpecificType is a POCO/DTO class. Even WCF only barely tolerates DataTable/DataSet. So if you can get rid of your dependency on DataTable/DataSet, I strongly advise doing so.
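As a rough illustration of the strongly typed route, the sketch below copies a DataTable into a List of DTOs and serializes that list. The `ReportRow` type, its columns, and both helper classes are hypothetical. The built-in DataContractSerializer with a binary XML writer is used here only as a stand-in for whichever serializer you end up choosing, not as a recommendation from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical DTO standing in for one row of the report.
[DataContract]
public class ReportRow
{
    [DataMember(Order = 1)] public int Id { get; set; }
    [DataMember(Order = 2)] public string Name { get; set; }
    [DataMember(Order = 3)] public double Value { get; set; }
}

public static class ReportRowMapper
{
    // Copies each DataRow into a typed object.
    // Assumes Id and Value are non-null int/double columns; Name may be DBNull.
    public static List<ReportRow> FromTable(DataTable table)
    {
        var rows = new List<ReportRow>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            rows.Add(new ReportRow
            {
                Id = (int)row["Id"],
                Name = row["Name"] as string,
                Value = (double)row["Value"]
            });
        }
        return rows;
    }
}

public static class ReportRowSerializer
{
    private static readonly DataContractSerializer Serializer =
        new DataContractSerializer(typeof(List<ReportRow>));

    // Writes the list in binary XML; the caller keeps ownership of the stream.
    public static void Write(Stream output, List<ReportRow> rows)
    {
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(output, null, null, false))
        {
            Serializer.WriteObject(writer, rows);
        }
    }

    // Reads a list previously produced by Write.
    public static List<ReportRow> Read(Stream input)
    {
        using (var reader = XmlDictionaryReader.CreateBinaryReader(input, XmlDictionaryReaderQuotas.Max))
        {
            return (List<ReportRow>)Serializer.ReadObject(reader);
        }
    }
}
```

Because `Write` and `Read` take any Stream, they can target a compressing stream or a file rather than a MemoryStream.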

Another option is to send the data as a Stream. I'm pretty sure WCF supports this natively, and in theory it would let you use a different Stream (not a MemoryStream) that can hold much more data. As a cheap option you could use a temporary file as a scratch area; if that works, you could investigate a custom in-memory stream that stitches multiple buffers together.
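The temporary-file idea could look roughly like the sketch below, which serializes and compresses straight into a file instead of building intermediate MemoryStream and byte[] copies. It assumes .NET Framework 4.5 or later (BinaryFormatter and CompressionLevel.Optimal). The `DataSetFileTransfer` class and its method names are hypothetical, and how the file gets from server to client (for example as the Stream returned by a WCF operation on a binding that uses streamed transfer) is left out.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;

public static class DataSetFileTransfer
{
    // Serializes and compresses the DataSet directly into a temp file
    // and returns the file's path. No intermediate byte[] is created.
    public static string WriteToTempFile(DataSet dataSet)
    {
        dataSet.RemotingFormat = SerializationFormat.Binary;
        string path = Path.GetTempFileName();
        using (var file = File.Create(path))
        using (var compressor = new DeflateStream(file, CompressionLevel.Optimal))
        {
            new BinaryFormatter().Serialize(compressor, dataSet);
        }
        return path;
    }

    // Decompresses and deserializes a file produced by WriteToTempFile.
    public static DataSet ReadFromFile(string path)
    {
        using (var file = File.OpenRead(path))
        using (var decompressor = new DeflateStream(file, CompressionMode.Decompress))
        {
            return (DataSet)new BinaryFormatter().Deserialize(decompressor);
        }
    }
}
```

This only removes the extra buffered copies of the payload. The deserialized DataSet itself still has to fit in memory, and the caller is responsible for deleting the temp file afterwards.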
