FileHelpers throws OutOfMemoryException when parsing large csv file
Problem description
I'm trying to parse a very large csv file with FileHelpers (http://www.filehelpers.net/). The file is 1GB zipped and about 20GB unzipped.
string fileName = @"c:\myfile.csv.gz";
using (var fileStream = File.OpenRead(fileName))
{
    using (GZipStream gzipStream = new GZipStream(fileStream, CompressionMode.Decompress, false))
    {
        using (TextReader textReader = new StreamReader(gzipStream))
        {
            var engine = new FileHelperEngine<CSVItem>();
            CSVItem[] items = engine.ReadStream(textReader);
        }
    }
}
FileHelpers then throws an OutOfMemoryException.
Test failed: Exception of type 'System.OutOfMemoryException' was thrown.
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Text.StringBuilder.ExpandByABlock(Int32 minBlockCharCount)
   at System.Text.StringBuilder.Append(Char value, Int32 repeatCount)
   at System.Text.StringBuilder.Append(Char value)
   at FileHelpers.StringHelper.ExtractQuotedString(LineInfo line, Char quoteChar, Boolean allowMultiline)
   at FileHelpers.DelimitedField.ExtractFieldString(LineInfo line)
   at FileHelpers.FieldBase.ExtractValue(LineInfo line)
   at FileHelpers.RecordInfo.StringToRecord(LineInfo line)
   at FileHelpers.FileHelperEngine`1.ReadStream(TextReader reader, Int32 maxRecords, DataTable dt)
   at FileHelpers.FileHelperEngine`1.ReadStream(TextReader reader)
Is it possible to parse a file this big with FileHelpers? If not, can anyone recommend an approach to parsing files this big? Thanks.
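For context, the `CSVItem` record class itself is not shown in the question. A minimal FileHelpers record class might look like the sketch below; the delimiter and field names here are assumptions, not the asker's actual schema:

```csharp
using System;
using FileHelpers;

// Hypothetical CSVItem: FileHelpers maps each delimited column to a
// public field, in declaration order.
[DelimitedRecord(",")]
public class CSVItem
{
    public int Id;

    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Name;

    [FieldConverter(ConverterKind.Date, "yyyy-MM-dd")]
    public DateTime CreatedOn;
}
```

Note that quoted multiline fields (`[FieldQuoted]` with multiline allowed) are exactly what `ExtractQuotedString` in the stack trace is parsing, so a missing closing quote in a 20GB file can also make the parser buffer far more text than expected.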
Solution: You must work record by record, in this way:
string fileName = @"c:\myfile.csv.gz";
using (var fileStream = File.OpenRead(fileName))
{
    using (GZipStream gzipStream = new GZipStream(fileStream, CompressionMode.Decompress, false))
    {
        using (TextReader textReader = new StreamReader(gzipStream))
        {
            var engine = new FileHelperAsyncEngine<CSVItem>();
            using (engine.BeginReadStream(textReader))
            {
                foreach (var record in engine)
                {
                    // Work with each item
                }
            }
        }
    }
}
With this async approach you will only use the memory for one record at a time, and it will also be much faster.
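If the per-record work is something like bulk database inserts, the same streaming loop can accumulate fixed-size batches so memory stays bounded. This is a sketch; the batch size and the `ProcessBatch` callback are hypothetical, not part of the original answer:

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using FileHelpers;

const int BatchSize = 10000;
var batch = new List<CSVItem>(BatchSize);

using (var fileStream = File.OpenRead(@"c:\myfile.csv.gz"))
using (var gzipStream = new GZipStream(fileStream, CompressionMode.Decompress, false))
using (var textReader = new StreamReader(gzipStream))
{
    var engine = new FileHelperAsyncEngine<CSVItem>();
    using (engine.BeginReadStream(textReader))
    {
        foreach (var record in engine)
        {
            batch.Add(record);
            if (batch.Count == BatchSize)
            {
                ProcessBatch(batch); // hypothetical handler, e.g. a bulk insert
                batch.Clear();       // keep at most BatchSize records in memory
            }
        }
        if (batch.Count > 0)
            ProcessBatch(batch);     // flush the final partial batch
    }
}
```

Only one batch of records is ever alive at a time, so peak memory is roughly `BatchSize` records regardless of the 20GB file size.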