Read file faster by multiple readers
Question
I have a large file with about 2 million lines, and reading it is the bottleneck in my code. Any suggestions or expert opinions on how to read the file faster are welcome. The order in which lines are read from the text file does not matter. All lines are fixed-length records separated by the pipe character '|'.
What have I tried? I started using StreamReader from parallel threads and made sure the resource was locked properly, but this approach failed: the threads ended up fighting over the single StreamReader and wasting more time on locking, which made the code even slower.
One intuitive approach is to split the file into pieces and then read them, but I want to leave the file intact and still somehow read it faster.
Recommended answer
I would try maximizing the buffer size. The default size is 1024; increasing it should improve performance. I would suggest experimenting with different buffer sizes.
StreamReader(Stream, Encoding, Boolean, Int32): Initializes a new instance of the StreamReader class for the specified stream, with the specified character encoding, byte order mark detection option, and buffer size.
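As a rough illustration of the constructor above, here is a minimal sketch that reads a file line by line through a StreamReader with an explicit buffer size. The class name BufferedLineReader, the method CountLines, the UTF-8 encoding, and the use of FileOptions.SequentialScan are assumptions made for this example, not part of the original answer. Whether a larger buffer actually helps depends on the disk and the workload, so it is worth measuring a few sizes.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class BufferedLineReader
{
    // Reads every line of the file through a StreamReader created with the
    // (Stream, Encoding, Boolean, Int32) constructor and returns the line count.
    public static long CountLines(string path, int bufferSize)
    {
        long count = 0;

        // Assumption: the file is UTF-8 and is read once from start to end,
        // so SequentialScan is passed as a hint to the operating system.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize,
                                           FileOptions.SequentialScan))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, bufferSize))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Each record could be split on the pipe here, e.g. line.Split('|').
                count++;
            }
        }

        return count;
    }
}
```

A call such as BufferedLineReader.CountLines("records.txt", 64 * 1024) (the file name is a placeholder) could be timed against the default buffer size to see whether the change makes a difference for a given file.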