Read file faster by multiple readers


Problem description

I have a large file with about 2 million lines. Reading the file is the bottleneck in my code. Any suggestions or expert opinions on how to read the file faster are welcome. The order in which lines are read from the text file is unimportant. All lines are fixed-length records separated by the pipe character '|'.

What have I tried? I started reading with StreamReader in parallel and made sure the resource was locked properly, but this approach failed: I now had multiple threads fighting to get hold of the single StreamReader and spending more time on locking, which slowed the code down even further.

One intuitive approach is to split the file and then read the pieces, but I want to leave the file intact and still be able to read it faster somehow.

Recommended answer

I would try increasing the buffer size. The default size is 1024, and increasing it should improve performance. I would suggest experimenting with several buffer sizes to see which works best for your file.

StreamReader(Stream, Encoding, Boolean, Int32): Initializes a new instance of the StreamReader class for the specified stream, with the specified character encoding, byte order mark detection option, and buffer size.
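
As a rough illustration of the constructor above, here is a minimal sketch of reading a pipe-delimited file with a larger buffer. The class name LargeBufferReader, the method CountFields, the UTF-8 encoding, and the idea of counting fields are assumptions made for the example; they are not from the question. The FileOptions.SequentialScan hint is also an addition and may or may not help on a given system.

```csharp
using System;
using System.IO;
using System.Text;

public static class LargeBufferReader
{
    // Hypothetical helper: reads every line with a caller-chosen buffer size
    // and returns the total number of '|'-separated fields seen.
    public static long CountFields(string path, int bufferSize)
    {
        long fields = 0;

        // Open the file for sequential reading with the same buffer size
        // (SequentialScan is only a hint to the operating system).
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize,
                                           FileOptions.SequentialScan))
        // StreamReader(Stream, Encoding, Boolean, Int32): the last argument
        // is the buffer size discussed in the answer.
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, bufferSize))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Assumed record layout: fields separated by '|'.
                fields += line.Split('|').Length;
            }
        }

        return fields;
    }
}
```

A call such as LargeBufferReader.CountFields(path, 1 << 16) would use a 64 KB buffer. Timing a few sizes (for example 4 KB, 64 KB, and 1 MB) against the real file is probably the only reliable way to see whether the buffer size matters in your case.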
