Why is the performance of BufferedReader so much worse than BufferedInputStream?
Problem description
I understand that using a BufferedReader (wrapping a FileReader) is going to be significantly slower than using a BufferedInputStream (wrapping a FileInputStream), because the raw bytes have to be converted to characters. But I don't understand why it is so much slower! Here are the two code samples that I'm using:
BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
try {
    byte[] byteBuffer = new byte[bufferSize];
    int numberOfBytes;
    do {
        numberOfBytes = inputStream.read(byteBuffer, 0, bufferSize);
    } while (numberOfBytes >= 0);
} finally {
    inputStream.close();
}
and:
BufferedReader reader = new BufferedReader(new FileReader(filename), bufferSize);
try {
    char[] charBuffer = new char[bufferSize];
    int numberOfChars;
    do {
        numberOfChars = reader.read(charBuffer, 0, bufferSize);
    } while (numberOfChars >= 0);
} finally {
    reader.close();
}
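A loop like the ones above can be wrapped in a simple timing harness to produce the measurements below; this is a minimal sketch (the `ReadTimer` class name is hypothetical, and `filename`/`bufferSize` are supplied by the caller):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadTimer {
    // Times one full pass over the file with the given buffer size and
    // returns the elapsed time in milliseconds. The data itself is
    // discarded; only read throughput is measured.
    public static long timeRead(String filename, int bufferSize) throws IOException {
        long start = System.nanoTime();
        BufferedInputStream in = new BufferedInputStream(new FileInputStream(filename));
        try {
            byte[] buffer = new byte[bufferSize];
            while (in.read(buffer, 0, bufferSize) >= 0) {
                // keep reading until end of file
            }
        } finally {
            in.close();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

The same shape of harness, with a `char[]` buffer and a `BufferedReader`, produces the Reader column.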
I've tried tests using various buffer sizes, all with a 150 megabyte file. Here are the results (buffer size is in bytes; times are in milliseconds):
Buffer Size    InputStream    Reader
     4,096            145       497
     8,192            125       465
    16,384             95       515
    32,768             74       506
    65,536             64       531
As can be seen, the fastest time for the BufferedInputStream (64 ms) is seven times faster than the fastest time for the BufferedReader (465 ms). As I stated above, I don't have an issue with a significant difference; but this much difference just seems unreasonable.
My question is: does anyone have a suggestion for how to improve the performance of the BufferedReader, or an alternative mechanism?
Solution

The BufferedReader has to convert the bytes into chars. This byte-by-byte parsing and copying into a larger type is expensive relative to a straight copy of blocks of data.
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

byte[] bytes = new byte[150 * 1024 * 1024];
Arrays.fill(bytes, (byte) '\n');
for (int i = 0; i < 10; i++) {
    long start = System.nanoTime();
    StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    long time = System.nanoTime() - start;
    System.out.printf("Time to decode %,d MB was %,d ms%n",
            bytes.length / 1024 / 1024, time / 1000000);
}
prints
Time to decode 150 MB was 226 ms
Time to decode 150 MB was 167 ms
NOTE: Having to do this decoding intermixed with system calls can slow down both operations (as system calls can disturb the cache).
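Building on that observation, one alternative mechanism is to separate the two costs: read the raw bytes first (the fast path measured above), then decode them to chars in a single bulk operation rather than through repeated `Reader.read()` calls. A minimal sketch, assuming the file fits in memory and is valid UTF-8 (the `BulkDecode` class name is hypothetical):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BulkDecode {
    // Reads the whole file as bytes, then decodes to chars in one pass.
    // This keeps the byte-to-char conversion out of the read loop, so the
    // I/O runs at InputStream speed and the decode runs over a single
    // contiguous buffer.
    public static CharBuffer readAndDecode(String filename) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    }
}
```

For files too large to hold in memory, the same idea applies chunk-wise: fill a large `byte[]` from a `BufferedInputStream` and decode each chunk in bulk (taking care with multi-byte sequences split across chunk boundaries, e.g. via `CharsetDecoder`).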