Why is the performance of BufferedReader so much worse than BufferedInputStream?


Problem Description



I understand that using a BufferedReader (wrapping a FileReader) is going to be significantly slower than using a BufferedInputStream (wrapping a FileInputStream), because the raw bytes have to be converted to characters. But I don't understand why it is so much slower! Here are the two code samples that I'm using:

BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
try {
  byte[] byteBuffer = new byte[bufferSize];
  int numberOfBytes;
  do {
    numberOfBytes = inputStream.read(byteBuffer, 0, bufferSize);
  } while (numberOfBytes >= 0);
}
finally {
  inputStream.close();
}

and:

BufferedReader reader = new BufferedReader(new FileReader(filename), bufferSize);
try {
  char[] charBuffer = new char[bufferSize];
  int numberOfChars;
  do {
    numberOfChars = reader.read(charBuffer, 0, bufferSize);
  } while (numberOfChars >= 0);
}
finally {
  reader.close();
}

I've tried tests using various buffer sizes, all with a 150 megabyte file. Here are the results (buffer size is in bytes; times are in milliseconds):

Buffer   Input
  Size  Stream  Reader
 4,096    145     497
 8,192    125     465
16,384     95     515
32,768     74     506
65,536     64     531

As can be seen, the fastest time for the BufferedInputStream (64 ms) is seven times faster than the fastest time for the BufferedReader (465 ms). As I stated above, I don't have an issue with a significant difference; but this much difference just seems unreasonable.

My question is: does anyone have a suggestion for how to improve the performance of the BufferedReader, or an alternative mechanism?

Solution

The BufferedReader has to convert the bytes into chars. This byte-by-byte parsing and copying into a larger type is expensive relative to a straight copy of blocks of data.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

byte[] bytes = new byte[150 * 1024 * 1024];
Arrays.fill(bytes, (byte) '\n');

for (int i = 0; i < 10; i++) {
    long start = System.nanoTime();
    StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    long time = System.nanoTime() - start;
    System.out.printf("Time to decode %,d MB was %,d ms%n",
            bytes.length / 1024 / 1024, time / 1000000);
}

prints

Time to decode 150 MB was 226 ms
Time to decode 150 MB was 167 ms

NOTE: having to do this decoding intermixed with system calls can slow down both operations (as system calls can disturb the cache).
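One way to act on that note is to keep the I/O and the decoding separate: read the file as raw bytes first, then decode the whole buffer in a single pass. A minimal sketch of that idea, assuming a UTF-8 file (the `readAndDecode` helper is hypothetical, not from the original post):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BulkDecode {
    // Hypothetical helper: slurp the file as bytes, then decode once.
    // This avoids interleaving charset decoding with read() system calls.
    static CharBuffer readAndDecode(String filename) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAndDecode(tmp.toString()).length()); // prints 12
        Files.delete(tmp);
    }
}
```

The trade-off is memory: the entire file (150 MB in the question's test) must fit in the heap at once. For larger files, decoding fixed-size chunks with a `CharsetDecoder` keeps memory bounded while still batching the conversion work.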

