Behind the scenes of Java's BufferedInputStream


Problem description

To start with, I understand the concept of buffering as a wrapper around, for instance, FileInputStream that acts as a temporary container for contents read (let's take the read scenario) from an underlying stream, in this case FileInputStream.

  1. Say there are 100 bytes to read from a stream (with a file as the source).
  2. Without buffering, the code (the read method of BufferedInputStream) has to make 100 reads (one byte at a time).
  3. With buffering, depending on the buffer size, the code makes <= 100 reads.
  4. Let's assume the buffer size is 50.
  5. So the code reads the buffer (as a source) only twice to read the contents of the file.
  6. Now, as the FileInputStream is the ultimate source of the data (though wrapped by BufferedInputStream), i.e. the file that contains 100 bytes, wouldn't it have to read 100 times to read 100 bytes? The code calls the read method of BufferedInputStream, but that call is passed on to the read method of FileInputStream, which needs to make 100 read calls. This is the point I'm unable to comprehend.

IOW, though wrapped by a BufferedInputStream, the underlying streams (such as FileInputStream) still have to read one byte at a time. So where is the benefit of buffering (not for the code, which needs only two read calls to the buffer, but for the application's performance)?

Thanks.

EDIT:

I'm making this a follow-up edit rather than a comment, as I think it suits the context better here and serves as a TL;DR for readers of the chat between @Kayaman and me.

The description of the read method of BufferedInputStream says (excerpt):

As an additional convenience, it attempts to read as many bytes as possible by repeatedly invoking the read method of the underlying stream. This iterated read continues until one of the following conditions becomes true:

The specified number of bytes have been read,
The read method of the underlying stream returns -1, indicating end-of-file, or
The available method of the underlying stream returns zero, indicating that further input requests would block. 
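
The stopping conditions in this excerpt can be observed with a small experiment. The sketch below is only an illustration under stated assumptions: TrickleStream is a hypothetical in-memory source (not part of the JDK) that hands out at most 10 bytes per call, and the flag reportAvailable decides whether its available() method reports the remaining bytes or always returns zero.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class IteratedReadDemo {

    /** Hypothetical source: holds zero-valued bytes and returns at most `chunk` of them per call. */
    static final class TrickleStream extends InputStream {
        private int remaining;
        private final int chunk;
        private final boolean reportAvailable;
        int bulkReads; // number of read(byte[], int, int) calls received

        TrickleStream(int total, int chunk, boolean reportAvailable) {
            this.remaining = total;
            this.chunk = chunk;
            this.reportAvailable = reportAvailable;
        }

        @Override
        public int read() {
            if (remaining == 0) return -1;
            remaining--;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            bulkReads++;
            if (len == 0) return 0;
            if (remaining == 0) return -1;
            int n = Math.min(Math.min(len, chunk), remaining);
            Arrays.fill(b, off, off + n, (byte) 0);
            remaining -= n;
            return n;
        }

        @Override
        public int available() {
            // Returning 0 signals that a further read might block.
            return reportAvailable ? remaining : 0;
        }
    }

    /** Issues one read(byte[100]) call; returns {bytes returned, bulk reads seen by the source}. */
    static int[] readOnce(boolean reportAvailable) {
        TrickleStream source = new TrickleStream(100, 10, reportAvailable);
        try (BufferedInputStream buffered = new BufferedInputStream(source, 50)) {
            int n = buffered.read(new byte[100]);
            return new int[] { n, source.bulkReads };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readOnce(true)));
        System.out.println(Arrays.toString(readOnce(false)));
    }
}
```

When available() reports remaining data, the single read call keeps invoking the source until all 100 requested bytes have arrived, which takes ten underlying calls of 10 bytes each. When available() returns zero, the iteration stops after the first underlying call and only 10 bytes come back.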

I dug into the code and found the following method call trace:

  1. BufferedInputStream -> read(byte b[]), as I want to see buffering in action.
  2. BufferedInputStream -> read(byte b[], int off, int len)
  3. BufferedInputStream -> read1(byte[] b, int off, int len) - private
  4. FileInputStream -> read(byte b[], int off, int len)
  5. FileInputStream -> readBytes(byte b[], int off, int len) - private and native. Method description from the source code:

Reads a subarray as a sequence of bytes.

The call to read1 (#3 above) in BufferedInputStream sits in an infinite for loop, which returns on the conditions mentioned in the read method description excerpted above.

As I mentioned in the OP (#6), the call does seem to be handled by the underlying stream, which matches both the API method description and the method call trace.

The question still remains: does the native call, readBytes of FileInputStream, read one byte at a time and build an array of those bytes to return?

Solution

"The underlying streams (such as FileInputStream) still have to read one byte at a time"

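Luckily no, that would be hugely inefficient. FileInputStream can read into a byte array in a single call, which lets the BufferedInputStream make calls along the lines of read(byte[8192] buffer) to the FileInputStream (8192 bytes being the default buffer size), each of which returns a chunk of data.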
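
One way to check this is to count the calls that actually reach the wrapped stream. The following is a minimal sketch rather than a benchmark: CountingInputStream is a hypothetical helper written for this illustration, and a ByteArrayInputStream stands in for the file so the example stays self-contained. It uses the 100 bytes and buffer size of 50 from the question.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ReadCountDemo {

    /** Hypothetical helper: counts how often each read variant reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        int singleByteReads;
        int bulkReads;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            singleByteReads++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    /** Reads totalBytes one byte at a time through a buffer; returns {singleByteReads, bulkReads}. */
    static int[] readByteByByte(int totalBytes, int bufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]));
        try (BufferedInputStream buffered = new BufferedInputStream(source, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                if (buffered.read() == -1) {
                    throw new IllegalStateException("unexpected end of stream");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { source.singleByteReads, source.bulkReads };
    }

    public static void main(String[] args) {
        int[] counts = readByteByByte(100, 50);
        System.out.println("single-byte reads on source: " + counts[0]);
        System.out.println("bulk reads on source: " + counts[1]);
    }
}
```

With these numbers the application makes 100 read() calls on the BufferedInputStream, but the source sees no single-byte reads at all and only two bulk reads of 50 bytes each.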

If you then want to read a single byte (or not), it is returned efficiently from BufferedInputStream's internal buffer instead of having to go down to the file level. So the BufferedInputStream is there to reduce the number of actual reads from the filesystem, and when those reads do happen, they are done efficiently even if the end user wanted only a few bytes.

It's quite clear from the code that BufferedInputStream.read() does not delegate directly to UnderlyingStream.read(), as that would bypass all the buffering.

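public synchronized int read() throws IOException {
    // Refill only when every buffered byte has been consumed.
    if (pos >= count) {
        fill();  // one bulk read from the underlying stream
        // Still nothing buffered after refilling: end of stream.
        if (pos >= count)
            return -1;
    }
    // Serve the next byte from the buffer as an unsigned value (0..255).
    return getBufIfOpen()[pos++] & 0xff;
}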
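
To make the role of pos, count and fill() easier to follow, here is a stripped-down model of the same idea. It is a sketch under simplifying assumptions, not the JDK implementation: TinyBufferedInput is a hypothetical class with no mark/reset support, no synchronization and no close handling, and the refills counter exists only for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class TinyBufferedInput {
    private final InputStream in;
    private final byte[] buf;
    private int pos;     // index of the next byte to hand out
    private int count;   // number of valid bytes currently in buf
    private int refills; // how many times the underlying stream was asked for data

    public TinyBufferedInput(InputStream in, int size) {
        this.in = in;
        this.buf = new byte[size];
    }

    /** Replaces the buffer contents with one bulk read from the underlying stream. */
    private void fill() throws IOException {
        refills++;
        pos = 0;
        count = 0;
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            count = n;
        }
    }

    /** Returns the next byte as 0..255, or -1 at end of stream. */
    public int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count) {
                return -1;
            }
        }
        return buf[pos++] & 0xff;
    }

    /** Reads data to the end one byte at a time; returns {bytes read, refills}. */
    static int[] drain(byte[] data, int bufferSize) {
        TinyBufferedInput input =
                new TinyBufferedInput(new ByteArrayInputStream(data), bufferSize);
        int bytesRead = 0;
        try {
            while (input.read() != -1) {
                bytesRead++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { bytesRead, input.refills };
    }

    public static void main(String[] args) {
        int[] result = drain(new byte[100], 50);
        System.out.println("bytes read: " + result[0] + ", refills: " + result[1]);
    }
}
```

Draining 100 bytes through a 50-byte buffer takes three refills in this model: two that deliver 50 bytes each and a final one that discovers the end of the stream. Every other read() call is answered from the array without touching the source.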
