Behind the scenes of Java's BufferedInputStream
Question
To start with, I understand the concept of buffering as a wrapper around, for instance, a FileInputStream, acting as a temporary container for contents read (let's take the read scenario) from the underlying stream, in this case the FileInputStream.
1. Say, there are 100 bytes to read from a stream (a file as the source).
2. Without buffering, the code (the read method of BufferedInputStream) has to make 100 reads (one byte at a time).
3. With buffering, depending on the buffer size, the code makes <= 100 reads.
4. Let's assume a buffer size of 50.
5. So, the code reads the buffer (as a source) only twice to read the contents of the file.
6. Now, as the FileInputStream is the ultimate source (though wrapped by BufferedInputStream) of the data (the file which contains 100 bytes), wouldn't it have to read 100 times to read 100 bytes? Though the code calls the read method of BufferedInputStream, the call is passed on to the read method of FileInputStream, which needs to make 100 read calls. This is the point I'm unable to comprehend.

IOW, though wrapped by a BufferedInputStream, the underlying stream (such as a FileInputStream) still has to read one byte at a time. So where is the benefit of buffering (not for the code, which requires only two read calls to the buffer, but for the application's performance)?
Thanks.
EDIT:
I'm making this a follow-up 'edit' rather than a 'comment' as I think it suits the context better here, and it serves as a TL;DR of the chat between @Kayaman and me.
The read method of BufferedInputStream says (excerpt):
As an additional convenience, it attempts to read as many bytes as possible by repeatedly invoking the read method of the underlying stream. This iterated read continues until one of the following conditions becomes true:
- The specified number of bytes have been read,
- The read method of the underlying stream returns -1, indicating end-of-file, or
- The available method of the underlying stream returns zero, indicating that further input requests would block.
I dug into the code and found the method call trace to be as follows:
- BufferedInputStream->read(byte b[]), as I want to see buffering in action.
- BufferedInputStream->read(byte b[], int off, int len)
- BufferedInputStream->read1(byte[] b, int off, int len) - private
- FileInputStream->read(byte b[], int off, int len)
- FileInputStream->readBytes(byte b[], int off, int len) - private and native. Method description from the source code: "Reads a subarray as a sequence of bytes."
The call to read1 (#4, above mentioned) in BufferedInputStream is made inside an infinite for loop. It returns on the conditions mentioned in the above excerpt of the read method description.
As I mentioned in the OP (#6), the call does seem to be handled by the underlying stream, which matches the API method description and the method call trace.
The question still remains: does the native call, readBytes of FileInputStream, read one byte at a time and build an array of those bytes to return?
"The underlying streams (such as FileInputStream) still have to read one byte at a time"
Luckily no, that would be hugely inefficient. Buffering allows the BufferedInputStream to make read(buffer) calls with a large array (8192 bytes by default) to the FileInputStream, which will return a whole chunk of data.
If you then want to read a single byte (or not), it will efficiently be returned from BufferedInputStream's internal buffer instead of having to go down to the file level. So the BufferedInputStream is there to reduce the number of times we do actual reads from the filesystem, and when those are done, they're done in an efficient fashion even if the end user wanted to read just a few bytes.
It's quite clear from the code that BufferedInputStream.read() does not delegate directly to UnderlyingStream.read(), as that would bypass all the buffering.
public synchronized int read() throws IOException {
if (pos >= count) {
fill();
if (pos >= count)
return -1;
}
return getBufIfOpen()[pos++] & 0xff;
}
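The fill() behaviour in the code above can also be observed from the outside: only the first read() after the buffer empties touches the underlying stream. A sketch (TracingSource is a hypothetical counting helper, not a JDK class):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Hypothetical helper: counts the bulk reads that fill() issues on the source.
class TracingSource extends ByteArrayInputStream {
    int fills = 0;
    TracingSource(byte[] buf) { super(buf); }
    @Override public synchronized int read(byte[] b, int off, int len) {
        fills++;
        return super.read(b, off, len);
    }
}

public class FillDemo {
    public static void main(String[] args) throws IOException {
        TracingSource src = new TracingSource(new byte[100]);
        BufferedInputStream in = new BufferedInputStream(src, 50);

        in.read();                      // pos >= count, so fill() does 1 bulk read
        System.out.println(src.fills);  // 1

        for (int i = 0; i < 49; i++) {
            in.read();                  // answered from buf[pos++], no I/O at all
        }
        System.out.println(src.fills);  // still 1

        in.read();                      // buffer exhausted: fill() runs again
        System.out.println(src.fills);  // 2
    }
}
```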