How do I read a file without any buffering in Java?

Question

I'm working through the problems in Programming Pearls, 2nd edition, Column 1. One of the problems involves writing a program that uses only around 1 megabyte of memory to store the contents of a file as a bit array with each bit representing whether or not a 7 digit number is present in the file. Since Java is the language I'm the most familiar with, I've decided to use it even though the author seems to have had C and C++ in mind.

Since I'm pretending memory is limited for the purpose of the problem I'm working on, I'd like to make sure the process of reading the file has no buffering at all.

I thought InputStreamReader would be a good solution, until I read this in the Java documentation:

To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.

Ideally, only the bytes that are necessary would be read from the stream -- in other words, I don't want any buffering.

Recommended answer

One of the problems involves writing a program that uses only around 1 megabyte of memory to store the contents of a file as a bit array with each bit representing whether or not a 7 digit number is present in the file.

This implies that you need to read the file as bytes (not characters).

Assuming that you do have a genuine requirement to read from a file without buffering, then you should use the FileInputStream class. It does no buffering. It reads (or attempts to read) precisely the number of bytes that you asked for.
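As a rough illustration of the approach described above, the following sketch reads one byte at a time and marks each number in a bit set. It assumes the file holds ASCII decimal numbers of at most 7 digits separated by non-digit bytes; the class name, method name and `LIMIT` constant are made up for this example. A `BitSet` sized for ten million bits takes roughly 1.25 MB, which is in the spirit of the exercise rather than an exact fit.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.BitSet;

class UnbufferedBitmapSketch {
    // Hypothetical bound: 7-digit numbers lie in 0..9,999,999.
    static final int LIMIT = 10_000_000;

    /**
     * Reads ASCII decimal numbers separated by non-digit bytes and sets one
     * bit per number. Assumes well-formed input (no number above 7 digits).
     */
    static BitSet markNumbers(InputStream in) throws IOException {
        BitSet seen = new BitSet(LIMIT);
        int value = 0;
        boolean inNumber = false;
        int b;
        // read() with no arguments requests exactly one byte from the stream.
        while ((b = in.read()) != -1) {
            if (b >= '0' && b <= '9') {
                value = value * 10 + (b - '0');
                inNumber = true;
            } else if (inNumber) {
                seen.set(value);
                value = 0;
                inNumber = false;
            }
        }
        if (inNumber) {
            seen.set(value); // last number had no trailing separator
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // FileInputStream is passed directly, with no BufferedInputStream wrapper.
        try (InputStream in = new FileInputStream(args[0])) {
            BitSet seen = markNumbers(in);
            System.out.println(seen.cardinality() + " distinct numbers");
        }
    }
}
```

Reading a single byte per call keeps the Java side free of buffers, but each call goes to the operating system, so this is likely to be much slower than a buffered read. The operating system may also cache file data outside the JVM's control.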

If you then need to convert those bytes to characters, you could do this by applying the appropriate String constructor to a byte[]. Note that for multibyte character encodings such as UTF-8, you would need to read sufficient bytes to complete each character. Doing that without the possibility of read-ahead is a bit tricky ... and entails knowledge of the character encoding you are reading.
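To make the "knowledge of the encoding" point concrete, here is a minimal sketch for UTF-8 only. It inspects the lead byte to learn how long the sequence is, reads exactly that many bytes, and hands them to the `String(byte[], Charset)` constructor. The class and method names are hypothetical, and the sketch does not validate continuation bytes itself; malformed sequences are left to the constructor, which substitutes a replacement character.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class Utf8NoReadAheadSketch {
    /**
     * Returns the next UTF-8 encoded character as a String, or null at end
     * of stream. Never requests more bytes than the current character needs.
     */
    static String readChar(InputStream in) throws IOException {
        int lead = in.read();
        if (lead == -1) {
            return null;
        }
        // The high bits of the lead byte give the sequence length.
        int len;
        if ((lead & 0x80) == 0x00) {
            len = 1;
        } else if ((lead & 0xE0) == 0xC0) {
            len = 2;
        } else if ((lead & 0xF0) == 0xE0) {
            len = 3;
        } else if ((lead & 0xF8) == 0xF0) {
            len = 4;
        } else {
            throw new IOException("Invalid UTF-8 lead byte: " + lead);
        }
        byte[] buf = new byte[len];
        buf[0] = (byte) lead;
        int off = 1;
        // read(byte[], int, int) may return fewer bytes than asked for, so loop.
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n == -1) {
                throw new EOFException("Truncated UTF-8 sequence");
            }
            off += n;
        }
        return new String(buf, StandardCharsets.UTF_8);
    }
}
```

A character outside the Basic Multilingual Plane comes back as a String of length 2 (a surrogate pair), which is why the method returns a String rather than a char.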

(You could avoid that knowledge by using a CharsetDecoder directly. But then you'd need to use the decode method that operates on Buffer objects, and that is a bit complicated too.)
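One possible shape for the CharsetDecoder route is sketched below: feed the decoder one byte at a time and stop as soon as it emits a character, so the decoder, not the calling code, decides when a character is complete. The class and method names are hypothetical, and the 8-byte staging buffer is an assumption that holds for UTF-8 but may be too small for other encodings. The caller is assumed to pass in a fresh decoder and reuse it for the whole stream.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

class DecoderNoReadAheadSketch {
    /**
     * Returns the next decoded character as a String, or null at end of
     * stream, pulling bytes from the stream strictly one at a time.
     */
    static String readChar(InputStream in, CharsetDecoder decoder) throws IOException {
        ByteBuffer bytes = ByteBuffer.allocate(8); // assumed large enough for one character
        CharBuffer chars = CharBuffer.allocate(2); // room for a surrogate pair
        int b;
        while ((b = in.read()) != -1) {
            bytes.put((byte) b);
            bytes.flip();
            // endOfInput=false: an incomplete sequence is left in 'bytes'.
            CoderResult result = decoder.decode(bytes, chars, false);
            if (result.isError()) {
                result.throwException(); // throws a CharacterCodingException
            }
            bytes.compact(); // keep any undecoded bytes for the next attempt
            if (chars.position() > 0) {
                chars.flip();
                return chars.toString();
            }
        }
        if (bytes.position() > 0) {
            throw new EOFException("Stream ended in the middle of a character");
        }
        return null;
    }
}
```

A call might look like `readChar(in, StandardCharsets.UTF_8.newDecoder())`, with the same decoder object passed on every subsequent call for that stream.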

For what it is worth, Java makes a clear distinction between stream-of-byte and stream-of-character I/O. The former is supported by InputStream and OutputStream, and the latter by Reader and Writer. The InputStreamReader class is a Reader that adapts an InputStream. You should not be considering using it for an application that wants to read stuff byte-wise.
