How to know the number of bytes read (offset) with BufferedReader?
Question
I want to read a file line by line. BufferedReader is much faster than RandomAccessFile or BufferedInputStream, but the problem is that I don't know how many bytes I have read. How can I find out the number of bytes read (the offset)? This is what I tried:
String buffer;
int offset = 0;
while ((buffer = br.readLine()) != null)
offset += buffer.getBytes().length + 1; // 1 is for line separator
It works if the file is small. But when the file becomes large, the computed offset ends up smaller than the actual value. How can I get the correct offset?
Recommended answer
There is no simple way to do this with BufferedReader because of two effects: character encoding and line endings. On Windows, the line ending is \r\n, which is two bytes. On Unix, the line separator is a single byte. BufferedReader handles both cases without you noticing, so after readLine() you won't know how many bytes were skipped.
Also, buffer.getBytes() only returns the correct result when your platform's default encoding and the encoding of the data in the file happen to be the same. Whenever you convert between byte[] and String, you should always specify exactly which encoding to use.
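To make the encoding point concrete, here is a minimal sketch showing that the same string occupies a different number of bytes depending on the charset passed to getBytes. The class and method names (EncodedLength, utf8Length, latin1Length) are made up for this illustration and are not part of the original question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodedLength {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as ISO-8859-1.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String line = "caf\u00e9"; // "café": the last character is not ASCII
        System.out.println(utf8Length(line));   // 5 (é takes two bytes in UTF-8)
        System.out.println(latin1Length(line)); // 4 (é takes one byte in ISO-8859-1)
    }
}
```

If the file is UTF-8 but the default charset is a single-byte one (or the other way round), every non-ASCII character shifts the computed offset away from the real one.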
You also can't use a counting InputStream, because buffered readers read data in large chunks. After reading a first line of, say, 5 bytes, the counter in the inner InputStream would report something like 4096, because the reader always reads as many bytes as fit into its internal buffer, not just the bytes of the line you asked for.
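The read-ahead effect can be observed with a small experiment. The sketch below is not from the original answer. CountingInputStream is a hand-written wrapper, not a JDK class, and countAfterFirstLine is a hypothetical helper that reports how many bytes the reader has pulled from the stream after a single readLine() call.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class CountingDemo {
    // Counts every byte that is actually pulled from the wrapped stream.
    static final class CountingInputStream extends FilterInputStream {
        long count;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) count++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) count += n;
            return n;
        }
    }

    // Bytes consumed from the underlying stream after reading only the first line.
    static long countAfterFirstLine(byte[] data) {
        CountingInputStream counter = new CountingInputStream(new ByteArrayInputStream(data));
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(counter, StandardCharsets.UTF_8))) {
            br.readLine();
            return counter.count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello\nworld\nagain\n".getBytes(StandardCharsets.UTF_8);
        // The first line plus its separator is 6 bytes, but the reader has
        // already consumed more than that from the stream.
        System.out.println("bytes consumed after one line: " + countAfterFirstLine(data));
    }
}
```

The counter tells you how far the reader has read ahead, not where the next line starts, so it cannot serve as a line offset.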
You can have a look at NIO for this. You can use a low-level ByteBuffer to keep track of the offset, and decode its contents into a CharBuffer to convert the input into lines.
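One possible shape of the NIO idea is sketched below. It is an illustration, not the answer author's implementation. It scans a ByteBuffer for '\n' bytes, remembers the byte index where each line starts, and decodes each line's bytes into a CharBuffer with an explicit charset. It assumes the data is UTF-8 (or another ASCII-compatible encoding in which byte 0x0A only ever means a newline) and that the whole input is available in one buffer. OffsetLineScanner, Line, and scan are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are made up for illustration.
class OffsetLineScanner {
    /** A decoded line together with the byte offset at which it starts. */
    static final class Line {
        final long offset;
        final String text;

        Line(long offset, String text) {
            this.offset = offset;
            this.text = text;
        }
    }

    /** Splits the buffer into lines, recording each line's starting byte index. */
    static List<Line> scan(ByteBuffer data) {
        List<Line> lines = new ArrayList<>();
        int start = data.position();
        int limit = data.limit();
        for (int i = start; i < limit; i++) {
            if (data.get(i) == '\n') {
                lines.add(decode(data, start, i));
                start = i + 1; // next line begins right after the '\n' byte
            }
        }
        if (start < limit) {
            lines.add(decode(data, start, limit)); // last line without a newline
        }
        return lines;
    }

    private static Line decode(ByteBuffer data, int start, int end) {
        // Drop a trailing '\r' so Windows line endings (\r\n) are handled too.
        int textEnd = (end > start && data.get(end - 1) == '\r') ? end - 1 : end;
        ByteBuffer slice = data.duplicate();
        slice.position(start);
        slice.limit(textEnd);
        // Charset.decode returns a CharBuffer; the encoding is stated explicitly.
        String text = StandardCharsets.UTF_8.decode(slice).toString();
        return new Line(start, text);
    }

    public static void main(String[] args) {
        byte[] bytes = "a\r\nb\u00e9\nc".getBytes(StandardCharsets.UTF_8);
        for (Line line : scan(ByteBuffer.wrap(bytes))) {
            System.out.println(line.offset + ": " + line.text);
        }
    }
}
```

For a real file, the buffer could come from FileChannel.map, in which case the buffer index equals the file offset. A mapped buffer is indexed by int, so files larger than 2 GB would have to be processed in several chunks, with a line that straddles a chunk boundary handled separately.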