How to know the number of bytes read (offset) with BufferedReader?


Question

I want to read a file line by line. BufferedReader is much faster than RandomAccessFile or BufferedInputStream, but the problem is that I don't know how many bytes I have read. How can I find out the number of bytes read (the offset)? Here is what I tried:

String buffer;
int offset = 0;

while ((buffer = br.readLine()) != null)
    offset += buffer.getBytes().length + 1; // 1 is for line separator

It works if the file is small. But when the file becomes large, the offset becomes smaller than the actual value. How can I get the correct offset?

Recommended answer

There is no simple way to do this with BufferedReader because of two effects: character encoding and line endings. On Windows, the line ending is \r\n, which is two bytes. On Unix, the line separator is a single byte (\n). BufferedReader handles both cases without you noticing, so after readLine() you don't know how many bytes were skipped. If the file uses \r\n, adding 1 per line undercounts by one byte for every line, which would explain an offset that falls further behind as the file grows.

Also, buffer.getBytes() only returns the correct result when your default encoding and the encoding of the data in the file happen to be the same. Whenever you convert between byte[] and String, you should always specify exactly which encoding to use.
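To illustrate the encoding point, here is a minimal sketch showing that the same string has a different byte length depending on the charset passed to getBytes(). The class and method names (EncodedLength, utf8Length, latin1Length) are made up for this example, and the two charsets are only sample choices; use whichever encoding the file is actually written in.

```java
import java.nio.charset.StandardCharsets;

public class EncodedLength {
    // Byte length of the line if the file is encoded as UTF-8 (assumption).
    static int utf8Length(String line) {
        return line.getBytes(StandardCharsets.UTF_8).length;
    }

    // Byte length of the same line if the file is encoded as ISO-8859-1 (assumption).
    static int latin1Length(String line) {
        return line.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String line = "caf\u00e9"; // "café": the last character is not ASCII
        System.out.println(utf8Length(line));   // prints 5 (é takes two bytes in UTF-8)
        System.out.println(latin1Length(line)); // prints 4 (é takes one byte in ISO-8859-1)
    }
}
```

With the no-argument getBytes(), which of these two numbers you get depends on the platform default, not on the file.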

You also can't use a counting InputStream, because buffered readers read data in large chunks. After reading a first line of, say, 5 bytes, the counter in the inner InputStream would report something like 4096 (or whatever the buffer size is), because the reader fills its whole internal buffer at once rather than reading only the bytes of that line.
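The read-ahead effect can be observed with a small experiment. This is only a sketch: CountingDemo, CountingInputStream and bytesPulledAfterFirstLine are hypothetical names written for this example (not JDK classes), and the exact count depends on the JDK's internal buffer sizes, so the code makes no claim about a specific number.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CountingDemo {
    // Hypothetical helper: counts every byte handed out by the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        long count;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) count++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) count += n;
            return n;
        }
    }

    // Reads exactly one line through a BufferedReader and reports how many
    // bytes were pulled from the underlying stream to do so.
    static long bytesPulledAfterFirstLine(byte[] data) {
        try {
            CountingInputStream counting =
                    new CountingInputStream(new ByteArrayInputStream(data));
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(counting, StandardCharsets.US_ASCII));
            br.readLine();
            return counting.count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello\nworld\nagain\n".getBytes(StandardCharsets.US_ASCII);
        // The first line occupies 6 bytes ("hello" plus '\n'), but the counter
        // reports more because the reader has already buffered later lines.
        System.out.println(bytesPulledAfterFirstLine(data));
    }
}
```

The counter tells you how far the reader has read ahead, not where the line you just received ends.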

You can have a look at NIO for this. You can use a low-level ByteBuffer to keep track of the offset and wrap it in a CharBuffer to convert the input into lines.
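One possible way to follow this suggestion is sketched below. It is a simplified variant, not necessarily what the answer had in mind: instead of decoding the whole buffer into a CharBuffer first, it scans the ByteBuffer for '\n' bytes, remembers where each line starts, and decodes each slice separately. It assumes the data is UTF-8 (or another encoding in which the byte 0x0A only ever means a newline) and that the input fits in a single ByteBuffer. OffsetLines, Line and readLines are hypothetical names for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OffsetLines {
    // Hypothetical holder: a decoded line plus the byte offset where it starts.
    static final class Line {
        final long offset;
        final String text;

        Line(long offset, String text) {
            this.offset = offset;
            this.text = text;
        }
    }

    // Splits the buffer on '\n' bytes and records the start offset of each line.
    static List<Line> readLines(ByteBuffer bytes) {
        List<Line> lines = new ArrayList<>();
        int start = bytes.position();
        int limit = bytes.limit();
        for (int i = start; i < limit; i++) {
            if (bytes.get(i) == '\n') {
                lines.add(decode(bytes, start, i));
                start = i + 1; // next line begins right after the '\n'
            }
        }
        if (start < limit) {
            lines.add(decode(bytes, start, limit)); // last line without a trailing newline
        }
        return lines;
    }

    // Decodes bytes[start, end) as UTF-8 (assumption), dropping a trailing '\r'.
    private static Line decode(ByteBuffer bytes, int start, int end) {
        int textEnd = end;
        if (textEnd > start && bytes.get(textEnd - 1) == '\r') {
            textEnd--;
        }
        ByteBuffer slice = bytes.duplicate();
        slice.limit(textEnd);
        slice.position(start);
        String text = StandardCharsets.UTF_8.decode(slice).toString();
        return new Line(start, text);
    }

    public static void main(String[] args) {
        byte[] data = "ab\r\ncaf\u00e9\nx".getBytes(StandardCharsets.UTF_8);
        for (Line line : readLines(ByteBuffer.wrap(data))) {
            System.out.println(line.offset + ": " + line.text);
        }
        // prints:
        // 0: ab
        // 4: café
        // 10: x
    }
}
```

For a real file, the ByteBuffer could come from ByteBuffer.wrap(Files.readAllBytes(path)) or from FileChannel.map for larger inputs; a file too big for one buffer would need chunked reading with carry-over of partial lines, which this sketch does not handle.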
