Reading a huge single-line string from a text file
Problem description
I have a large text file that contains no line breaks. It holds just one long string (a single huge line of ASCII characters). So far everything works, because I can read the whole line into memory in Java. But I am wondering whether I will run into a memory problem once the file grows to 5 GB or more and the program can no longer read the whole file into memory at once. In that case, what is the best way to read such a file? Can the huge line be broken into two parts, or into multiple chunks?
This is how I read the file:
BufferedReader buf = new BufferedReader(new FileReader("input.txt"));
String line;
while((line = buf.readLine()) != null){
}
Recommended answer
A single String can hold at most about 2 billion characters (its length is an int) and uses 2 bytes per character, so if you could read a 5 GB line it would use 10 GB of memory. (JVMs with compact strings, Java 9 and later, store ASCII-only strings at 1 byte per character, but the length limit still applies.)
I suggest you read the text in blocks.
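// try-with-resources closes the reader even if processing throws
try (Reader reader = new FileReader("input.txt")) {
    char[] chars = new char[8192];
    // read() returns the number of chars read, or -1 at end of file
    for (int len; (len = reader.read(chars)) > 0;) {
        // process chars[0] .. chars[len - 1]
    }
}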
This will use about 16 KB for the buffer (8192 chars at 2 bytes each), regardless of the size of the file.
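To make the "process chars" step concrete, here is a minimal sketch that counts how often one character occurs while reading in blocks. The class name ChunkedCount, the helper countChar, and the choice of counting 'a' are illustrative assumptions, not part of the original answer. The point it shows is that only the first len entries of the buffer are valid after each read.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ChunkedCount {

    // Hypothetical helper: counts occurrences of `target`, reading at most
    // `bufferSize` chars at a time. Only one buffer is allocated, so memory
    // use does not grow with the size of the input.
    public static long countChar(Reader reader, char target, int bufferSize) {
        char[] chars = new char[bufferSize];
        long count = 0;
        try {
            int len;
            while ((len = reader.read(chars)) != -1) {
                // Only chars[0 .. len-1] hold data from this read; the rest
                // of the buffer may contain leftovers from an earlier block.
                for (int i = 0; i < len; i++) {
                    if (chars[i] == target) {
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Real use: pass the path of the large file as the first argument.
            try (Reader reader = new FileReader(args[0])) {
                System.out.println(countChar(reader, 'a', 8192));
            }
        } else {
            // Demo with a tiny buffer so the input spans several blocks.
            System.out.println(countChar(new StringReader("banana"), 'a', 4)); // prints 3
        }
    }
}
```

If the processing needs to look at sequences of characters rather than single ones, a match can straddle two blocks, so the end of one block would have to be carried over into the next; the sketch above avoids that by working one character at a time.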