Reading text file to string without huge memory consumption


Question

I've tried to measure the performance of several approaches to reading a file into a string: NIO (slowest for reading a single file), BufferedInputStream with line-by-line reading (600 ms average per pass), and a FileReader with a fixed-size array acting as a buffer (fastest).

The file was 95 MB of plain text in Windows .txt format. Converting chars to a string really is the bottleneck, but what I noticed is the HUGE memory consumption of this method. For 95 MB of lorem ipsum, it consumes up to 1 GB of RAM. I haven't found out why.

What I have tried, with no effect:

Invoking the garbage collector by calling System.gc()
Setting all reference variables to null before the method ends (they should go out of scope anyway, since they are defined only within the method).

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

private void testCharStream() {
    File f = new File("c:/Downloads/test.txt");
    long oldTime = System.currentTimeMillis();
    char[] cbuf = new char[8192]; // fixed-size read buffer
    StringBuilder builder = new StringBuilder();

    // try-with-resources closes the reader even if read() throws
    try (FileReader reader = new FileReader(f)) {
        int n;
        while ((n = reader.read(cbuf)) != -1) {
            // append only the n chars actually read, not the whole buffer
            builder.append(cbuf, 0, n);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    long currentTime = System.currentTimeMillis();

    // elapsed time in milliseconds
    System.out.println(currentTime - oldTime);
}

Solution

Try Apache Commons IO (http://commons.apache.org/proper/commons-io/). I haven't benchmarked it, but I think its code is optimised.
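
If adding a dependency is not an option, the FileReader loop from the question can probably be tightened using only the standard library. The sketch below is an assumption about what may help, not something measured against the 95 MB file: it pre-sizes the StringBuilder from the file length so the backing array does not have to be reallocated and copied repeatedly as it grows, and it appends only the characters actually read. The class name PresizedRead and the helpers readAll and readFile are hypothetical names chosen for illustration. Note that a Java char is 16 bits, so on JVMs that store strings as UTF-16 the text alone can need roughly twice the file size, and toString() makes one more copy.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class PresizedRead {

    // Hypothetical helper: drains a Reader into a String and closes it.
    // expectedChars is only a capacity hint; the builder still grows if it is too small.
    static String readAll(Reader reader, int expectedChars) {
        StringBuilder builder = new StringBuilder(Math.max(expectedChars, 16));
        char[] cbuf = new char[8192];
        try (Reader in = reader) {
            int n;
            while ((n = in.read(cbuf)) != -1) {
                // append only the n chars that were actually read
                builder.append(cbuf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return builder.toString();
    }

    // Hypothetical helper: uses the file length in bytes as the capacity hint.
    // Assumption: the text is in a single-byte encoding or UTF-8, where the
    // number of chars never exceeds the number of bytes.
    static String readFile(File f) {
        int hint = (int) Math.min(f.length(), Integer.MAX_VALUE - 8);
        try {
            return readAll(new FileReader(f), hint);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling PresizedRead.readFile(new File("c:/Downloads/test.txt")) would stand in for the body of testCharStream(). Whether this lowers the peak noticeably depends on the JVM version and heap settings, so it is worth measuring before relying on it.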
