Managing memory allocated while loading ZipFile

Question

I am attempting to load 69,930 files into a basic text editor. This goes smoothly and after they are all loaded the memory sits at a very cool 130MB. However, during the peak loading time this can hit a maximum of 900MB - 1200MB.

The memory is all referencing the Inflater#buf field. This is used only to load the file into the object model, then it is never used again and the bytes can be cleared.

Obviously, the extra memory is all cleared by the garbage collector soon after loading - so no memory leaks. However, it just seems unnecessary to use so much extra memory.

What I have tried:

  1. The memory issue is 'resolved' by making a System.gc() call immediately after closing the ZipFile. This results in ~75% monitor time on the threads, high CPU usage and slow load times.
  2. Reducing the thread pool size. This reduced the impact (to 300MB) yet resulted in significantly longer load times.
  3. WeakReference

What I have so far:

I run the load on a thread pool of 4 threads, each one performing this relatively simple task:

// Source source = ...;
// try-with-resources closes the stream even if reading throws
try (InputStream input = source.open()) {
    // read into object model
}
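For readers who want to reproduce the setup, the following is a minimal, self-contained sketch of the loading pattern described above: a fixed pool of 4 threads, where each task opens the archive, reads one entry, and closes it again. It is not the asker's actual code. The class name `PooledZipLoad`, the helpers `createSampleZip`, `readEntry` and `loadAll`, and the use of a `String` as the "object model" are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class PooledZipLoad {

    // Hypothetical helper: builds a small throwaway archive to read from.
    static Path createSampleZip(int entries) throws IOException {
        Path zip = Files.createTempFile("sample", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (int i = 0; i < entries; i++) {
                out.putNextEntry(new ZipEntry("file" + i + ".txt"));
                out.write(("content " + i).getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return zip;
    }

    // Stand-in for "read into object model": the model here is just a String.
    static String readEntry(Path zip, String name) throws IOException {
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            ZipEntry entry = zipFile.getEntry(name);
            if (entry == null) {
                throw new IOException("No such entry: " + name);
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            }
        }
    }

    // Submits one task per entry to a fixed-size pool and collects the results in order.
    static List<String> loadAll(Path zip, int entries, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < entries; i++) {
                final String name = "file" + i + ".txt";
                tasks.add(() -> readEntry(zip, name));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path zip = createSampleZip(8);
        try {
            System.out.println(loadAll(zip, 8, 4));
        } finally {
            Files.deleteIfExists(zip);
        }
    }
}
```

Each task allocates its own decompression buffers while it runs, so the number of buffers alive at once grows with the pool size until the garbage collector reclaims them.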

The Source in this case is a ZipFileSource which does all the reading:

import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipFileSource implements Source {

    private final String file;
    private final String name;

    private volatile ZipFile zip;

    public ZipFileSource(final String file, final String name) {
        this.file = file;
        this.name = name;
    }

    @Override
    public InputStream open() throws IOException {
        close();

        final ZipFile zipFile = new ZipFile(file);
        final ZipEntry entry = zipFile.getEntry(name);

        final InputStream stream = new ZipFileSourceZipInputStream(zipFile.getInputStream(entry));

        this.zip = zipFile;

        return stream;
    }

    @Override
    public void close() throws IOException {
        if (zip != null) {
            zip.close();
            zip = null;
        }
    }

    private class ZipFileSourceZipInputStream extends InputStream {

        private final InputStream stream;

        ZipFileSourceZipInputStream(final InputStream stream) {
            this.stream = stream;
        }

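        @Override
        public int read() throws IOException {
            return stream.read();
        }

        // Delegate bulk reads too; the inherited default calls read() once per byte.
        @Override
        public int read(final byte[] b, final int off, final int len) throws IOException {
            return stream.read(b, off, len);
        }

        @Override
        public void close() throws IOException {
            // Close the entry stream first, then the ZipFile that owns it.
            try {
                stream.close();
            } finally {
                ZipFileSource.this.close();
            }
        }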
    }
}

I'm running a bit short on ideas. I've come down to either using a native zip extractor, locking every n requests to do a System.gc() call, or just giving up and letting it do its thing.

Recommended answer

A) If your application keeps running, it will GC eventually and collect those objects when it needs the memory.

B) If your application is done at that point... well... just let the VM die and it will release the memory back to the OS.

Either way, there is no real memory "waste".

The point of garbage collectors is to amortize the cost of collection over time. A collector can only do that by deferring the work to some point in the future instead of trying to free() everything immediately, as manually managed languages would.

Also note that your chart only shows the used heap (blue) going down. From the OS perspective the allocated heap (orange) stays the same anyway, so that downwards slope on the blue chart doesn't gain you anything.
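The distinction between used and allocated heap can be checked from inside the JVM. The sketch below reads both figures through the standard `java.lang.management` API. It assumes that the chart's blue line corresponds to `MemoryUsage.getUsed()` and the orange line to `MemoryUsage.getCommitted()`; the class name `HeapReport` and its helper methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapReport {

    // Snapshot of the current heap figures as reported by the JVM.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // "used" is the space occupied by objects, live or not yet collected;
    // "committed" is what the JVM has currently reserved for the heap.
    static String format(MemoryUsage usage) {
        long mb = 1024L * 1024L;
        return "used=" + (usage.getUsed() / mb) + "MB committed="
                + (usage.getCommitted() / mb) + "MB";
    }

    public static void main(String[] args) {
        System.out.println("start:     " + format(heap()));

        // Allocate some short-lived buffers, similar in spirit to decompression buffers.
        byte[][] buffers = new byte[64][];
        for (int i = 0; i < buffers.length; i++) {
            buffers[i] = new byte[1024 * 1024];
        }
        System.out.println("allocated: " + format(heap()));

        // Dropping the reference only makes the buffers eligible for collection;
        // when the used figure falls is up to the collector.
        buffers = null;
        System.out.println("dropped:   " + format(heap()));
    }
}
```

The exact numbers depend on the JVM, the collector and the heap settings, so no particular output is implied here.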
