Optimizing Java heap usage by String using StringBuffer, StringBuilder, String.intern()

Question

I am monitoring the performance and CPU usage of a large Java application using VisualVM. When I look at its memory profile, I see that the largest share of the heap (about 50%) is used by char arrays.

Following is a screenshot of the memory profile:

[Screenshot: VisualVM memory profile]

At any given time, the memory profile shows roughly 9000 char[] objects.

The application accepts a large file as input. The file has about 80 lines, each consisting of 15-20 delimited config options. The application parses the file and stores these lines in an ArrayList of Strings. It then parses these strings to get the individual config options for each server.

The application also frequently logs each event to the console.

Java's implementation of String internally uses a char[], along with a reference to that array and 3 integers.

From various posts on the internet, it seems that StringBuffer, StringBuilder and String.intern() are more memory-efficient alternatives.

How do they compare to java.lang.String? Has anybody benchmarked them? If the application uses multithreading (which it does), are they a safe alternative?

Recommended answer

What I do is have one or more String pools. I do this to a) avoid creating a new String when the pool already holds an equal one and b) reduce the retained memory size, sometimes by a factor of 3-5. You can write a simple string interner yourself, but I suggest you first consider how the data is read in to determine the optimal solution. This matters because an inefficient interner can easily make matters worse.
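As a rough illustration of the pooling idea, here is a minimal sketch of a map-backed pool. The class and method names (SimpleStringPool, canonical) are hypothetical and not from the answer. It assumes single-threaded use and an unbounded pool, so it is only a starting point, not the interner recommended further down.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal pool: hands out one canonical instance per distinct value. */
class SimpleStringPool {
    // Not thread-safe and never evicts; both are simplifying assumptions of this sketch.
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the pooled instance equal to s, adding s itself if none exists yet. */
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        SimpleStringPool pool = new SimpleStringPool();
        // new String(...) forces two distinct objects with equal contents
        String a = pool.canonical(new String("timeout=30"));
        String b = pool.canonical(new String("timeout=30"));
        System.out.println(a == b);      // true: both references point at the pooled instance
        System.out.println(pool.size()); // 1
    }
}
```

Note that the caller still has to build the String before looking it up, so this saves retained memory but not the temporary allocation.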

As EJP points out, processing a line at a time is more efficient, as is parsing each line as you read it. For example, an int or double takes up far less space than the same value held as a String (unless you have a very high rate of duplication).
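A sketch of parsing each line as it is read might look like the following. The comma delimiter and the three fields (host, port, timeoutSeconds) are assumptions made for illustration; the question does not say what the real file contains.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineParsingSketch {
    /** Hypothetical holder for one config line; the field names are illustrative only. */
    static final class ServerConfig {
        final String host;
        final int port;              // stored as a primitive, not as a String
        final double timeoutSeconds; // likewise

        ServerConfig(String host, int port, double timeoutSeconds) {
            this.host = host;
            this.port = port;
            this.timeoutSeconds = timeoutSeconds;
        }
    }

    /** Reads one line at a time and converts it immediately, so raw lines are not retained. */
    static List<ServerConfig> parse(BufferedReader in) {
        List<ServerConfig> configs = new ArrayList<>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) continue;
                String[] f = line.split(","); // assumed delimiter
                configs.add(new ServerConfig(f[0].trim(),
                        Integer.parseInt(f[1].trim()),
                        Double.parseDouble(f[2].trim())));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return configs;
    }

    public static void main(String[] args) {
        String input = "alpha,8080,2.5\nbeta,9090,10\n";
        for (ServerConfig c : parse(new BufferedReader(new StringReader(input)))) {
            System.out.println(c.host + " " + c.port + " " + c.timeoutSeconds);
        }
    }
}
```

Only the parsed values survive each loop iteration; the line and the split array become garbage right away instead of sitting in an ArrayList.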

Here is an example of a StringInterner which takes a StringBuilder to avoid creating objects needlessly. You first populate a recycled StringBuilder with the text. If a String matching that text is in the interner, that String is returned; otherwise the toString() of the StringBuilder is stored and returned. The benefit is that you only create an object when you see a new String (or at least one that is not currently in the array). This can get an 80% to 99% hit rate and dramatically reduce memory consumption (and garbage) when loading many strings of data.

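// @NotNull/@Nullable are assumed to be the JetBrains annotations (org.jetbrains:annotations).
// They are documentation only; remove them if that library is not on the classpath.
import org.jetbrains.annotations.NotNull;
import org.jetbrains.annotations.Nullable;

public class StringInterner {
    // Fixed-size table: each slot holds the most recent String that hashed to it.
    @NotNull
    private final String[] interner;
    private final int mask;

    public StringInterner(int capacity) {
        // Power-of-two size, so (hash & mask) selects a slot.
        int n = nextPower2(capacity, 128);
        interner = new String[n];
        mask = n - 1;
    }

    // Returns a String with the same contents as cs, reusing the cached one on a hit.
    @NotNull
    public String intern(@NotNull CharSequence cs) {
        long hash = 0;
        for (int i = 0; i < cs.length(); i++)
            hash = 57 * hash + cs.charAt(i);
        int h = hash(hash) & mask;
        String s = interner[h];
        if (isEqual(s, cs))
            return s;
        // Miss: create the String once and overwrite whatever was in the slot.
        String s2 = cs.toString();
        return interner[h] = s2;
    }

    // Character-by-character comparison that works for any CharSequence.
    static boolean isEqual(@Nullable CharSequence s, @NotNull CharSequence cs) {
        if (s == null) return false;
        if (s.length() != cs.length()) return false;
        for (int i = 0; i < cs.length(); i++)
            if (s.charAt(i) != cs.charAt(i))
                return false;
        return true;
    }

    // Smallest power of two >= n, at least min (min must be a power of two), capped at 2^30.
    static int nextPower2(int n, int min) {
        if (n < min) return min;
        if ((n & (n - 1)) == 0) return n;
        int i = min;
        while (i < n) {
            i *= 2;
            if (i <= 0) return 1 << 30;
        }
        return i;
    }

    // Folds the high bits of the 64-bit hash into the low bits before masking.
    static int hash(long n) {
        n ^= (n >> 43) ^ (n >> 21);
        n ^= (n >> 15) ^ (n >> 7);
        return (int) n;
    }
}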
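To show how the recycled StringBuilder fits in, here is a minimal usage sketch. MiniInterner is a hypothetical, condensed stand-in for the StringInterner above so that the example runs on its own; it uses a simpler hash and assumes the capacity passed in is already a power of two.

```java
class InternerUsageSketch {
    /** Condensed stand-in for the StringInterner above: fixed slots, overwritten on a miss. */
    static final class MiniInterner {
        private final String[] slots;
        private final int mask;

        MiniInterner(int powerOfTwoCapacity) {
            slots = new String[powerOfTwoCapacity];
            mask = powerOfTwoCapacity - 1;
        }

        String intern(CharSequence cs) {
            int h = 0;
            for (int i = 0; i < cs.length(); i++)
                h = 31 * h + cs.charAt(i);
            int idx = (h ^ (h >>> 16)) & mask;
            String s = slots[idx];
            if (s != null && s.contentEquals(cs))
                return s;                      // hit: nothing is allocated
            return slots[idx] = cs.toString(); // miss: allocate once, replace the slot
        }
    }

    public static void main(String[] args) {
        MiniInterner interner = new MiniInterner(128);
        StringBuilder sb = new StringBuilder(); // one builder, reused for every field
        String[] rawFields = {"enabled", "enabled", "disabled"};
        String[] stored = new String[rawFields.length];
        for (int i = 0; i < rawFields.length; i++) {
            sb.setLength(0);         // reset instead of allocating a new builder
            sb.append(rawFields[i]); // stands in for copying chars out of a read buffer
            stored[i] = interner.intern(sb);
        }
        System.out.println(stored[0] == stored[1]); // true: the second lookup was a hit
    }
}
```

Because a miss overwrites the slot, two values that land in the same slot evict each other. That keeps memory bounded at the cost of an occasional duplicate String.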

This class is interesting in that it is not thread-safe in the traditional sense, but it will still work correctly when used concurrently. In fact, it might work more efficiently when multiple threads have different views of the contents of the array. The worst a race can do is cause a missed hit and an extra String; since Strings are immutable, whatever is returned always has the right contents.
