Why is StringBuilder#append(int) faster in Java 7 than in Java 8?

Question

While investigating for a little debate w.r.t. using "" + n and Integer.toString(int) to convert an integer primitive to a string I wrote this JMH microbenchmark:

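import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

// @GenerateMicroBenchmark is the annotation name in the JMH version used here;
// later JMH releases call it @Benchmark.
@Fork(1)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class IntStr {
    protected int counter;
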
    @GenerateMicroBenchmark
    public String integerToString() {
        return Integer.toString(this.counter++);
    }

    @GenerateMicroBenchmark
    public String stringBuilder0() {
        return new StringBuilder().append(this.counter++).toString();
    }

    @GenerateMicroBenchmark
    public String stringBuilder1() {
        return new StringBuilder().append("").append(this.counter++).toString();
    }

    @GenerateMicroBenchmark
    public String stringBuilder2() {
        return new StringBuilder().append("").append(Integer.toString(this.counter++)).toString();
    }

    @GenerateMicroBenchmark
    public String stringFormat() {
        return String.format("%d", this.counter++);
    }

    @Setup(Level.Iteration)
    public void prepareIteration() {
        this.counter = 0;
    }
}

I ran it with the default JMH options with both Java VMs that exist on my Linux machine (up-to-date Mageia 4 64-bit, Intel i7-3770 CPU, 32GB RAM). The first JVM was the one supplied with Oracle JDK 8u5 64-bit:

java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)

With this JVM I got pretty much what I expected:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    32317.048      698.703   ops/ms
b.IntStr.stringBuilder0     thrpt        20    28129.499      421.520   ops/ms
b.IntStr.stringBuilder1     thrpt        20    28106.692     1117.958   ops/ms
b.IntStr.stringBuilder2     thrpt        20    20066.939     1052.937   ops/ms
b.IntStr.stringFormat       thrpt        20     2346.452       37.422   ops/ms

I.e. using the StringBuilder class is slower due to the additional overhead of creating the StringBuilder object and appending an empty string. Using String.format(String, ...) is even slower, by an order of magnitude or so.
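For context, the stringBuilder1 case mirrors roughly what "" + n turns into: javac releases up to Java 8 compile string concatenation into a StringBuilder chain (newer compilers may use a different strategy). A minimal sketch, using the hypothetical class name ConcatForms, that only checks that the three spellings produce equal strings and says nothing about their speed:

```java
public class ConcatForms {
    // The form under debate: string concatenation with an empty prefix.
    static String viaConcat(int n) {
        return "" + n;
    }

    // The direct library call.
    static String viaToString(int n) {
        return Integer.toString(n);
    }

    // Hand-written equivalent of the chain javac 7/8 emits for "" + n.
    static String viaBuilder(int n) {
        return new StringBuilder().append("").append(n).toString();
    }

    public static void main(String[] args) {
        int[] samples = {0, 7, -42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int n : samples) {
            String expected = viaToString(n);
            if (!viaConcat(n).equals(expected) || !viaBuilder(n).equals(expected)) {
                throw new AssertionError("mismatch for " + n);
            }
        }
        System.out.println("all three forms agree");
    }
}
```

Since the results are identical, any throughput difference between the variants comes from how the JIT compiles each chain, which is what the rest of this question is about.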

The distribution-provided compiler, on the other hand, is based on OpenJDK 1.7:

java version "1.7.0_55"
OpenJDK Runtime Environment (mageia-2.4.7.1.mga4-x86_64 u55-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

The results here were interesting:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    31249.306      881.125   ops/ms
b.IntStr.stringBuilder0     thrpt        20    39486.857      663.766   ops/ms
b.IntStr.stringBuilder1     thrpt        20    41072.058      484.353   ops/ms
b.IntStr.stringBuilder2     thrpt        20    20513.913      466.130   ops/ms
b.IntStr.stringFormat       thrpt        20     2068.471       44.964   ops/ms

Why does StringBuilder.append(int) appear so much faster with this JVM? Looking at the StringBuilder class source code revealed nothing particularly interesting - the method in question is almost identical to Integer#toString(int). Interestingly enough, appending the result of Integer.toString(int) (the stringBuilder2 microbenchmark) does not appear to be faster.

Is this performance discrepancy an issue with the testing harness? Or does my OpenJDK JVM contain optimizations that would affect this particular code (anti)-pattern?

EDIT:

For a more straightforward comparison, I installed Oracle JDK 1.7u55:

java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

The results are similar to those of OpenJDK:

Benchmark                    Mode   Samples         Mean   Mean error    Units
b.IntStr.integerToString    thrpt        20    32502.493      501.928   ops/ms
b.IntStr.stringBuilder0     thrpt        20    39592.174      428.967   ops/ms
b.IntStr.stringBuilder1     thrpt        20    40978.633      544.236   ops/ms

It seems that this is a more general Java 7 vs Java 8 issue. Perhaps Java 7 had more aggressive string optimizations?

EDIT 2:

For completeness, here are the string-related VM options for both of these JVMs:

For Oracle JDK 8u5:

$ /usr/java/default/bin/java -XX:+PrintFlagsFinal 2>/dev/null | grep String
     bool OptimizeStringConcat                      = true            {C2 product}
     intx PerfMaxStringConstLength                  = 1024            {product}
     bool PrintStringTableStatistics                = false           {product}
    uintx StringTableSize                           = 60013           {product}

For OpenJDK 1.7:

$ java -XX:+PrintFlagsFinal 2>/dev/null | grep String
     bool OptimizeStringConcat                      = true            {C2 product}        
     intx PerfMaxStringConstLength                  = 1024            {product}           
     bool PrintStringTableStatistics                = false           {product}           
    uintx StringTableSize                           = 60013           {product}           
     bool UseStringCache                            = false           {product}   

The UseStringCache option was removed in Java 8 with no replacement, so I doubt that makes any difference. The rest of the options appear to have the same settings.
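The same flags can also be read from inside a running JVM, which is handy when the benchmark process is launched by a harness. A hedged sketch, assuming a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name StringFlags and the "<not available>" placeholder are made up for illustration:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class StringFlags {
    // Returns the current value of a VM flag, or a placeholder when this JVM
    // does not know the flag (for example UseStringCache on Java 8).
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "<not available>";
            }
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // getVMOption rejects unknown flag names with an IllegalArgumentException.
            return "<not available>";
        }
    }

    public static void main(String[] args) {
        String[] names = {"OptimizeStringConcat", "StringTableSize", "UseStringCache"};
        for (String name : names) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

This is not a replacement for -XX:+PrintFlagsFinal, which lists every flag; it only looks up names you already know.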

EDIT 3:

A side-by-side comparison of the source code of the AbstractStringBuilder, StringBuilder and Integer classes from the src.zip file reveals nothing noteworthy. Apart from a whole lot of cosmetic and documentation changes, Integer now has some support for unsigned integers and StringBuilder has been slightly refactored to share more code with StringBuffer. None of these changes seem to affect the code paths used by StringBuilder#append(int), although I may have missed something.

A comparison of the assembly code generated for IntStr#integerToString() and IntStr#stringBuilder0() is far more interesting. The basic layout of the code generated for IntStr#integerToString() was similar for both JVMs, although Oracle JDK 8u5 seemed to be more aggressive w.r.t. inlining some calls within the Integer#toString(int) code. There was a clear correspondence with the Java source code, even for someone with minimal assembly experience.

The assembly code for IntStr#stringBuilder0(), however, was radically different. The code generated by Oracle JDK 8u5 was once again directly related to the Java source code - I could easily recognise the same layout. On the contrary, the code generated by OpenJDK 7 was almost unrecognisable to the untrained eye (like mine). The new StringBuilder() call was seemingly removed, as was the creation of the array in the StringBuilder constructor. Additionally, the disassembler plugin was not able to provide as many references to the source code as it did in JDK 8.

I assume that this is either the result of a much more aggressive optimization pass in OpenJDK 7, or more probably the result of inserting hand-written low-level code for certain StringBuilder operations. I am unsure why this optimization does not happen in my JVM 8 implementation or why the same optimizations were not implemented for Integer#toString(int) in JVM 7. I guess someone familiar with the related parts of the JRE source code would have to answer these questions...

Recommended answer

TL;DR: Side effects in append apparently break StringConcat optimizations.

Very good analysis in the original question and updates!

For completeness, below are a few missing steps:


  • Look through the -XX:+PrintInlining output for both 7u55 and 8u5. In 7u55, you will see something like this:


 @ 16   org.sample.IntStr::inlineSideEffect (25 bytes)   force inline by CompilerOracle
   @ 4   java.lang.StringBuilder::<init> (7 bytes)   inline (hot)
   @ 18   java.lang.StringBuilder::append (8 bytes)   already compiled into a big method
   @ 21   java.lang.StringBuilder::toString (17 bytes)   inline (hot)


...and in 8u5:


 @ 16   org.sample.IntStr::inlineSideEffect (25 bytes)   force inline by CompilerOracle
   @ 4   java.lang.StringBuilder::<init> (7 bytes)   inline (hot)
     @ 3   java.lang.AbstractStringBuilder::<init> (12 bytes)   inline (hot)
       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
   @ 18   java.lang.StringBuilder::append (8 bytes)   inline (hot)
     @ 2   java.lang.AbstractStringBuilder::append (62 bytes)   already compiled into a big method
   @ 21   java.lang.StringBuilder::toString (17 bytes)   inline (hot)
     @ 13   java.lang.String::<init> (62 bytes)   inline (hot)
       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
       @ 55   java.util.Arrays::copyOfRange (63 bytes)   inline (hot)
         @ 54   java.lang.Math::min (11 bytes)   (intrinsic)
         @ 57   java.lang.System::arraycopy (0 bytes)   (intrinsic)


You might notice that the 7u55 version is shallower, and it looks like nothing is called after the StringBuilder methods -- this is a good indication that the string optimizations are in effect. Indeed, if you run 7u55 with -XX:-OptimizeStringConcat, the subcalls will reappear, and performance drops to 8u5 levels.

OK, so we need to figure out why 8u5 does not do the same optimization. Grep http://hg.openjdk.java.net/jdk9/jdk9/hotspot for "StringBuilder" to figure out where VM handles the StringConcat optimization; this will get you into src/share/vm/opto/stringopts.cpp

Run hg log src/share/vm/opto/stringopts.cpp to figure out the latest changes there. One of the candidates would be:


changeset:   5493:90abdd727e64
user:        iveresov
date:        Wed Oct 16 11:13:15 2013 -0700
summary:     8009303: Tiered: incorrect results in VM tests stringconcat...



  • Look for the review thread on the OpenJDK mailing lists (easy to google for the changeset summary): http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-October/012084.html

  • Spot "String concat optimization optimization collapses the pattern [...] into a single allocation of a string and forming the result directly. All possible deopts that may happen in the optimized code restart this pattern from the beginning (starting from the StringBuffer allocation). That means that the whole pattern must me side-effect free." Eureka?

  • Write out the contrasting benchmark:


    @Fork(5)
    @Warmup(iterations = 5)
    @Measurement(iterations = 5)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Benchmark)
    public class IntStr {
        private int counter;
    
        @GenerateMicroBenchmark
        public String inlineSideEffect() {
            return new StringBuilder().append(counter++).toString();
        }
    
        @GenerateMicroBenchmark
        public String spliceSideEffect() {
            int cnt = counter++;
            return new StringBuilder().append(cnt).toString();
        }
    }
    



  • Measure it on JDK 7u55, seeing the same performance for inlined/spliced side effects:


    Benchmark                       Mode   Samples         Mean   Mean error    Units
    o.s.IntStr.inlineSideEffect     avgt        25       65.460        1.747    ns/op
    o.s.IntStr.spliceSideEffect     avgt        25       64.414        1.323    ns/op
    



  • Measure it on JDK 8u5, seeing the performance degradation with the inlined effect:


    Benchmark                       Mode   Samples         Mean   Mean error    Units
    o.s.IntStr.inlineSideEffect     avgt        25       84.953        2.274    ns/op
    o.s.IntStr.spliceSideEffect     avgt        25       65.386        1.194    ns/op
    



  • Submit the bug report (https://bugs.openjdk.java.net/browse/JDK-8043677) to discuss this behavior with the VM guys. The rationale for the original fix is rock solid; it would be interesting, however, to see whether we can/should get this optimization back in trivial cases like these.

  • ???

  • PROFIT.
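The constraint quoted from the review thread, that a deopt restarts the whole pattern from the beginning, can be illustrated with a toy simulation. This is only a sketch: runWithRestart is a hypothetical stand-in that runs the chain twice, whereas real deoptimization happens inside JIT-compiled code and is not visible at the Java level.

```java
import java.util.function.Supplier;

public class RestartSketch {
    int counter;

    // Toy model of "abandon the pattern midway, then restart it from the beginning".
    static String runWithRestart(Supplier<String> pattern) {
        pattern.get();        // first attempt, thrown away as if a deopt had hit
        return pattern.get(); // the pattern is executed again from the start
    }

    // The increment sits inside the restarted pattern, so it runs twice.
    String inlined() {
        return runWithRestart(() -> new StringBuilder().append(counter++).toString());
    }

    // The increment happens once, before the pattern; the pattern itself is side-effect free.
    String spliced() {
        int cnt = counter++;
        return runWithRestart(() -> new StringBuilder().append(cnt).toString());
    }

    public static void main(String[] args) {
        RestartSketch a = new RestartSketch();
        System.out.println("inlined: " + a.inlined() + ", counter = " + a.counter);

        RestartSketch b = new RestartSketch();
        System.out.println("spliced: " + b.spliced() + ", counter = " + b.counter);
    }
}
```

In this model the inlined variant yields "1" and leaves counter at 2, while the spliced variant yields "0" and leaves counter at 1. A compiler that cannot rule out the first outcome has to skip the optimization, which matches what 8u5 does for the inlined chain.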

    And yeah, I should post the results for the benchmark which moves the increment from the StringBuilder chain, doing it before the entire chain. Also, switched to average time, and ns/op. This is JDK 7u55:


    Benchmark                      Mode   Samples         Mean   Mean error    Units
    o.s.IntStr.integerToString     avgt        25      153.805        1.093    ns/op
    o.s.IntStr.stringBuilder0      avgt        25      128.284        6.797    ns/op
    o.s.IntStr.stringBuilder1      avgt        25      131.524        3.116    ns/op
    o.s.IntStr.stringBuilder2      avgt        25      254.384        9.204    ns/op
    o.s.IntStr.stringFormat        avgt        25     2302.501      103.032    ns/op
    


    And this is 8u5:


    Benchmark                      Mode   Samples         Mean   Mean error    Units
    o.s.IntStr.integerToString     avgt        25      153.032        3.295    ns/op
    o.s.IntStr.stringBuilder0      avgt        25      127.796        1.158    ns/op
    o.s.IntStr.stringBuilder1      avgt        25      131.585        1.137    ns/op
    o.s.IntStr.stringBuilder2      avgt        25      250.980        2.773    ns/op
    o.s.IntStr.stringFormat        avgt        25     2123.706       25.105    ns/op
    


    stringFormat is actually a bit faster in 8u5, and all other tests are the same. This solidifies the hypothesis that the side-effect breakage in SB chains is the major culprit in the original question.
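    The modified benchmark itself is not shown above. A sketch of what "moving the increment out of the StringBuilder chain" might look like for the question's methods, with the JMH annotations left out so the class runs on its own; the class name HoistedIntStr is made up for illustration:

```java
public class HoistedIntStr {
    private int counter;

    // Each method reads and increments the counter first,
    // so the StringBuilder chain that follows has no side effects.
    public String stringBuilder0() {
        int n = counter++;
        return new StringBuilder().append(n).toString();
    }

    public String stringBuilder1() {
        int n = counter++;
        return new StringBuilder().append("").append(n).toString();
    }

    public String stringBuilder2() {
        int n = counter++;
        return new StringBuilder().append("").append(Integer.toString(n)).toString();
    }

    public static void main(String[] args) {
        HoistedIntStr bench = new HoistedIntStr();
        System.out.println(bench.stringBuilder0());
        System.out.println(bench.stringBuilder1());
        System.out.println(bench.stringBuilder2());
    }
}
```

    The observable behavior is the same as in the original methods; only the position of the increment relative to the chain changes.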
