Java - calling static methods vs manual inlining - performance overhead


Question

I would like to know whether I should manually inline small methods that are called 100k to 1 million times in a performance-sensitive algorithm.

At first I thought that by not inlining I was incurring some overhead, since the JVM has to determine whether or not to inline the method (and may even fail to do so).

However, the other day I replaced the manually inlined code with calls to static methods and saw a performance boost. How is that possible? Does this suggest that there is actually no overhead, and that letting the JVM inline "at its will" actually improves performance? Or does this depend heavily on the platform/architecture?

(The case in which the performance boost occurred was replacing an array swap (int t = a[i]; a[i] = a[j]; a[j] = t;) with a call to a static method swap(int[] a, int i, int j). In another case, manually inlining a 10-line method that was called 1,000,000 times made no performance difference.)
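For concreteness, the two variants being compared might look like the sketch below. The surrounding algorithm is not shown in the question, so the array-reversal loop and the names SwapVariants, reverseInlined and reverseWithSwap are assumptions made only for illustration; only swap(int[] a, int i, int j) and the three-statement swap come from the question.

```java
import java.util.Arrays;

public class SwapVariants {

    // Variant 1: the swap is written out by hand inside the loop.
    // (The reversal loop itself is a hypothetical stand-in for the real algorithm.)
    static void reverseInlined(int[] a) {
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
    }

    // The small static method from the question.
    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Variant 2: the same loop, delegating to swap().
    static void reverseWithSwap(int[] a) {
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            swap(a, i, j);
        }
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4, 5};
        int[] y = x.clone();
        reverseInlined(x);
        reverseWithSwap(y);
        System.out.println(Arrays.toString(x)); // [5, 4, 3, 2, 1]
        System.out.println(Arrays.equals(x, y)); // true
    }
}
```

Both variants compute the same result; any difference between them is purely in what the JIT compiler makes of each form.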

Recommended answer

I have seen something similar. "Manual inlining" isn't necessarily faster; the resulting program can be too complex for the optimizer to analyze.

Let's make some wild guesses about your example. When you use the swap() method, the JVM may be able to analyze the method body and conclude that, since i and j don't change, only 2 range checks are needed instead of 4, even though there are 4 array accesses. Also, the local variable t isn't necessary: the JVM can use 2 registers to do the job, without reading or writing t on the stack.

Later, the body of swap() is inlined into the calling method. That happens after the previous optimization, so the savings are still in place. It's even possible that the caller's body proves that i and j are always within range, so the 2 remaining range checks are dropped as well.

Now, in the manually inlined version, the optimizer has to analyze the whole program at once. There are too many variables and too many actions, so it may fail to prove that it's safe to skip range checks or to eliminate the local variable t. In the worst case this version may cost 6 more memory accesses to do the swap, which is a huge overhead. Even if there is only 1 extra memory read, it is still very noticeable.
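Since all of this is guesswork, the only way to know which form is faster on a given JVM is to measure. The following is a rough timing sketch, not a rigorous benchmark: the class SwapTiming, the reversal workload, the array size and the round counts are all assumptions chosen for illustration, and a proper harness such as JMH would give more trustworthy numbers.

```java
public class SwapTiming {

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Hypothetical workload: reverse the array `rounds` times, swap written inline.
    static int runInlined(int[] a, int rounds) {
        for (int r = 0; r < rounds; r++) {
            for (int i = 0, j = a.length - 1; i < j; i++, j--) {
                int t = a[i];
                a[i] = a[j];
                a[j] = t;
            }
        }
        return a[0]; // returned so the work cannot be discarded as dead code
    }

    // Same workload, delegating to the static swap() method.
    static int runWithSwap(int[] a, int rounds) {
        for (int r = 0; r < rounds; r++) {
            for (int i = 0, j = a.length - 1; i < j; i++, j--) {
                swap(a, i, j);
            }
        }
        return a[0];
    }

    static int[] freshArray(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 1_000;
        int rounds = 10_001;

        // Warm-up so that both methods are likely JIT-compiled before timing.
        for (int w = 0; w < 10; w++) {
            runInlined(freshArray(n), rounds);
            runWithSwap(freshArray(n), rounds);
        }

        long t0 = System.nanoTime();
        int r1 = runInlined(freshArray(n), rounds);
        long t1 = System.nanoTime();
        int r2 = runWithSwap(freshArray(n), rounds);
        long t2 = System.nanoTime();

        System.out.println("inlined:  " + (t1 - t0) / 1_000_000.0 + " ms, result " + r1);
        System.out.println("swap():   " + (t2 - t1) / 1_000_000.0 + " ms, result " + r2);
    }
}
```

The timings will vary between runs, JVM versions and machines, so no particular outcome should be expected. On HotSpot, running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining shows whether swap() was actually inlined.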

Of course, we have no basis to believe that manual "outlining" (extracting small methods in the hope that it will help the optimizer) is always better.

-

What I've learned is to forget manual micro-optimizations. It's not that I don't care about micro performance improvements, and it's not that I always trust the JVM's optimizations. It's that I have absolutely no idea what to do that would do more good than harm. So I gave up.
