What can I do in Java code to optimize for CPU caching?


Question



When writing a Java program, do I have influence on how the CPU will utilize its cache to store my data? For example, if I have an array that is accessed a lot, does it help if it's small enough to fit in one cache line (typically 64 bytes on current 64-bit machines)? What if I keep a much-used object within that limit: can I expect the memory used by its members to be close together and to stay in cache?
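For concreteness, here is a minimal sketch of the two layouts I have in mind (the class and field names are just illustrative, not from my actual code):

```java
// A tiny lookup table: 16 ints is 64 bytes of payload, small enough to sit
// in a single cache line on typical current hardware.
final class TinyTable {
    private final int[] table = new int[16];

    int lookup(int i) {
        return table[i & 15]; // hot path: repeated hits on the same few bytes
    }
}

// A compact, heavily used object: only primitive fields, so its data is
// stored inline in the object rather than behind further references.
final class Point3D {
    final double x, y, z; // 24 bytes of payload plus the object header

    Point3D(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }
}
```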

Background: I'm building a compressed digital tree that's heavily inspired by the Judy arrays, which are implemented in C. While I'm mostly after their node compression techniques, Judy has CPU cache optimization as a central design goal, and the node types as well as the heuristics for switching between them are heavily influenced by that. I was wondering if I have any chance of getting those benefits, too?

Edit: The general advice in the answers so far is: don't try to micro-optimize machine-level details when you're as far away from the machine as you are in Java. I totally agree, so I felt I had to add some (hopefully) clarifying comments to better explain why I think the question still makes sense. These are below:

There are some things that are just generally easier for computers to handle because of the way they are built. I have seen Java code run noticeably faster on compressed data (from memory), even though the decompression had to use additional CPU cycles. If the data were stored on disk, it's obvious why that is so, but of course in RAM it's the same principle.

Now, computer science has lots to say about what those things are. For example, locality of reference is great in C, and I guess it's still great in Java, maybe even more so if it helps the optimizing runtime to do more clever things. But how you accomplish it might be very different. In C, I might write code that manages larger chunks of memory itself and uses adjacent pointers for related data.
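A rough Java analogue of that C-style approach, as a sketch only (the class and field names are hypothetical, not from my tree implementation): related data is kept in parallel primitive arrays so it sits contiguously, and "pointers" become indices.

```java
// Object-per-entry layout: every Node is a separate heap object, so a
// traversal chases references that may be scattered across the heap.
final class Node {
    int key;
    int value;
    Node next;
}

// Rough analogue of managing one big chunk of memory in C: parallel
// primitive arrays keep keys and values contiguous, and links become
// int indices into the same arrays instead of pointers.
final class PackedNodes {
    final int[] keys;
    final int[] values;
    final int[] next; // index of the next entry, or -1 for "none"

    PackedNodes(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
        next = new int[capacity];
    }
}
```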

In Java, I can't (and don't want to) know much about how memory is going to be managed by a particular runtime. So I have to take optimizations to a higher level of abstraction, too. My question is basically, how do I do that? For locality of reference, what does "close together" mean at the level of abstraction I'm working on in Java? Same object? Same type? Same array?
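As a concrete illustration of the distinction I'm asking about (a sketch, nothing more): a primitive array is the one case where the elements are guaranteed to be contiguous, while an array of objects is only an array of references.

```java
public class LocalityDemo {
    public static void main(String[] args) {
        // Contiguous: the million ints live in one block, so a linear scan
        // walks memory sequentially and the prefetcher can keep up.
        int[] primitives = new int[1_000_000];
        long sum1 = 0;
        for (int v : primitives) {
            sum1 += v;
        }

        // Not necessarily contiguous: this array holds references, and most
        // of these boxed Integers are distinct objects that may sit anywhere
        // on the heap, so the same scan can turn into pointer chasing.
        Integer[] boxed = new Integer[1_000_000];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = i; // autoboxing: mostly distinct Integer objects
        }
        long sum2 = 0;
        for (Integer v : boxed) {
            sum2 += v;
        }

        System.out.println(sum1 + " " + sum2);
    }
}
```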

In general, I don't think that abstraction layers change the "laws of physics", metaphorically speaking. Doubling your array in size every time you run out of space is a good strategy in Java, too, even though you don't call malloc() anymore.
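For instance, java.util.ArrayList already grows its backing array in this amortized fashion (its exact growth factor is an implementation detail); a hand-rolled doubling buffer for primitives, which also avoids boxing, could look like this sketch (the class name is made up):

```java
import java.util.Arrays;

// Minimal growable int buffer: double the backing array whenever it runs
// out, so appends are amortized O(1) even though an individual grow step
// copies the whole array.
final class GrowableIntArray {
    private int[] data = new int[16];
    private int size;

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(Integer.toString(index));
        }
        return data[index];
    }

    int size() {
        return size;
    }
}
```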

Solution

The key to good performance with Java is to write idiomatic code, rather than trying to outwit the JIT compiler. If you write your code to try to influence it to do things in a certain way at the native instruction level, you are more likely to shoot yourself in the foot.

That isn't to say that common principles like locality of reference don't matter. They do, but I would consider the use of arrays and the like to be performance-aware, idiomatic code rather than something "tricky."
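For example, something along these lines is what I would call performance-aware but still idiomatic (an illustrative sketch, not code from the question):

```java
// A dense 2D grid stored as one flat array and iterated in the same
// row-major order it is stored in, so the scan moves through memory
// sequentially. Plain, readable Java with no JIT-outsmarting tricks.
final class Grid {
    private final int rows;
    private final int cols;
    private final double[] cells; // cells[r * cols + c]

    Grid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new double[rows * cols];
    }

    double sum() {
        double total = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                total += cells[r * cols + c];
            }
        }
        return total;
    }
}
```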

HotSpot and other optimizing runtimes are extremely clever about how they optimize code for specific processors. (For an example, check out this discussion.) If I were an expert machine language programmer, I'd write machine language, not Java. And if I'm not, it would be unwise to think that I could do a better job of optimizing my code than the experts.

Also, even if you do know the best way to implement something for a particular CPU, the beauty of Java is write-once-run-anywhere. Clever tricks to "optimize" Java code tend to make optimization opportunities harder for the JIT to recognize. Straightforward code that adheres to common idioms is easier for an optimizer to recognize. So even when you get the best Java code for your testbed, that code might perform horribly on a different architecture, or at best, fail to take advantage of enhancements in future JITs.
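As a small illustration (a sketch only): a plain counted loop over an array is exactly the shape HotSpot's loop optimizations are designed to recognize, while hand-unrolled or otherwise "clever" variants of the same loop give the optimizer a less familiar pattern for little or no gain.

```java
final class Sums {
    // Straightforward, idiomatic form: a simple counted loop. This is the
    // pattern the JIT targets with bounds-check elimination, unrolling and,
    // where the hardware allows it, vectorization.
    static long sum(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }
}
```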

If you want good performance, keep it simple. Teams of really smart people are working to make it fast.
