OpenJDK implementation of System.arraycopy


Question


Following a question related to the way the JVM implements creation of Strings based on char[], I have mentioned that no iteration takes place when the char[] gets copied to the interior of the new string, since System.arraycopy gets called eventually, which copies the desired memory using a function such as memcpy at a native, implementation-dependent level (the original question).
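As a rough illustration, a bulk native copy of a char[] amounts to something like the following sketch (the jchar typedef and copy_char_array are stand-ins for this answer, not HotSpot's actual code):

#include <cstddef>
#include <cstdint>
#include <cstring>

typedef uint16_t jchar; // a Java char is a 16-bit UTF-16 code unit

// One bulk move of the whole buffer; no per-element, Java-visible iteration.
static void copy_char_array(const jchar* src, jchar* dst, size_t count) {
  std::memcpy(dst, src, count * sizeof(jchar));
}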


I wanted to check that for myself, so I downloaded the OpenJDK 7 source code and started browsing it. I found the implementation of System.arraycopy in the OpenJDK C++ source code, in openjdk/hotspot/src/share/vm/oops/objArrayKlass.cpp:

if (stype == bound || Klass::cast(stype)->is_subtype_of(bound)) {
  // elements are guaranteed to be subtypes, so no check necessary
  bs->write_ref_array_pre(dst, length);
  Copy::conjoint_oops_atomic(src, dst, length);
} else {
  // slow case: need individual subtype checks
  // (per-element checked copy elided here)
}

If the elements need no type checks (that's the case with, for instance, primitive data type arrays), Copy::conjoint_oops_atomic gets called.
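For contrast, the "slow case" in the snippet above has to check each element before storing it, roughly along these lines (a hedged sketch; oop_t, checked_oop_copy, and is_assignable are illustrative stand-ins, not the actual HotSpot code):

#include <cstddef>

struct oop_t; // opaque stand-in for a heap reference

// Every source element's dynamic type is checked against the destination
// array's element type before the store, so no single bulk move is possible.
template <typename IsAssignable>
static bool checked_oop_copy(oop_t** src, oop_t** dst, size_t length,
                             IsAssignable is_assignable) {
  for (size_t i = 0; i < length; i++) {
    if (!is_assignable(src[i])) {
      return false; // the JVM raises ArrayStoreException at this point
    }
    dst[i] = src[i];
  }
  return true;
}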


The Copy::conjoint_oops_atomic function resides in 'copy.hpp':

// overloaded for UseCompressedOops
static void conjoint_oops_atomic(narrowOop* from, narrowOop* to, size_t count) {
  assert(sizeof(narrowOop) == sizeof(jint), "this cast is wrong");
  assert_params_ok(from, to, LogBytesPerInt);
  pd_conjoint_jints_atomic((jint*)from, (jint*)to, count);
}
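The overload is essentially a type pun: under UseCompressedOops a reference is 32 bits, so an array of them can be copied as if it were an int array. A standalone sketch of the same idea, ignoring the per-element atomicity the real routine guarantees (the typedefs and function names here are assumptions, not HotSpot code):

#include <cstddef>
#include <cstdint>
#include <cstring>

typedef uint32_t narrowOop; // with UseCompressedOops, a reference is 32 bits
typedef int32_t jint;

static void copy_jints(jint* from, jint* to, size_t count) {
  std::memmove(to, from, count * sizeof(jint));
}

// Since a compressed oop is exactly jint-sized, an array of them can be
// handed to the int-copying routine via a cast, as the overload above does.
static void copy_narrow_oops(narrowOop* from, narrowOop* to, size_t count) {
  static_assert(sizeof(narrowOop) == sizeof(jint), "this cast is wrong");
  copy_jints(reinterpret_cast<jint*>(from), reinterpret_cast<jint*>(to), count);
}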

Now we're platform-dependent, since the copy operation has a different implementation depending on the OS/architecture. Windows is the example I'll follow: openjdk\hotspot\src\os_cpu\windows_x86\vm\copy_windows_x86.inline.hpp:

static void pd_conjoint_oops_atomic(oop* from, oop* to, size_t count) {
  // Do better than this: inline memmove body  NEEDS CLEANUP
  if (from > to) {
    while (count-- > 0) {
      // Copy forwards
      *to++ = *from++;
    }
  } else {
    from += count - 1;
    to   += count - 1;
    while (count-- > 0) {
      // Copy backwards
      *to-- = *from--;
    }
  }
}


And... to my surprise, it iterates through the elements (the oop values), copying them one by one (seemingly). Can someone explain why the copy is done, even at the native level, by iterating through the elements in the array?

Answer


Because jint most closely maps to int, which most closely maps to the old hardware-architecture WORD, which is basically the same size as the width of the data bus.
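Put differently, each iteration of that loop already moves a full machine word rather than a single byte. A toy sketch of the difference (illustrative code, not from the JDK):

#include <cstddef>
#include <cstdint>

// n memory operations: one per byte.
static void copy_bytes(const uint8_t* from, uint8_t* to, size_t n) {
  while (n-- > 0) *to++ = *from++;
}

// A quarter as many memory operations for the same data: one per 32-bit word.
static void copy_words(const uint32_t* from, uint32_t* to, size_t n_words) {
  while (n_words-- > 0) *to++ = *from++;
}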


The memory architectures and CPUs of today are designed to keep processing even in the event of a cache miss, and memory fetches tend to pre-fetch whole blocks. The code that you are looking at isn't as "bad" in performance as you might think. The hardware is smarter than it looks, and if you don't actually profile, your "smart" fetching routines might add nothing (or even slow down processing).


When you are introduced to hardware architectures, you must be introduced to simple ones. Modern ones do a lot more, so you can't assume that code that looks inefficient is actually inefficient. For example, when a memory lookup is done to evaluate the condition on an if statement, often both branches of the if statement are executed while the lookup is occurring, and the "false" branch of processing is discarded after the data becomes available to evaluate the condition. If you want to be efficient, you must profile and then act on the profiled data.
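For example, a hypothetical micro-benchmark along these lines (a sketch only; a serious harness would add warm-up and repeated runs) is the way to find out whether the element-wise loop actually loses to a bulk memmove on your machine:

#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>

int main() {
  std::vector<int> src(1 << 20, 42), dst(1 << 20);

  auto t0 = std::chrono::steady_clock::now();
  for (size_t i = 0; i < src.size(); i++) dst[i] = src[i]; // element-wise copy
  auto t1 = std::chrono::steady_clock::now();
  std::memmove(dst.data(), src.data(), src.size() * sizeof(int)); // bulk copy
  auto t2 = std::chrono::steady_clock::now();

  using us = std::chrono::microseconds;
  std::printf("loop:    %ld us\n", (long)std::chrono::duration_cast<us>(t1 - t0).count());
  std::printf("memmove: %ld us\n", (long)std::chrono::duration_cast<us>(t2 - t1).count());
  return 0;
}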


Look at the branch-on-JVM-opcode section. You'll see it is (or perhaps just was) an ifdef macro oddity supporting (at one time) three different ways of jumping to the code that handled each opcode. That was because the three different ways actually made a meaningful performance difference on the different Windows, Linux, and Solaris architectures.
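To make "three different ways of jumping to the code" concrete, here is a hedged sketch of two classic interpreter dispatch styles; this is illustrative code, not the HotSpot interpreter (computed goto, the usual third style, is omitted):

#include <cstdio>

enum Opcode { OP_INC, OP_DEC, OP_HALT };

// Style 1: a switch statement dispatches to the handler for each opcode.
static int run_switch(const Opcode* code, int acc) {
  for (;; code++) {
    switch (*code) {
      case OP_INC:  acc++; break;
      case OP_DEC:  acc--; break;
      case OP_HALT: return acc;
    }
  }
}

static void do_inc(int& acc) { acc++; }
static void do_dec(int& acc) { acc--; }

// Style 2: a table of function pointers indexed by opcode.
static int run_table(const Opcode* code, int acc) {
  void (*handlers[])(int&) = { do_inc, do_dec };
  for (; *code != OP_HALT; code++) handlers[*code](acc);
  return acc;
}

int main() {
  const Opcode prog[] = { OP_INC, OP_INC, OP_DEC, OP_HALT };
  std::printf("%d %d\n", run_switch(prog, 0), run_table(prog, 0)); // prints "1 1"
  return 0;
}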


Perhaps they could have included MMX routines, but the fact that they didn't tells me that SUN didn't think it was enough of a performance gain on modern hardware to worry about.
