Android: why is native code so much faster than Java code

Question

In the following SO question: https://stackoverflow.com/questions/2067955/fast-bitmap-blur-for-android-sdk @zeh claims that a port of a Java blur algorithm to C runs 40 times faster.

Given that the bulk of the code includes only calculations, and all allocations are only done "one time" before the actual algorithm number crunching - can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and dramatically reduce the gap to native compiled code speed?

Note: I have not confirmed the x40 performance gain for this algorithm myself, but every serious image manipulation algorithm I have encountered for Android uses the NDK, which supports the notion that NDK code runs much faster.

Recommended answer

For algorithms that operate over arrays of data, two things significantly change performance between a language like Java and C:


  • Array bounds checking - Java checks every access, such as bmap[i], and confirms that i is within the array bounds. If the code tries to access out of bounds, you get a useful exception. C and C++ do not check anything and just trust your code. The best-case response to an out-of-bounds access is a page fault. A more likely result is "unexpected behavior".

  • Pointers - You can significantly reduce the number of operations by using pointers.
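To make the first point concrete, here is a minimal C++ sketch that contrasts a checked and an unchecked read. It uses std::vector::at() as a stand-in for Java's checked array access; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checked read: std::vector::at() validates the index on every call and
// throws std::out_of_range on failure, similar in spirit to Java's
// ArrayIndexOutOfBoundsException.
int checked_read(const std::vector<int>& bmap, std::size_t i) {
    return bmap.at(i);
}

// Unchecked read: operator[] performs no validation. An out-of-range index
// is undefined behavior, so the caller must guarantee i < bmap.size().
int unchecked_read(const std::vector<int>& bmap, std::size_t i) {
    return bmap[i];
}

// Hypothetical helper: reports whether a checked read at index i is rejected.
bool checked_read_throws(const std::vector<int>& bmap, std::size_t i) {
    try {
        (void)checked_read(bmap, i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

The checked version pays for a comparison and a branch on every access; the unchecked version pays nothing but offers no protection.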

Take this innocent example of a common filter (similar to blur, but 1D):

for (i = 0; i < ndata - ncoef; ++i) {
  z[i] = 0;
  for (k = 0; k < ncoef; ++k) {
    z[i] += coef[k] * d[i + k];
  }
}

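When you access an array element such as coef[k], the CPU must:

  • Load the address of array coef into a register

  • Load the value k into a register

  • Sum them

  • Fetch the memory at that address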
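The steps listed above can be spelled out by hand in C++. This is only an illustrative sketch: load_element is a hypothetical name, and the scaling of the index by the element size is a detail the list does not mention but that real hardware has to perform.

```cpp
#include <cstddef>

// Spells out what reading coef[k] asks the CPU to do.
// The function name is hypothetical and exists only for this illustration.
int load_element(const int* coef, std::size_t k) {
    // Steps 1 and 2: the base address and the index are available as values.
    const unsigned char* base = reinterpret_cast<const unsigned char*>(coef);
    // Step 3: sum them. The index is first scaled by the element size.
    const unsigned char* address = base + k * sizeof(int);
    // Step 4: fetch the value stored at that address.
    return *reinterpret_cast<const int*>(address);
}
```

For any valid index, load_element(coef, k) returns the same value as coef[k]; it just makes the hidden arithmetic visible.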

Every one of those array accesses can be improved because you know that the indexes are sequential. Neither the compiler nor the JIT can know that the indexes are sequential, so they cannot optimize fully (although they keep trying).

In C++, you would write code more like this:

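const int ndata = 10000; // number of input samples
const int ncoef = 10;    // number of filter coefficients
int d[ndata];
int z[ndata];
int coef[ncoef];
int* zptr;
int* dptr;
int* cptr;
dptr = &(d[0]); // Just being overly explicit here, more likely you would write dptr = d;
zptr = &(z[0]); // or zptr = z;
for (int i = 0; i < (ndata - ncoef); ++i) {
  *zptr = 0;
  cptr = coef;  // reset to the first coefficient (assign the pointer itself)
  dptr = d + i; // start of the current window in d
  for (int k = 0; k < ncoef; ++k) {
    *zptr += *cptr * *dptr;
    cptr++;
    dptr++;
  }
  zptr++;
}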

When you first do something like this (and succeed in getting it correct) you will be surprised how much faster it can be. All the array address calculations of fetching the index and summing the index and base address are replaced with an increment instruction.
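Because getting the pointer version right is the hard part, it helps to keep the indexed version around and check that both produce the same output. The sketch below wraps both loops in functions for that purpose; the function names and the use of std::vector are assumptions made for this illustration, not part of the original answer. Whether the pointer form is actually faster depends on the compiler and its optimization settings, so it is worth timing both on the target device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Indexed version: every access computes base + index.
std::vector<int> filter_indexed(const std::vector<int>& d,
                                const std::vector<int>& coef) {
    assert(coef.size() <= d.size());
    const std::size_t n = d.size() - coef.size(); // same bound as ndata - ncoef
    std::vector<int> z(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < coef.size(); ++k) {
            z[i] += coef[k] * d[i + k];
        }
    }
    return z;
}

// Pointer version: the inner loop only dereferences and increments.
std::vector<int> filter_pointer(const std::vector<int>& d,
                                const std::vector<int>& coef) {
    assert(coef.size() <= d.size());
    const std::size_t n = d.size() - coef.size();
    std::vector<int> z(n, 0);
    int* zptr = z.data();
    for (std::size_t i = 0; i < n; ++i, ++zptr) {
        const int* cptr = coef.data();          // first coefficient
        const int* cend = cptr + coef.size();   // one past the last coefficient
        const int* dptr = d.data() + i;         // start of the current window
        for (; cptr != cend; ++cptr, ++dptr) {
            *zptr += *cptr * *dptr;
        }
    }
    return z;
}
```

Comparing filter_indexed(d, coef) with filter_pointer(d, coef) on a few small inputs catches most pointer mistakes before they turn into buffer overruns.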

For 2D array operations such as blur on an image, an innocent-looking expression such as data[r,c] involves two value fetches, a multiply and a sum. So with 2D arrays, pointers allow you to remove the multiply operations.
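As a rough sketch of the 2D case, assume the image is stored row by row in one flat buffer, so that element (r, c) lives at offset r * width + c. The layout, the function names and the row-sum operation are all assumptions chosen to keep the example short; a real blur would read several neighbours per pixel.

```cpp
#include <cstddef>
#include <vector>

// Indexed version: each access performs a multiply (r * width) and an
// add (+ c) before the value can be fetched.
long sum_row_indexed(const std::vector<int>& data, std::size_t width,
                     std::size_t r) {
    long sum = 0;
    for (std::size_t c = 0; c < width; ++c) {
        sum += data[r * width + c];
    }
    return sum;
}

// Pointer version: the multiply happens once to find the start of the row,
// and the loop then only increments the pointer.
long sum_row_pointer(const std::vector<int>& data, std::size_t width,
                     std::size_t r) {
    const int* p = data.data() + r * width;
    const int* end = p + width;
    long sum = 0;
    for (; p != end; ++p) {
        sum += *p;
    }
    return sum;
}
```

Both functions require (r + 1) * width <= data.size(); neither checks it, which is exactly the trade-off described above.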

So the language allows real reduction in the operations the CPU must perform. The cost is that the C++ code is horrendous to read and debug. Errors in pointers and buffer overflows are food for hackers. But when it comes to raw number grinding algorithms, the speed improvement is too tempting to ignore.
