What optimizations can I expect from Dalvik and the Android toolchain?

Question

I'm working on a high-performance Android application (a game), and though I try to code for readability first, I like to keep in the back of my mind a picture of what is happening under the hood. With C++, I've developed a fairly good intuition about what the compiler will and won't do for me. I'm trying to do the same for Java/Android.

Hence this question. I could find very little about this topic on the web. Will the Java compiler, the Dalvik converter (dx), and/or the JIT compiler (on Android 2.2+) perform optimizations like the following?

  • Method inlining. Under what conditions? private methods can always safely be inlined; will this be done? How about public final methods? Methods on objects of other classes? static methods? What if the runtime type of the object can easily be deduced by the compiler? Should I declare methods as final or static wherever possible?

  • Common subexpression elimination. For example, if I access someObject.someField twice, will the lookup be done only once? What if it's a call to a getter? What if I use some arithmetic expression twice; will it be evaluated only once? What if I use the result of some expression, whose value I know not to change, as the upper bound of a for loop?

  • Bounds checking on array lookups. Will the toolchain eliminate this in certain conditions, like the archetypical for loop?

  • Value inlining. Will accesses to some public static final int always be inlined? Even if they're in another class? Even if they're in another package?

  • Branch prediction. How big an issue is this even? Is branching a large performance hit on a typical Android device?

  • Simple arithmetic. Will someInt * 2 be replaced by someInt << 1?

And so on...
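
To make a few of these patterns concrete, here is a minimal Java sketch of the source-level shapes in question, next to the hand-optimized form one might write if the toolchain does not do the work. The class and method names (`Particle`, `sumDirect`, `sumHoisted`) are hypothetical, and the sketch makes no claim about which form dx or the JIT actually produces; it only shows that the two forms compute the same result.

```java
// Hypothetical illustration; names are not from any real code base.
final class Particle {
    // Candidate for value inlining: a static final int with a constant initializer.
    static final int SCALE = 2;

    private final int[] values;

    Particle(int[] values) {
        this.values = values;
    }

    // Trivial getter, the kind of call the inlining question is about.
    int[] getValues() {
        return values;
    }

    // Straightforward form: the getter call and the length lookup
    // are written out on every iteration, and the multiply is left as-is.
    int sumDirect() {
        int sum = 0;
        for (int i = 0; i < getValues().length; i++) {
            sum += getValues()[i] * SCALE;
        }
        return sum;
    }

    // Hand-optimized form: the array reference and loop bound are hoisted
    // into locals, and the multiply by 2 is written as a shift.
    int sumHoisted() {
        final int[] local = values;
        final int n = local.length;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += local[i] << 1;
        }
        return sum;
    }
}
```

Both methods return the same value for any input; the question is whether the first is turned into something like the second automatically.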

Recommended answer

This is Ben, one of the engineers working on the JIT @ Google. When Bill and I started on this project, the goal was to deliver a working JIT as soon as possible with minimal impact on resource contention (e.g. memory footprint, CPU hijacked by the compiler thread) so that it could run on low-end devices as well. Therefore we used a very primitive trace-based model. That is, the compilation entity passed to the JIT compiler is a basic block, sometimes as short as a single instruction. Such traces are stitched together at runtime through a technique called chaining, so that the interpreter and the code cache lookup won't be invoked often. To some degree, the major source of speedup comes from eliminating the repeated interpreter parsing overhead on frequently executed code paths.

That said, we do have quite a few local optimizations implemented with the Froyo JIT:

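  • Register allocation (8 registers for the v5te target, since the JIT produces Thumb code; 16 registers for v7).
  • Scheduling (e.g. redundant ld/st elimination for Dalvik registers, load hoisting, store sinking).
  • Redundant null check elimination (if such redundancy can be found in a basic block).
  • Loop formation and optimization for simple counted loops (i.e. no side exit in the loop body). For such loops, array accesses based on extended induction variables are optimized so that null and range checks are only performed in the loop prologue.
  • One-entry inline cache per virtual call site, with dynamic patching at runtime.
  • Peephole optimizations, such as power reduction (strength reduction) for mul/div with literal operands.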
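
As an illustration of the "simple counted loop" wording, the sketch below contrasts a loop with no side exit against one that has an early `break`. The class and method names are hypothetical, and which loops the JIT actually recognizes depends on the generated bytecode, so treat this as one reading of the description above and not as a guarantee.

```java
// Hypothetical illustration of loop shapes; not taken from Dalvik sources.
final class LoopShapes {
    // Simple counted loop: one induction variable (i) and no branch
    // that leaves the loop body early. This matches the shape described
    // as eligible for checking null/range once in the loop prologue.
    static int sumAll(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Loop with a side exit: the early break adds a second way out of
    // the body, so it is no longer a "simple counted loop" in that sense.
    static int sumUntilNegative(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] < 0) {
                break;
            }
            sum += data[i];
        }
        return sum;
    }
}
```

If a hot loop needs an early exit, it may be worth measuring both shapes on a real device before restructuring the code around this.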

In Gingerbread we added simple inlining for getters/setters. Since the underlying JIT frontend is still simple and trace-based, a callee that contains branches won't be inlined. But the inline cache mechanism is implemented, so virtual getters/setters can be inlined without problems.
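
A minimal sketch of the distinction described above, using a hypothetical `Counter` class: the first two accessors are straight-line code, while the third contains a branch and so, by the description above, would not be a candidate for this simple inlining.

```java
// Hypothetical class for illustration only.
final class Counter {
    private int count;

    // Straight-line getter: a single field read, no branches.
    int getCount() {
        return count;
    }

    // Straight-line setter: a single field write, no branches.
    void setCount(int value) {
        count = value;
    }

    // Accessor with a branch in the callee (the conditional expression),
    // which the answer says prevents inlining under the trace-based frontend.
    int getCountClamped() {
        return count < 0 ? 0 : count;
    }
}
```

In practice this suggests keeping hot accessors trivial and moving validation or clamping to less frequently executed code, though that should be confirmed by profiling.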

We are currently working on enlarging the compilation scope beyond a simple trace so that the compiler has a larger window for code analysis and optimization. Stay tuned.
