Understanding disassembly of Dalvik code?

Question

I am playing around with smali and baksmali on a small Hello World Android application I have written. My source code is:

package com.hello;

import android.app.Activity;
import android.os.Bundle;

public class Main extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
    }
}

which was then disassembled to:

.class public Lcom/hello/Main;
.super Landroid/app/Activity;
.source "Main.java"


# direct methods
.method public constructor <init>()V
    .locals 0

    .prologue
    .line 6
    invoke-direct {p0}, Landroid/app/Activity;-><init>()V

    return-void
.end method


# virtual methods
.method public onCreate(Landroid/os/Bundle;)V
    .locals 1
    .parameter "savedInstanceState"

    .prologue
    .line 10
    invoke-super {p0, p1}, Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V

    .line 11
    const/high16 v0, 0x7f03

    invoke-virtual {p0, v0}, Lcom/hello/Main;->setContentView(I)V

    .line 12
    return-void
.end method

I understand that this is some kind of intermediate representation, but I am not sure exactly what it is. There must be some specification that describes how to read this representation, but I cannot figure out how to search for it. Given an APK file, can someone explain in layman's terms how the Dalvik opcode specification is used to arrive at this representation? My current understanding is this:

  • Given an APK, I could extract the AndroidManifest.xml in a Binary XML format and use a tool such as axml2xml.pl to get a "textual" version of the manifest that is not complete OR I could use the apktool to get a more readable form. But I am still not sure what specification they are using to convert the binary XML into text.
  • The disassemblers somehow use the Dalvik opcode specification to read the dex files and convert them into the above representation.

Any information (perhaps with some simple examples) on the above two steps would help me in a great way in getting the concepts right.

Update 1 (posted after Chris's reply):

So essentially, I would do the following to arrive at the Dalvik bytecode:

  • Take an apk and extract it to get the classes.dex files.
  • Then the disassembler reads the classes.dex file and determines all the classes present in the apk. Can you provide me some information on how this is done? Does it parse the file in hex mode and lookup the Dalvik specification and then resolve appropriately? Or is something else happening? For instance, when I used hexdump on classes.dex, it gave me something like this:

64 65 78 0a 30 33 ...

Are these now used for Opcode lookups?

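  • Assuming the tool can split the incoming bytecode into separate classes, does it then keep scanning the hex codes in the classes.dex file and use the Dalvik specification to output the corresponding opcode names from a table?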

Actually, in short, I am interested in knowing how all this "magic" is done. So for instance, if I were to learn to write this tool, what is the high-level roadmap I should follow?

Recommended answer

What you're looking at is Dalvik bytecode. Java code is translated to Dalvik bytecode by the dx tool. The manifest is a separate issue, which I'll get to in a minute. Effectively, when you compile your Android application, the dx tool converts your compiled Java code into Dalvik bytecode using the Dalvik opcode set (each opcode is a single byte, so up to 256 values), much as javac converts Java source to Java bytecode for a standard JVM application.

For example, invoke-super is an opcode that instructs the DVM (Dalvik virtual machine) to invoke a method on the superclass. Similarly, invoke-interface instructs the DVM to invoke an interface method.

So, as you can see,

super.onCreate(savedInstanceState);

is translated to

invoke-super {p0, p1}, Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V

In this case, invoke-super takes two parameters: the {p0, p1} register group, and Landroid/app/Activity;->onCreate(Landroid/os/Bundle;)V, the method specification that it uses to look up and, if necessary, resolve the method.
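To make the two parameters concrete, here is a minimal sketch of how a disassembler might unpack a single invoke instruction from its six raw bytes. It assumes the "35c" instruction layout from the Dalvik bytecode documentation (opcode byte, argument count, 16-bit method index, then register nibbles). The class name Invoke35c, the sample bytes, and the method index 0x0001 are made up for illustration; a real tool would look the index up in the dex file's method table to print the full method specification.

```java
public class Invoke35c {

    /** Decodes one 6-byte, format-35c invoke instruction into smali-like text. */
    public static String decode(byte[] insn) {
        if (insn.length != 6) {
            throw new IllegalArgumentException("format 35c is exactly 6 bytes");
        }
        String name;
        switch (insn[0] & 0xff) {               // first byte is the opcode
            case 0x6e: name = "invoke-virtual"; break;
            case 0x6f: name = "invoke-super";   break;
            case 0x70: name = "invoke-direct";  break;
            default:
                throw new IllegalArgumentException("opcode not handled in this sketch");
        }
        int count = (insn[1] >> 4) & 0x0f;      // A: number of argument registers
        if (count > 5) {
            throw new IllegalArgumentException("format 35c holds at most 5 registers");
        }
        // BBBB: little-endian 16-bit index into the dex method table
        int methodIndex = (insn[2] & 0xff) | ((insn[3] & 0xff) << 8);
        int[] regs = {
            insn[4] & 0x0f, (insn[4] >> 4) & 0x0f,  // C, D
            insn[5] & 0x0f, (insn[5] >> 4) & 0x0f,  // E, F
            insn[1] & 0x0f                          // G
        };
        StringBuilder sb = new StringBuilder(name).append(" {");
        for (int i = 0; i < count; i++) {
            if (i > 0) sb.append(", ");
            sb.append('v').append(regs[i]);
        }
        return sb.append("}, method@")
                 .append(String.format("%04x", methodIndex))
                 .toString();
    }

    public static void main(String[] args) {
        // Hypothetical encoding of the invoke-super call in onCreate.
        byte[] insn = {0x6f, 0x20, 0x01, 0x00, 0x21, 0x00};
        System.out.println(decode(insn));  // invoke-super {v1, v2}, method@0001
    }
}
```

The output uses v1 and v2 rather than p0 and p1 because onCreate declares .locals 1: the single local is v0, and the two parameter registers come after it. Tools such as baksmali print the p-names purely as a convenience.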

Then there's the invoke-direct call in the constructor area.

invoke-direct {p0}, Landroid/app/Activity;-><init>()V

Every class has an <init> method, also known as the constructor, that is used to initialize the class's data members. When you construct an object, the constructor of the superclass must also be called. This explains why the constructor for your class calls the Activity constructor.
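Method references such as Landroid/app/Activity;-><init>()V follow a regular pattern: the owning class as L...;, then ->, the method name, the parameter types in parentheses, and the return type (V for void, I for int, and so on). As a rough sketch, a reference can be turned back into Java-like text as follows. The class MethodRef and its helper names are hypothetical, and the sketch does no validation beyond the basic descriptor letters.

```java
public class MethodRef {

    /** Reads one type descriptor from d starting at pos[0] and advances pos[0]. */
    static String readType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') {   // each leading '[' is one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'V': base = "void";    break;
            case 'Z': base = "boolean"; break;
            case 'B': base = "byte";    break;
            case 'S': base = "short";   break;
            case 'C': base = "char";    break;
            case 'I': base = "int";     break;
            case 'J': base = "long";    break;
            case 'F': base = "float";   break;
            case 'D': base = "double";  break;
            case 'L': {                 // class type: Lpackage/Name;
                int end = d.indexOf(';', pos[0]);
                base = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("unknown descriptor: " + c);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) sb.append("[]");
        return sb.toString();
    }

    /** Turns "Lpkg/Cls;->name(args)ret" into "ret pkg.Cls.name(args)". */
    public static String describe(String ref) {
        int arrow = ref.indexOf("->");
        int open = ref.indexOf('(', arrow);
        int close = ref.indexOf(')', open);
        String owner = readType(ref.substring(0, arrow), new int[]{0});
        String name = ref.substring(arrow + 2, open);
        String params = ref.substring(open + 1, close);
        StringBuilder args = new StringBuilder();
        int[] pos = {0};
        while (pos[0] < params.length()) {
            if (args.length() > 0) args.append(", ");
            args.append(readType(params, pos));
        }
        String ret = readType(ref.substring(close + 1), new int[]{0});
        return ret + " " + owner + "." + name + "(" + args + ")";
    }

    public static void main(String[] a) {
        System.out.println(describe("Landroid/app/Activity;-><init>()V"));
        // void android.app.Activity.<init>()
        System.out.println(describe("Lcom/hello/Main;->setContentView(I)V"));
        // void com.hello.Main.setContentView(int)
    }
}
```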

With regard to the manifest, what happens (this is all in the Dalvik specs if you check out the source code) is that the compiler (that generates the apk file) converts the manifest to a more compact format (binary XML) to save space. The manifest doesn't have anything to do with the code you posted; rather, it instructs the DVM on how to process the application as a whole with regard to Activities, Services, etc. What you've posted is what actually gets executed.
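As a small illustration of what "binary XML" means in practice, the sketch below checks whether a byte array starts with the chunk header that a compiled AndroidManifest.xml is expected to carry. It assumes the chunk layout declared in the Android platform source (ResourceTypes.h): a little-endian 16-bit type, a 16-bit header size, and a 32-bit total size, with type 0x0003 marking an XML document. The class and method names are hypothetical, and a real converter would go on to parse the string pool and element chunks that follow.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BinaryXmlHeader {

    // Chunk type for a compiled XML document (assumed from ResourceTypes.h).
    static final int RES_XML_TYPE = 0x0003;

    /** Returns true if data begins with a plausible binary-XML chunk header. */
    public static boolean looksLikeBinaryXml(byte[] data) {
        if (data.length < 8) {
            return false;                            // header itself is 8 bytes
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int type = buf.getShort() & 0xffff;          // chunk type
        int headerSize = buf.getShort() & 0xffff;    // size of this header
        long totalSize = buf.getInt() & 0xffffffffL; // size of the whole chunk
        return type == RES_XML_TYPE && headerSize == 8 && totalSize >= headerSize;
    }

    public static void main(String[] args) {
        byte[] compiled = {0x03, 0x00, 0x08, 0x00, 0x10, 0x00, 0x00, 0x00};
        byte[] plainText = "<?xml version=\"1.0\"?>".getBytes();
        System.out.println(looksLikeBinaryXml(compiled));   // true
        System.out.println(looksLikeBinaryXml(plainText));  // false
    }
}
```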

That's a high-level answer to your question. If you need more, let me know and I'll do my best.

Edit: You're basically right. The disassembler reads the binary data as a byte stream from the dex file. It knows what the format should be and is able to pull out information like constants, classes, etc. With regard to the opcodes, that's exactly what it does: it knows the byte value for each opcode (or how it's represented in the dex file) and is able to convert that into a human-readable string. If you were going to implement this, aside from understanding the general basics of compilers, I would start with a deep understanding of the structure of a dex file. From there, you would need to construct a table that matches opcode values with the human-readable strings. With that information and some additional information regarding string constants, etc., you could construct a text-file representation of the compiled class. Does that make sense?
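The roadmap above can be sketched in a few dozen lines. Note that the bytes 64 65 78 0a 30 33 from the hexdump in the question are not opcodes: they are the start of the file's magic number, the ASCII text "dex", a newline, and a version. Opcodes only appear later, inside each method's code. The sketch below checks the magic and then walks a hand-written instruction stream using a tiny opcode table. The class MiniDexReader, the version digits 035, and the method indices in the sample bytes are assumptions for illustration; a real tool would cover the whole opcode table and locate each method's code through the dex header and its index sections.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class MiniDexReader {

    /** Returns the format version (e.g. "035") or throws if the magic is wrong. */
    public static String readVersion(byte[] file) {
        // Magic is 8 bytes: 'd' 'e' 'x' '\n', three version digits, then 0.
        if (file.length < 8 || file[0] != 'd' || file[1] != 'e' || file[2] != 'x'
                || file[3] != '\n' || file[7] != 0) {
            throw new IllegalArgumentException("not a DEX file");
        }
        return new String(file, 4, 3, StandardCharsets.US_ASCII);
    }

    /** Opcode value -> mnemonic, for the few opcodes used in the example. */
    static String opcodeName(int opcode) {
        switch (opcode) {
            case 0x0e: return "return-void";
            case 0x15: return "const/high16";
            case 0x6e: return "invoke-virtual";
            case 0x6f: return "invoke-super";
            case 0x70: return "invoke-direct";
            default:   return null;
        }
    }

    /** Opcode value -> instruction length in 16-bit code units. */
    static int opcodeUnits(int opcode) {
        switch (opcode) {
            case 0x0e: return 1;                           // format 10x
            case 0x15: return 2;                           // format 21h
            case 0x6e: case 0x6f: case 0x70: return 3;     // format 35c
            default:   return -1;
        }
    }

    /** Walks a method's code bytes and lists the mnemonic of each instruction. */
    public static List<String> listOpcodes(byte[] code) {
        List<String> names = new ArrayList<>();
        int pos = 0;
        while (pos < code.length) {
            int opcode = code[pos] & 0xff;   // low byte of each first unit
            String name = opcodeName(opcode);
            if (name == null) {
                throw new IllegalArgumentException("opcode not in this sketch's table");
            }
            names.add(name);
            pos += opcodeUnits(opcode) * 2;  // skip operands to the next instruction
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] magic = {0x64, 0x65, 0x78, 0x0a, 0x30, 0x33, 0x35, 0x00};
        System.out.println("DEX version " + readVersion(magic));

        // Hypothetical encoding of the onCreate body from the question.
        byte[] onCreate = {
            0x6f, 0x20, 0x01, 0x00, 0x21, 0x00,  // invoke-super {v1, v2}
            0x15, 0x00, 0x03, 0x7f,              // const/high16 v0, 0x7f03
            0x6e, 0x20, 0x02, 0x00, 0x01, 0x00,  // invoke-virtual {v1, v0}
            0x0e, 0x00                           // return-void
        };
        for (String name : listOpcodes(onCreate)) {
            System.out.println(name);
        }
    }
}
```

The instruction length has to live in the table alongside the name, because instructions vary in size and the reader can only find the next opcode byte once it knows how many operand bytes to skip.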
