How to read Mach-O header from object file?


Question

I have spent the past few days experimenting with assembly, and now understand the relationship between assembly and machine code (using x86 via NASM on OS X, reading the Intel docs).

Now I am trying to understand the details of how the linker works, and specifically want to understand the structure of Mach-O object files, starting with the Mach-O headers.

My question is: can you map out how the Mach-O headers below correspond to the otool command output (which displays the headers, but in a different format)?

Some reasons for this question include:

  • It will help me see how the documents on the "structure of Mach-O headers" look in real-world object files.
  • It will simplify the path to understanding, so that I and other newcomers don't have to spend many hours or days wondering "do they mean this, or this?". Without previous experience, it's hard to mentally translate the general Mach-O documentation into an actual object file in the real world.

Below I show the example and the process I went through to try to decode the Mach-O header from a real object file. Throughout the descriptions below, I try to point out all the small, subtle questions that arise. Hopefully this gives a sense of how confusing this can be to a newcomer.

Starting with a basic C file called example.c:

#include <stdio.h>

int
main() {
  printf("hello world");
  return 0;
}

Compile it with gcc example.c -o example.out, which gives the following (shown as a hex dump):

cffa edfe 0700 0001 0300 0080 0200 0000
1000 0000 1005 0000 8500 2000 0000 0000
1900 0000 4800 0000 5f5f 5041 4745 5a45
524f 0000 0000 0000 0000 0000 0000 0000
0000 0000 0100 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 1900 0000 2802 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
0000 0000 0100 0000 0010 0000 0000 0000
0000 0000 0000 0000 0010 0000 0000 0000
0700 0000 0500 0000 0600 0000 0000 0000
5f5f 7465 7874 0000 0000 0000 0000 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
400f 0000 0100 0000 2d00 0000 0000 0000
400f 0000 0400 0000 0000 0000 0000 0000
0004 0080 0000 0000 0000 0000 0000 0000
5f5f 7374 7562 7300 0000 0000 0000 0000
5f5f 5445 5854 0000 0000 0000 0000 0000
6e0f 0000 0100 0000 0600 0000 0000 0000
6e0f 0000 0100 0000 0000 0000 0000 0000
0804 0080 0000 0000 0600 0000 0000 0000
5f5f 7374 7562 5f68 656c 7065 7200 0000
... 531 total lines of this

Running otool -h example.out prints:

example.out:
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777223          3  0x80          2    16       1296 0x00200085


Research

To understand the Mach-O file format, I found these resources helpful:

  • https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/index.html#//apple_ref/doc/uid/TP40000895
  • https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/index.html
  • https://www.mikeash.com/pyblog/friday-qa-2012-11-30-lets-build-a-mach-o-executable.html
  • http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h
  • http://www.opensource.apple.com/source/dtrace/dtrace-78/head/arch.h
  • http://www.opensource.apple.com/source/xnu/xnu-792.13.8/osfmk/mach/machine.h

The last 3, from opensource.apple.com, contain all the constants, such as these:

#define MH_MAGIC_64 0xfeedfacf /* the 64-bit mach magic number */
#define MH_CIGAM_64 0xcffaedfe /* NXSwapInt(MH_MAGIC_64) */
...
#define CPU_TYPE_MC680x0  ((cpu_type_t) 6)
#define CPU_TYPE_X86    ((cpu_type_t) 7)
#define CPU_TYPE_I386   CPU_TYPE_X86    /* compatibility */
#define CPU_TYPE_X86_64   (CPU_TYPE_X86 | CPU_ARCH_ABI64)

The structure of the Mach-O header is shown as:

struct mach_header_64 {
  uint32_t  magic;    /* mach magic number identifier */
  cpu_type_t  cputype;  /* cpu specifier */
  cpu_subtype_t cpusubtype; /* machine specifier */
  uint32_t  filetype; /* type of file */
  uint32_t  ncmds;    /* number of load commands */
  uint32_t  sizeofcmds; /* the size of all the load commands */
  uint32_t  flags;    /* flags */
  uint32_t  reserved; /* reserved */
};

Given this information, the goal was to find each of those pieces of the Mach-O header in the example.out object file.

Given that example and research, I was able to identify the first part of the Mach-O header, the "magic number". That was cool.

But it wasn't a straightforward process. Here are the pieces of information that had to be collected to figure that out.

  • The first column of the otool output shows "magic" to be 0xfeedfacf.
  • The Apple Mach-O docs say that the header's magic number should be either MH_MAGIC or MH_CIGAM ("magic" in reverse). I found those through Google in mach-o/loader.h. Since I am using a 64-bit architecture and not 32-bit, I went with MH_MAGIC_64 (0xfeedfacf) and MH_CIGAM_64 (0xcffaedfe).
  • I looked through the example.out file and the first 8 hex digits were cffa edfe, which matches MH_CIGAM_64! It's written in a different format, which throws you off a little, but the two hex notations are close enough to see the connection. The bytes are also in reverse order compared with the otool value.

Here are the 3 numbers, which were enough to more or less figure out what the magic number is:

0xcffaedfe // value from MH_CIGAM_64
0xfeedfacf // value from otool
cffa edfe  // value in example.out

So that's exciting! I'm still not totally sure I am coming to the right conclusion about these numbers, but I hope so.
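How the three numbers relate can be checked with a short C sketch. The two constants come from mach-o/loader.h; the helper names swap32 and header_is_little_endian are made up for this illustration and are not part of any Apple header.

```c
#include <assert.h>
#include <stdint.h>

#define MH_MAGIC_64 0xfeedfacfu /* value from mach-o/loader.h */
#define MH_CIGAM_64 0xcffaedfeu /* MH_MAGIC_64 with its bytes reversed */

/* Hypothetical helper: reverse the order of the four bytes in a 32-bit value. */
uint32_t swap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical helper: returns 1 if the first four file bytes, read in file
   order as one big-endian number, spell MH_CIGAM_64. That is the case when
   the header was written least-significant byte first (little-endian). */
int header_is_little_endian(const unsigned char *b) {
  uint32_t in_file_order = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
  assert(swap32(MH_MAGIC_64) == MH_CIGAM_64); /* sanity check on the constants */
  return in_file_order == MH_CIGAM_64;
}
```

Passing the bytes cf fa ed fe from the dump to header_is_little_endian returns 1, and swap32(0xcffaedfe) gives 0xfeedfacf, the value otool prints.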

Now it starts to get confusing. Here are the pieces that needed to be put together to almost make sense of the next field, but this is where I'm stuck so far:

  • otool shows 16777223. An Apple Stack Exchange question gave some hints on how to understand this.
  • I found CPU_TYPE_X86_64 in mach/machine.h, and had to do several calculations to figure out its value.

Here are the relevant constants for calculating the value of CPU_TYPE_X86_64:

#define CPU_ARCH_ABI64  0x01000000      /* 64 bit ABI */
#define CPU_TYPE_X86        ((cpu_type_t) 7)
#define CPU_TYPE_I386       CPU_TYPE_X86        /* compatibility */
#define CPU_TYPE_X86_64     (CPU_TYPE_X86 | CPU_ARCH_ABI64)

So basically:

CPU_TYPE_X86_64 = 7 BITWISEOR 0x01000000 // 16777223

That number, 16777223, matches what is shown by otool. Nice!
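The same calculation as a minimal C sketch. The constants are copied from mach/machine.h but without the cpu_type_t cast, and the function name x86_64_cpu_type is a made-up helper for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from mach/machine.h, minus the cpu_type_t cast. */
#define CPU_ARCH_ABI64  0x01000000 /* 64 bit ABI */
#define CPU_TYPE_X86    7
#define CPU_TYPE_X86_64 (CPU_TYPE_X86 | CPU_ARCH_ABI64)

/* Hypothetical helper: returns the combined CPU type as a plain integer. */
int32_t x86_64_cpu_type(void) {
  /* 0x00000007 | 0x01000000 sets bit 24 on top of the value 7. */
  assert(CPU_TYPE_X86_64 == 0x01000007);
  return CPU_TYPE_X86_64;
}
```

The bitwise OR does not disturb the low bits, so the result is simply 16777216 + 7 = 16777223.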

Next, I tried to find that number in example.out, but it isn't there in that form because it is a decimal number. I converted it to hex in JavaScript:

> (16777223).toString(16)
'1000007'

So I'm not sure if this is the correct way to generate a hex number, especially one that will match the hex numbers in a Mach-O object file. 1000007 is only 7 digits too, so I don't know if you are supposed to "pad" it or something.

Anyway, you see this number in example.out, right after the magic number:

0700 0001

Hmm, they seem somewhat related:

0700 0001
1000007

It looks like a 0 was added to the end of 1000007, and that it was reversed.

At this point I wanted to ask the question, having already spent a few hours getting here. How does the structure of the Mach-O header map to the actual Mach-O object file? Can you show how each part of the header shows up in the example.out file above, with a brief explanation why?

Recommended answer

Part of what's confusing you is endianness. In this case, the header is stored in the native format for the platform. Intel-compatible platforms are little-endian systems, meaning the least-significant byte of a multi-byte value comes first in the byte sequence.

So, the byte sequence 07 00 00 01, when interpreted as a little-endian 32-bit value, corresponds to 0x01000007.
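As a rough sketch of that interpretation in C: the helpers read_le32 and format_hex32 below are made-up names for this illustration, not functions from any Mach-O header. The second one also shows the usual way to zero-pad a 32-bit value to eight hex digits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assemble a 32-bit value from four bytes stored least-significant first. */
uint32_t read_le32(const unsigned char *b) {
  return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
         ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Write v as "0x" followed by eight zero-padded hex digits.
   out must have room for at least 11 chars (10 plus the terminator). */
void format_hex32(uint32_t v, char *out) {
  int n = snprintf(out, 11, "0x%08x", (unsigned)v);
  assert(n == 10); /* "0x" + 8 digits */
  (void)n;         /* keeps the build quiet when asserts are disabled */
}
```

Feeding the bytes 07 00 00 01 to read_le32 yields 0x01000007, and format_hex32 renders it with the leading zero that the seven-digit 1000007 is missing.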

The other thing you need to know to interpret the structure is the size of each field. All of the uint32_t fields are pretty straightforward: they are 32-bit unsigned integers.

Both cpu_type_t and cpu_subtype_t are defined, in the machine.h that you linked, as equivalent to integer_t. integer_t is defined to be equivalent to int in /usr/include/mach/i386/vm_types.h. OS X is an LP64 platform, which means that longs and pointers are sensitive to the architecture (32- vs. 64-bit), but int is not. It's always 32-bit.

So, all of the fields are 32 bits, or 4 bytes, in size. Since there are 8 fields, that's a total of 32 bytes.
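The size claim can be checked with a small C sketch. It assumes cpu_type_t and cpu_subtype_t can be stood in for by int32_t (the real typedefs go through integer_t and int), and mach_header_64_size is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the Mach typedefs, assumed to be 32-bit ints. */
typedef int32_t cpu_type_t;
typedef int32_t cpu_subtype_t;

struct mach_header_64 {
  uint32_t      magic;      /* mach magic number identifier */
  cpu_type_t    cputype;    /* cpu specifier */
  cpu_subtype_t cpusubtype; /* machine specifier */
  uint32_t      filetype;   /* type of file */
  uint32_t      ncmds;      /* number of load commands */
  uint32_t      sizeofcmds; /* the size of all the load commands */
  uint32_t      flags;      /* flags */
  uint32_t      reserved;   /* reserved */
};

/* Hypothetical helper: returns the header size in bytes. */
size_t mach_header_64_size(void) {
  /* The eighth field starts after seven 4-byte fields. */
  assert(offsetof(struct mach_header_64, reserved) == 28);
  return sizeof(struct mach_header_64);
}
```

Because every member is 4 bytes wide, the compiler needs no padding between them, and the struct lines up byte for byte with the first two rows of the dump.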

From your original hexdump, here's the part which corresponds to the header:

cffa edfe 0700 0001 0300 0080 0200 0000
1000 0000 1005 0000 8500 2000 0000 0000

Broken down by field:

struct mach_header_64 {
  uint32_t  magic;           cf fa ed fe -> 0xfeedfacf
  cpu_type_t  cputype;       07 00 00 01 -> 0x01000007
  cpu_subtype_t cpusubtype;  03 00 00 80 -> 0x80000003
  uint32_t  filetype;        02 00 00 00 -> 0x00000002
  uint32_t  ncmds;           10 00 00 00 -> 0x00000010
  uint32_t  sizeofcmds;      10 05 00 00 -> 0x00000510
  uint32_t  flags;           85 00 20 00 -> 0x00200085
  uint32_t  reserved;        00 00 00 00 -> 0x00000000
};
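The breakdown above can be reproduced with a short C sketch that decodes the 32 header bytes one field at a time. The array is copied from the hex dump; header_field is a made-up helper name, and the code assumes a little-endian header, as established above.

```c
#include <assert.h>
#include <stdint.h>

/* The first 32 bytes of example.out, copied from the hex dump. */
const unsigned char example_header[32] = {
  0xcf, 0xfa, 0xed, 0xfe, 0x07, 0x00, 0x00, 0x01, /* magic, cputype */
  0x03, 0x00, 0x00, 0x80, 0x02, 0x00, 0x00, 0x00, /* cpusubtype, filetype */
  0x10, 0x00, 0x00, 0x00, 0x10, 0x05, 0x00, 0x00, /* ncmds, sizeofcmds */
  0x85, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, /* flags, reserved */
};

/* Hypothetical helper: field i (0..7) starts at byte offset 4 * i and is
   stored least-significant byte first. */
uint32_t header_field(const unsigned char *header, int i) {
  const unsigned char *b = header + 4 * i;
  assert(i >= 0 && i < 8);
  return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
         ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

Printed in decimal, the decoded fields line up with the otool -h output: cputype 16777223, filetype 2, ncmds 16, and sizeofcmds 1296. The cpusubtype value 0x80000003 is displayed by otool as two columns, with the top byte as caps (0x80) and the remaining bits as cpusubtype (3).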
