Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?

Question

For a university course, I like to compare the code sizes of functionally similar programs written and compiled with gcc/clang versus written in assembly. While re-evaluating how to shrink some executables further, I couldn't believe my eyes when the very same assembly code I assembled and linked 2 years ago came out more than 10x larger after rebuilding it (which is true for multiple programs, not only helloworld):

$ make
as -32 -o helloworld-asm-2020.o helloworld-asm-2020.s
ld -melf_i386 -o helloworld-asm-2020 helloworld-asm-2020.o

$ ls -l
-rwxr-xr-x 1 xxx users  708 Jul 18  2018 helloworld-asm-2018*
-rwxr-xr-x 1 xxx users 8704 Nov 25 15:00 helloworld-asm-2020*
-rwxr-xr-x 1 xxx users 4724 Nov 25 15:00 helloworld-asm-2020-n*
-rwxr-xr-x 1 xxx users 4228 Nov 25 15:00 helloworld-asm-2020-n-sstripped*
-rwxr-xr-x 1 xxx users  604 Nov 25 15:00 helloworld-asm-2020.o*
-rw-r--r-- 1 xxx users  498 Nov 25 14:44 helloworld-asm-2020.s

The assembly code is:

.code32
.section .data
msg: .ascii "Hello, world!\n"
len = . - msg

.section .text
.globl _start

_start:
        movl $len, %edx   # EDX = message length
        movl $msg, %ecx   # ECX = address of message
        movl $1, %ebx     # EBX = file descriptor (1 = stdout)
        movl $4, %eax     # EAX = syscall number (4 = write)
        int $0x80         # call kernel by interrupt

        # and exit
        movl $0, %ebx     # return code is zero
        movl $1, %eax     # exit syscall number (1 = exit)
        int $0x80         # call kernel again

The same hello world program, built with GNU as and GNU ld (always as 32-bit assembly), was 708 bytes back then and has grown to 8.5K now. Even when telling the linker to turn off page alignment (ld -n), it is still almost 4.2K. Stripping/sstripping doesn't pay off either.

readelf tells me that the section headers start much later in the file (byte 468 vs. 8464), but I have no idea why. It's running on the same arch system as in 2018, the Makefile is the same, and I'm not linking against any libraries (especially not libc). Since the object file is still quite small, I guess something about ld has changed, but what and why?
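The two offsets quoted above can be cross-checked against the file sizes from ls. What follows is a minimal Python sketch, not the output of any real tool: it reads e_shoff, e_shentsize and e_shnum from a 32-bit little-endian ELF header and computes where the section header table ends. The helper names section_table_extent and fake_header are hypothetical, and the headers are synthetic. The count of 6 section headers of 40 bytes each is an assumption, chosen because it matches the sizes shown (468 + 240 = 708 and 8464 + 240 = 8704).

```python
import struct

# 52-byte little-endian ELF32 file header (e_ident followed by 13 fields).
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")


def section_table_extent(image):
    """Return (e_shoff, offset just past the section header table)."""
    (ident, _type, _machine, _version, _entry, _phoff, shoff, _flags,
     _ehsize, _phentsize, _phnum, shentsize, shnum, _shstrndx) = \
        ELF32_EHDR.unpack_from(image)
    if ident[:4] != b"\x7fELF" or ident[4] != 1:
        raise ValueError("not a 32-bit ELF image")
    return shoff, shoff + shentsize * shnum


def fake_header(shoff, shnum=6):
    """Build a synthetic ELF32 header; every value except shoff is assumed."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # ELFCLASS32, LSB, v1
    return ELF32_EHDR.pack(ident, 2, 3, 1, 0x08049000, 52, shoff, 0,
                           52, 32, 3, 40, shnum, shnum - 1)


old = section_table_extent(fake_header(468))
new = section_table_extent(fake_header(8464))
print(old, new)
```

To inspect a real binary instead, pass the file contents, for example section_table_extent(open("helloworld-asm-2020", "rb").read()). If the section header table is the last thing in the file, the second value equals the file size.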

Disclaimer: I'm building 32-bit executables on an x86-64 machine.

I'm using GNU binutils (as & ld) version 2.35.1. Here is a base64-encoded archive which includes the source and both executables (small old one, large new one):

cat << EOF | base64 -d | tar xj
QlpoOTFBWSZTWVaGrEQABBp////xebj/7//Xf+a8RP/v3/rAAEVARARAeEADBAAAoCAI0AQ+NAam
ytMpCGmpDVPU0aNpGmh6Rpo9QAAeoBoADQaNAADQ09IAACSSGUwaJpTNQGE9QZGhoADQPUAA0AAA
AA0aA4AAAABoAAAAA0GgAAAAZAGgAHAAAAANAAAAAGg0AAAADIA0AASJCBIyE8hHpqPVPUPU/VAa
fqn6o0ep6BB6TQaNGj0j1ABobU00yeU9JYiuVVZKYE+dKNa3wls6x81yBpGAN71NoylDUvNryWiW
E4ER8XkfpaJcPb6ND12ULEqkQX3eaBHP70Apa5uFhWNDy+U3Ekj+OLx5MtDHxQHQLfMcgCHrGayE
Dc76F4ZC4rcRkvTW4S2EbJAsbBGbQxSbx5o48zkyk5iPBBhJowtCSwDBsQBc0koYRSO6SgJNL0Bg
EmCoxCDAs5QkEmTGmQUgqZNIoxsmwDmDQe0NIDI0KjQ64leOr1fVk6AaVhjOAJjLrEYkYy4cDbyS
iXSuILWohNh+PA9Izk0YUM4TQQGEYNgn4oEjGmAByO+kzmDIxEC3Txni6E1WdswBJLKYiANdiQ2K
00jU/zpMzuIhjTbgiBqE24dZWBcNBBAAioiEhCQEIfAR8Vir4zNQZFgvKZa67Jckh6EHZWAWuf6Q
kGy1lOtA2h9fsyD/uPPI2kjvoYL+w54IUKBEEYFBIWRNCNpuyY86v3pNiHEB7XyCX5wDjZUSF2tO
w0PVlY2FQNcLQcbZjmMhZdlCGkVHojuICHMMMB5kQQSZRwNJkYTKz6stT/MTWmozDCcj+UjtB9Cf
CUqAqqRlgJdREtMtSO4S4GpJE2I/P8vuO9ckqCM2+iSJCLRWx2Gi8VSR8BIkVX6stqIDmtG8xSVU
kk7BnC5caZXTIynyI0doXiFY1+/Csw2RUQJroC0lCNiIqVVUkTqTRMYqKNVGtCJ5yfo7e3ZpgECk
PYUEihPU0QVgfQ76JA8Eb16KCbSzP3WYiVApqmfDhUk0aVc+jyBJH13uKztUuva8F4YdbpmzomjG
kSJmP+vCFdKkHU384LdRoO0LdN7VJlywJ2xJdM+TMQ0KhMaicvRqfC5pHSu+gVDVjfiss+S00ikI
DeMgatVKKtcjsVDX09XU3SzowLWXXunnFZp/fP3eN9Rj1ubiLc0utMl3CUUkcYsmwbKKrWhaZiLO
u67kMSsW20jVBcZ5tZUKgdRtu0UleWOs1HK2QdMpyKMxTRHWhhHwMnVEsWIUEjIfFEbWhRTRMJXn
oIBSEa2Q0llTBfJV0LEYEQTBTFsDKIxhgqNwZB2dovl/kiW4TLp6aGXxmoIpVeWTEXqg1PnyKwux
caORGyBhTEPV2G7/O3y+KeAL9mUM4Zjl1DsDKyTZy8vgn31EDY08rY+64Z/LO5tcRJHttMYsz0Fh
CRN8LTYJL/I/4u5IpwoSCtDViIA=
EOF

Update: When using ld.gold instead of ld.bfd (to which /usr/bin/ld is symlinked by default), the executable size becomes as small as expected:

$ cat Makefile 
TARGET=helloworld
all:
    as -32 -o ${TARGET}-asm.o ${TARGET}-asm.s
    ld.bfd -melf_i386 -o ${TARGET}-asm-bfd ${TARGET}-asm.o
    ld.gold -melf_i386 -o ${TARGET}-asm-gold ${TARGET}-asm.o
    rm ${TARGET}-asm.o

$ make -q
$ ls -l
total 68
-rw-r--r-- 1 eso eso   200 Dec  1 13:57 Makefile
-rwxrwxr-x 1 eso eso  8700 Dec  1 13:57 helloworld-asm-bfd
-rwxrwxr-x 1 eso eso   732 Dec  1 13:57 helloworld-asm-gold
-rw-r--r-- 1 eso eso   498 Dec  1 13:44 helloworld-asm.s

Maybe I just used gold previously without being aware of it.

Recommended answer

It's not 10x in general. It's page alignment of a couple of sections, as Jester says, following changes to ld's default linker script made for security reasons:

  • First change: Making sure data from .data isn't present in any mapping of .text, so none of that static data is available for ROP / Spectre gadgets in an executable page. (In older ld, the program headers mapped the same disk block twice: once into the executable mapping, and also into a RW-without-exec segment for the actual .data section. The executable mapping was still read-only.)

  • More recent change: Separate .rodata from .text into separate segments, again so static data isn't mapped into an executable page. Previously, const char code[] = {...} could be cast to a function pointer and called without needing mprotect, gcc -z execstack, or other tricks, if you wanted to test shellcode that way. (A separate Linux kernel change made -z execstack apply only to the actual stack, not READ_IMPLIES_EXEC.)

See Why an ELF executable could have 4 LOAD segments? for this history, including the strange fact that .rodata is in a separate segment from the read-only mapping used for access to the ELF metadata.
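To see how a binary's segments are split, one can list its PT_LOAD program headers and their permissions. The Python sketch below does that for a 32-bit little-endian ELF image. The function name load_segments is hypothetical, and the demo image is synthetic: its three segments (headers, code, data) and their sizes are assumptions modelled on the hello world program, not values read from a real file.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header, 52 bytes
# p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
PHDR = struct.Struct("<IIIIIIII")
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def load_segments(image):
    """Return (file offset, file size, permissions) for each PT_LOAD entry."""
    fields = EHDR.unpack_from(image)
    phoff, phentsize, phnum = fields[5], fields[9], fields[10]
    segments = []
    for i in range(phnum):
        (p_type, p_offset, _vaddr, _paddr, p_filesz, _memsz, p_flags,
         _align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            perms = (("R" if p_flags & PF_R else "-")
                     + ("W" if p_flags & PF_W else "-")
                     + ("X" if p_flags & PF_X else "-"))
            segments.append((p_offset, p_filesz, perms))
    return segments


# Synthetic image: all offsets, sizes and addresses below are assumed values.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
ehdr = EHDR.pack(ident, 2, 3, 1, 0x08049000, 52, 0, 0, 52, 32, 3, 40, 0, 0)
phdrs = b"".join([
    PHDR.pack(PT_LOAD, 0x0000, 0x08048000, 0x08048000, 148, 148, PF_R, 0x1000),
    PHDR.pack(PT_LOAD, 0x1000, 0x08049000, 0x08049000, 34, 34, PF_R | PF_X, 0x1000),
    PHDR.pack(PT_LOAD, 0x2000, 0x0804A000, 0x0804A000, 14, 14, PF_R | PF_W, 0x1000),
])
segments = load_segments(ehdr + phdrs)
print(segments)
```

On a real file, readelf -l reports the same information; the sketch only shows which header fields it comes from.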

That extra space is just 00 padding and will compress well in a .tar.gz or whatever.

So it has a worst-case upper bound of about 2x 4k extra pages of padding, and tiny executables are close to that worst case.
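The arithmetic behind that bound can be sketched with a simplified model in which each chunk of the file starts on a 4096-byte boundary. This is a sketch under assumptions, not ld's actual layout algorithm. The 34-byte code size and 14-byte data size follow from the assembly in the question, while the 148-byte header chunk (one ELF header plus three program headers) is an assumed value. A real file also carries symbol tables, string tables and section headers after the last section, which this model leaves out.

```python
PAGE = 4096


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def file_size(chunks, alignment):
    """Lay chunks out in order, starting each at a multiple of alignment."""
    end = 0
    for size in chunks:
        end = align_up(end, alignment) + size
    return end


# Headers (assumed 148 bytes), .text (34 bytes), .data (14 bytes).
chunks = [148, 34, 14]
packed = file_size(chunks, 1)     # no alignment: chunks sit back to back
paged = file_size(chunks, PAGE)   # each chunk starts on its own page
padding = paged - packed
print(packed, paged, padding)
```

With three chunks there are two page boundaries to pad up to, so the padding stays below 2 * 4096 bytes, and with chunks this small it comes close to that limit.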

gcc -Wl,--nmagic will turn off page alignment of sections if you want that for some reason (see the ld(1) man page). I don't know why that doesn't pack everything down to the old size. Perhaps checking the default linker script would shed some light, but it's pretty long. Run ld --verbose to see it.

Stripping won't help with padding that's part of a section; I think it can only remove whole sections.

ld -z noseparate-code uses the old layout, with only 2 segments in total: one covering the .text and .rodata sections, and one covering the .data and .bss sections. (And the ELF metadata that dynamic linking wants access to.)

This question is about ld, but note that gcc -nostdlib also used to default to making a static executable. Modern Linux distros, however, configure GCC with -pie as the default, and GCC won't make a static-pie by default even if no shared libraries are being linked. That's unlike -no-pie mode, where it will simply make a static executable in that case. (A static-pie still needs startup code to apply relocations for any absolute addresses.)

So the equivalent of running ld directly is gcc -nostdlib -static (which implies -no-pie). Or gcc -nostdlib -no-pie should let it default to -static when there are no shared libs being linked. You can combine this with -Wl,--nmagic and/or -Wl,-z -Wl,noseparate-code.

Related:

  • A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux - eventually making a 45 byte executable, with the machine code for an _exit syscall stuffed into the ELF program header itself.

  • FASM can make quite small executables, using its mode where it outputs a static executable (not an object file) directly, with no ELF section metadata, just program headers. (It's a pain to debug with GDB or disassemble with objdump; most tools assume there will be section headers, even though they're not needed to run static executables.)

  • What is a reasonable minimum number of assembly instructions for a small C program including setup?

  • What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd? (static vs. static-pie vs. (dynamic) PIE that happens to have no shared libraries.)
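As an illustration of the "program headers only, no section headers" layout mentioned for FASM above, here is a minimal Python sketch that assembles such an image by hand. It is not FASM's output and has not been run as a binary here. The function name tiny_elf and the load address 0x08048000 are assumptions, and the payload is just an exit(0) system call rather than the hello world program.

```python
import struct

BASE = 0x08048000  # assumed load address; any page-aligned address should do
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header, 52 bytes
PHDR = struct.Struct("<IIIIIIII")          # ELF32 program header, 32 bytes

# i386 machine code for exit(0) via int 0x80.
CODE = bytes([
    0xB8, 0x01, 0x00, 0x00, 0x00,  # movl $1, %eax   (exit syscall number)
    0xBB, 0x00, 0x00, 0x00, 0x00,  # movl $0, %ebx   (return code)
    0xCD, 0x80,                    # int  $0x80
])


def tiny_elf(code):
    """Return an ELF32 image: file header, one PT_LOAD header, then code."""
    code_off = EHDR.size + PHDR.size
    total = code_off + len(code)
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # ELFCLASS32, LSB, v1
    ehdr = EHDR.pack(
        ident, 2, 3, 1,          # ET_EXEC, EM_386, EV_CURRENT
        BASE + code_off,         # e_entry: first byte of code
        EHDR.size, 0, 0,         # e_phoff, e_shoff = 0 (no sections), e_flags
        EHDR.size, PHDR.size, 1, # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,                 # e_shentsize, e_shnum, e_shstrndx
    )
    # One read+execute segment mapping the whole file at BASE.
    phdr = PHDR.pack(1, 0, BASE, BASE, total, total, 5, 0x1000)
    return ehdr + phdr + code


image = tiny_elf(CODE)
print(len(image))
```

The whole image is the 52-byte header, one 32-byte program header and 12 bytes of code. Writing it to a file and marking it executable should give a loadable binary on a Linux system with 32-bit support, though that step is left out of the sketch.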
