How do I add contents of text file as a section in an ELF file?

Question

I have a NASM assembly file that I am assembling and linking (on Intel-64 Linux).

There is a text file, and I want the contents of the text file to appear in the resulting binary (as a string, basically). The binary is an ELF executable.

My plan is to create a new readonly data section in the ELF file (equivalent to the conventional .rodata section).

Ideally, there would be a tool to add a file verbatim as a new section in an ELF file, or a linker option to include a file verbatim.

Is this possible?

Recommended answer

This is possible and most easily done using OBJCOPY found in BINUTILS. You effectively take the data file as binary input and then output it to an object file format that can be linked to your program.

OBJCOPY will even produce a start and end symbol as well as the size of the data area so that you can reference them in your code. The basic idea is to tell it that your input file is binary (even if it is text), that you are targeting an x86-64 object file, and what the input and output file names are.

Assume we have an input file called myfile.txt with the contents:

the
quick
brown
fox
jumps
over
the
lazy
dog

Something like this would be a starting point:

objcopy --input binary \
    --output elf64-x86-64 \
    --binary-architecture i386:x86-64 \
    myfile.txt myfile.o

If you wanted to generate 32-bit objects you could use:

objcopy --input binary \
    --output elf32-i386 \
    --binary-architecture i386 \
    myfile.txt myfile.o

The output would be an object file called myfile.o. If we were to review the headers of the object file using OBJDUMP and a command like objdump -x myfile.o, we would see something like this:

myfile.o:     file format elf64-x86-64
myfile.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .data         0000002c  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 g       .data  0000000000000000 _binary_myfile_txt_start
000000000000002c g       .data  0000000000000000 _binary_myfile_txt_end
000000000000002c g       *ABS*  0000000000000000 _binary_myfile_txt_size

By default it creates a .data section with contents of the file and it creates a number of symbols that can be used to reference the data.

_binary_myfile_txt_start
_binary_myfile_txt_end
_binary_myfile_txt_size

These are effectively the address of the start byte, the address just past the last byte, and the size of the data that was placed into the object from the file myfile.txt. OBJCOPY bases the symbol names on the input file name: myfile.txt is mangled into myfile_txt and used to create the symbols.
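
The mangling can be sketched in C. This is only an illustration, assuming that every character of the input file name (as given on the command line, directories included) that is not a letter or digit is replaced by an underscore. The helper name binary_symbol_name is hypothetical and is not part of BINUTILS.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of binutils): build the symbol name that
 * OBJCOPY is expected to generate for `input_name` and a given suffix
 * ("start", "end" or "size"). Assumption: every character of the file
 * name that is not a letter or digit becomes an underscore.
 * Returns 0 on success, -1 if `out` is too small. */
int binary_symbol_name(const char *input_name, const char *suffix,
                       char *out, size_t out_len)
{
    assert(input_name != NULL && suffix != NULL && out != NULL);

    int n = snprintf(out, out_len, "_binary_%s_%s", input_name, suffix);
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    /* Mangle only the file-name part, which starts after "_binary_". */
    size_t prefix_len = strlen("_binary_");
    size_t name_len = strlen(input_name);
    for (size_t i = prefix_len; i < prefix_len + name_len; i++) {
        if (!isalnum((unsigned char)out[i]))
            out[i] = '_';
    }
    return 0;
}
```

Calling binary_symbol_name("myfile.txt", "start", buf, sizeof buf) fills buf with _binary_myfile_txt_start, which matches the symbol table shown earlier. It is worth checking the real names with objdump -x rather than relying on a helper like this.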

One problem is that a .data section is created which is read/write/data as seen here:

Idx Name          Size      VMA               LMA               File off  Algn
  0 .data         0000002c  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, DATA

You specifically are requesting a .rodata section that would also have the READONLY flag specified. You can use the --rename-section option to change .data to .rodata and specify the needed flags. You could add this to the command line:

--rename-section .data=.rodata,CONTENTS,ALLOC,LOAD,READONLY,DATA

Of course if you want to call the section something other than .rodata with the same flags as a read only section you can change .rodata in the line above to the name you want to use for the section.

The final version of the command that should generate the type of object you want is:

objcopy --input binary \
    --output elf64-x86-64 \
    --binary-architecture i386:x86-64 \
    --rename-section .data=.rodata,CONTENTS,ALLOC,LOAD,READONLY,DATA \
    myfile.txt myfile.o

Now that you have an object file, how can you use it in C code (as an example)? The symbols generated are a bit unusual, and there is a reasonable explanation on the OS Dev Wiki:

A common problem is getting garbage data when trying to use a value defined in a linker script. This is usually because they're dereferencing the symbol. A symbol defined in a linker script (e.g. _ebss = .;) is only a symbol, not a variable. If you access the symbol using extern uint32_t _ebss; and then try to use _ebss the code will try to read a 32-bit integer from the address indicated by _ebss.

The solution to this is to take the address of _ebss either by using it as &_ebss or by defining it as an unsized array (extern char _ebss[];) and casting to an integer. (The array notation prevents accidental reads from _ebss as arrays must be explicitly dereferenced)

Keeping this in mind we could create this C file called main.c:

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

/* These are external references to the symbols created by OBJCOPY */
extern char _binary_myfile_txt_start[];
extern char _binary_myfile_txt_end[];
extern char _binary_myfile_txt_size[];

int main()
{
    char *data_start     = _binary_myfile_txt_start;
    char *data_end       = _binary_myfile_txt_end;
    size_t data_size  = (size_t)_binary_myfile_txt_size;

    /* Print out the pointers and size */
    printf ("data_start %p\n", (void *)data_start);
    printf ("data_end   %p\n", (void *)data_end);
    printf ("data_size  %zu\n", data_size);

    /* Print out each byte until we reach the end */
    while (data_start < data_end)
        printf ("%c", *data_start++);

    return 0;
}

You can compile and link with:

gcc -O3 main.c myfile.o

The output should look something like:

data_start 0x4006a2
data_end   0x4006ce
data_size  44
the
quick
brown
fox
jumps
over
the
lazy
dog
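
Two details of the C example may be worth a small sketch. First, the embedded bytes are copied verbatim, so there is no terminating NUL and the data cannot be handed directly to functions that expect a C string. Second, the size can also be derived as end minus start, which avoids depending on the absolute _size symbol; on toolchains that build position-independent executables by default, that symbol may not yield the expected value. The helper names blob_size and blob_to_cstring below are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Size of an embedded blob, derived from its start address and its
 * one-past-the-end address (the values of the _start and _end symbols). */
size_t blob_size(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && end >= start);
    return (size_t)(end - start);
}

/* Hypothetical helper: copy the blob into a freshly allocated,
 * NUL-terminated string. The caller must free() the result.
 * Returns NULL if the allocation fails. */
char *blob_to_cstring(const char *start, const char *end)
{
    size_t len = blob_size(start, end);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}
```

In main.c this would be used as blob_size(_binary_myfile_txt_start, _binary_myfile_txt_end), which should report 44 for the example file, and blob_to_cstring with the same two arguments when a NUL-terminated copy is needed.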

A NASM example of usage is similar in nature to the C code. The following assembly program called nmain.asm writes the same string to standard output using Linux x86-64 System Calls:

bits 64
global _start

extern _binary_myfile_txt_start
extern _binary_myfile_txt_end
extern _binary_myfile_txt_size

section .text

_start:
    mov eax, 1                        ; SYS_Write system call
    mov edi, eax                      ; Standard output FD = 1
    mov rsi, _binary_myfile_txt_start ; Address to start of string
    mov rdx, _binary_myfile_txt_size  ; Length of string
    syscall

    xor edi, edi                      ; Return value = 0
    mov eax, 60                       ; SYS_Exit system call
    syscall

This can be assembled and linked with:

nasm -f elf64 -o nmain.o nmain.asm
gcc -m64 -nostdlib nmain.o myfile.o

The output should appear as:

the
quick
brown
fox
jumps
over
the
lazy
dog
