Call C functions from 64-bit assembly


Problem description

On Ubuntu 16.04

$ cat hola.asm

    extern puts
    global main

    section .text
main:
    mov rdi,message
    call puts
    ret

message:
    db  "Hola",0

$ nasm -f elf64 hola.asm  
$ gcc hola.o

/usr/bin/ld: hola.o: relocation R_X86_64_PC32 against symbol `puts@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status

Using:

$ gcc -fPIC hola.o -o hola && ./hola
Hola

Documentation:

-fPIC If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC.

Position-independent code requires special support, and therefore works only on certain machines. When this flag is set, the macros __pic__ and __PIC__ are defined to 2.

The -static option with gcc works:

Use -static to completely avoid external calls to dynamic libraries

$ nasm -f elf64 -l hola.lst hola.asm && gcc -m64 -static -o hola hola.o && ./hola
Hola

Also:

$ nasm -f elf64 hola.asm && gcc -static -o hola hola.o && ./hola
Hola

Including wrt ..plt also worked

    global main
    extern puts

    section .text
main:
    mov rdi,message
    call puts wrt ..plt
    ret
message:
    db "Hola", 0

$ nasm -f elf64 hola.asm
$ gcc -m64 -o hola hola.o && ./hola
Hola

From the ..plt description:

..plt Referring to a procedure name using wrt ..plt causes the linker to build a procedure linkage table entry for the symbol, and the reference gives the address of the PLT entry. You can only use this in contexts which would generate a PC-relative relocation normally (i.e. as the destination for CALL or JMP), since ELF contains no relocation type to refer to PLT entries absolutely.

Recommended answer

I wrote this program to do the same as the hi.c program, without the C library call. I then followed a suggestion to use gcc's -S option on hi.c and dissect the resulting hi.s program.

$ cat hiasm.asm

section .text
    global _start

_start:

    mov     dl, 5
    mov     esi, msg
    xor     di,di
    xor     al,al
    inc     di
    inc     al
    syscall

    xor     rdi,rdi
    mov     al,60
    syscall

msg:    db "Hello"

$ nasm -f elf64 hiasm.asm && ld -m elf_x86_64 hiasm.o -o hiasm && ./hiasm

Hello

$ echo $?

0

This works well.
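For comparison, the write system call that hiasm.asm issues by hand can be sketched in C. This is only an illustration, assuming Linux and glibc's generic syscall() wrapper. The names write_hello and msg are hypothetical and were introduced here; they are not part of the original program.

```c
#define _GNU_SOURCE        /* makes syscall() visible even under strict -std modes */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* The same text the assembly stores at msg; sizeof also counts the trailing NUL. */
static const char msg[] = "Hello";

/*
 * Issue write(2) through the generic syscall() wrapper.
 * On x86-64 Linux this corresponds to the registers the assembly sets by hand:
 * rax = SYS_write (1), rdi = fd, rsi = buffer address, rdx = length.
 * Returns the number of bytes written, or -1 on error.
 */
long write_hello(int fd)
{
    assert(fd >= 0);  /* precondition: caller passes a plausible descriptor */
    return syscall(SYS_write, fd, msg, sizeof msg - 1);  /* 5 bytes, NUL excluded */
}
```

Calling write_hello(1) should write the five bytes of Hello to standard output, the same byte count the assembly loads into dl.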

again, here's the simple hi.c

$ cat hi.c

#include <stdio.h>

int main(void)
{
    puts("Hello");
    return 0;
}

$ gcc -S hi.c && cat hi.s

    .file   "hi.c"
    .section    .rodata
.LC0:
    .string "Hello"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rdi
    call    puts@PLT
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

$ gcc hi.s -o hi && ./hi

Hello

The labels .LFB0 and .LFE0 do not appear to be referenced within the .s file. After removing both, the file still works as expected. From the 'as' assembler docs:

https://sourceware.org/binutils/docs/as/index.html

Local symbols are defined and used within the assembler, but they are normally not saved in object files. Thus, they are not visible when debugging. You may use the `-L' option (see Include Local Symbols) to retain the local symbols in the object files.

So as a pure executable with no need for bells and whistles, they can be chopped.

So I got rid of the easy ones.

Next, the function is named main. There's not much use for that here, so I'll call it _start.

For ELF targets, the .size directive is used like this:

 .size name , expression

This directive sets the size associated with a symbol name. The size in bytes is computed from expression which can make use of label arithmetic. This directive is typically used to set the size of function symbols.

I don't need function symbol sizes, so I got rid of the .size at the bottom that references main.

$ cat hi.s

        .file   "hi.c"          ##tells 'as' that we are about to start a new logical file
        .section    .rodata     ##assembles the following code into section '.rodata'
    .LC0:                   ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                    ##  are guaranteed to be unique over the source code
                    ##  that allow the compiler to use names/simple notation
                    ##  to reference sections of code
                    ##But here, only .LC0 is actually referenced in the code

    .string "Hello"         ##
    .text
    .globl  _start
_start:
    .cfi_startproc          ##used at the beginning of each function that should have an
                    ##entry in .eh_frame. It initializes some internal data
                    ##structures. Don't forget to close by .cfi_endproc
    pushq   %rbp            ##push base pointer onto stack

    .cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                    ##same, but offset is new. Note that it is the absolute
                    ##offset that will be added to a defined register to
                    ##compute CFA address
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    leaq    .LC0(%rip), %rdi
    call    puts@PLT
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc            ##close of .cfi_startproc

    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

Trying:

$ gcc -o hi hi.s

/tmp/ccLxG1jh.o: In function `_start':
hi.c:(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

$ ldd hi

linux-vdso.so.1 (0x00007fffb6569000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe7456e7000)
/lib64/ld-linux-x86-64.so.2 (0x000055edc8bc8000)

It's definitely using libc, which explains our multiple definitions of _start. So I'll try getting rid of the standard library with gcc's -nostdlib option

$ gcc -nostdlib -o hi hi.s

/tmp/ccV5QYaT.o: In function `_start':
hi.c:(.text+0xc): undefined reference to `puts'
collect2: error: ld returned 1 exit status

Right, puts still needs the C library, so I'm getting rid of puts

.file   "hi.c"          ##tells 'as' that we are about to start a new logical file
.section    .rodata     ##assembles the following code into section '.rodata'
.LC0:                   ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                    ##  are guaranteed to be unique over the source code
                    ##  that allow the compiler to use names/simple notation
                    ##  to reference sections of code
                ##But here, only .LC0 is actually referenced in the code

.string "Hello"         ##
.text
.globl  _start
_start:
    .cfi_startproc          ##used at the beginning of each function that should have an
                    ##entry in .eh_frame. It initializes some internal data
                    ##structures. Don't forget to close by .cfi_endproc
    pushq   %rbp            ##push base pointer onto stack

.cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                ##same, but offset is new. Note that it is the absolute
                ##offset that will be added to a defined register to
                ##compute CFA address
.cfi_offset 6, -16
movq    %rsp, %rbp
.cfi_def_cfa_register 6
leaq    .LC0(%rip), %rsi     ##this reg value and others were changed for write call
movq    $1, %rax
movq    $1, %rdi
movq    $5, %rdx
syscall

movl    $0, %eax
popq    %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc            ##close of .cfi_startproc

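$ gcc -nostdlib -o hi hi.s && ./hi

HelloSegmentation fault

Promising: the write works. The crash comes from the final ret, because _start is entered with no return address on the stack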

.file   "hi.c"          ##tells 'as' that we are about to start a new logical file
.section    .rodata     ##assembles the following code into section '.rodata'
.LC0:                   ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                    ##  are guaranteed to be unique over the source code
                    ##  that allow the compiler to use names/simple notation
                    ##  to reference sections of code
                    ##But here, only .LC0 is actually referenced in the code

.string "Hello"         
.text
.globl  _start
_start:
    .cfi_startproc          ##used at the beginning of each function that should have an
                    ##entry in .eh_frame. It initializes some internal data
                    ##structures. Don't forget to close by .cfi_endproc

##deleted the base pointer push and pops from stack, don't need stack

.cfi_def_cfa_offset 16      ##modifies a rule for computing CFA. Register remains the
                ##same, but offset is new. Note that it is the absolute
                ##offset that will be added to a defined register to
                ##compute CFA address
.cfi_offset 6, -16
movq    %rsp, %rbp
.cfi_def_cfa_register 6
leaq    .LC0(%rip), %rsi
movq    $1, %rax
movq    $1, %rdi
movq    $5, %rdx
syscall

xor %rdi,%rdi   
mov $60, %rax
.cfi_def_cfa 7, 8
syscall
.cfi_endproc            ##close of .cfi_startproc

$ gcc -g -nostdlib -o hi hi.s && ./hi

Hello

Got it! Trying to figure out what a CFA is: http://dwarfstd.org/doc/DWARF4.pdf Section 6.4

An area of memory that is allocated on a stack called a "call frame." The call frame is identified by an address on the stack. We refer to this address as the Canonical Frame Address or CFA. Typically, the CFA is defined to be the value of the stack pointer at the call site in the previous frame (which may be different from its value on entry to the current frame)

So all that .cfi_def_cfa_offset, .cfi_offset and .cfi_def_cfa_register do is describe how to compute the CFA and where registers were saved on the stack; they emit unwind information, not instructions. This program doesn't need the stack at all, so I might as well delete those too

$ cat hi.s

.file   "hi.c"          ##tells 'as' that we are about to start a new logical file
    .section    .rodata     ##assembles the following code into section '.rodata'
.LC0:                   ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
                    ##  are guaranteed to be unique over the source code
                    ##  that allow the compiler to use names/simple notation
                    ##  to reference sections of code
                    ##But here, only .LC0 is actually referenced in the code

.string "Hello"         
.text


.globl  _start
_start:
    .cfi_startproc          ##used at the beginning of each function that should have an
                    ##entry in .eh_frame. It initializes some internal data
                    ##structures. Don't forget to close by .cfi_endproc
    leaq    .LC0(%rip), %rsi
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

xor %rdi,%rdi   
mov $60, %rax
syscall
.cfi_endproc            ##close of .cfi_startproc

.cfi_startproc:

Used at the beginning of each function that should have an entry in the .eh_frame

What is eh_frame? "When using languages that support exceptions, such as C++, additional information must be provided to the runtime environment that describes the call frames that must be unwound during the processing of an exception. This information is contained in the special sections .eh_frame and .eh_framehdr."

I don't need exception handling and am not using C++, so the remaining directives can go too

$ cat hi.s

.section    .rodata     
.LC0:                   

.string "Hello"         

.text
.globl  _start
_start:
    leaq    .LC0(%rip), %rsi
    movq    $1, %rax
    movq    $1, %rdi
    movq    $5, %rdx
    syscall

xor %rdi,%rdi   
mov $60, %rax
syscall
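
The last three instructions of the final listing take the place of ret by invoking the exit system call. A rough C equivalent is sketched below, assuming Linux and glibc's syscall() wrapper. The names exit_via_syscall and child_exit_status are hypothetical helpers used only for illustration.

```c
#define _GNU_SOURCE        /* makes syscall() visible even under strict -std modes */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* syscall(), fork() */

/*
 * Terminate the calling thread (the whole process when it is single-threaded),
 * as the final listing does: on x86-64 Linux this is rax = SYS_exit (60),
 * rdi = status, then syscall. No libc cleanup or stdio flushing happens.
 */
void exit_via_syscall(int status)
{
    syscall(SYS_exit, status);
}

/*
 * Illustration only: run exit_via_syscall() in a child process and return
 * the exit status the parent observes, or -1 if something went wrong.
 */
int child_exit_status(int status)
{
    assert(status >= 0 && status <= 255);  /* only the low 8 bits reach the parent */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit_via_syscall(status);          /* does not return in the child */

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

The status passed in rdi is what the shell later reports through echo $?, which is why the assembly zeroes rdi before the second syscall.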
