Writing a Linux int 80h system-call wrapper in GNU C inline assembly


Question

I'm trying to use inline assembly. I read this page, http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx, but I don't understand how parameters are passed to my function.

I'm writing a write example in C. This is my function header:

write2(char *str, int len){
}

And this is my assembly code:

global write2
write2:
    push ebp
    mov ebp, esp
    mov eax, 4      ;sys_write
    mov ebx, 1      ;stdout
    mov ecx, [ebp+8]    ;string pointer
    mov edx, [ebp+12]   ;string size
    int 0x80        ;syscall
    leave
    ret

What do I have to do to turn that code into a C function? I'm doing something like this:

write2(char *str, int len){
    asm ( "movl 4, %%eax;"
          "movl 1, %%ebx;"
          "mov %1, %%ecx;"
          //"mov %2, %%edx;"
          "int 0x80;"
           :
           : "a" (str), "b" (len)
    );
}

That's because I don't have an output variable, so how do I handle that? Also, with this code:

global main
main:
    mov ebx, 5866       ;PID
    mov ecx, 9      ;SIGKILL
    mov eax, 37     ;sys_kill
    int 0x80        ;interruption
    ret 

How can I put that code inline in my C code, so that I can ask the user for the pid? This is my attempt so far:

void killp(int pid){
    asm ( "mov %1, %%ebx;"
          "mov 9, %%ecx;"
          "mov 37, %%eax;"
           :
           : "a" (pid)         /* optional */
    );
}

Answer

Well, you don't say specifically, but by your post, it appears like you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gets used with gcc.

So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one, so if you create one in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access vars with "[ebp + offset]" as you don't know how it's being laid out.

That's what the constraints are for: you say what kind of place you want gcc to put the value (any register, memory, a specific register) and then refer to it as "%0", "%1", and so on in the asm code, numbered in the order the operands are listed. Finally, if you use explicit registers in the asm code, you need to list them in the 3rd section (the clobber list, after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and you'd clobber that value.

You also need to tell the compiler that inline asm will or might read from or write to memory pointed-to by the input operands; that is not implied.
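To make the outputs / inputs / clobbers layout concrete before applying it to a system call, here is a minimal sketch on a plain add instruction. The name add_ints is made up for illustration, and the non-x86 branch is only there so the snippet builds on other targets.

```c
/* Minimal sketch of the three-part operand list; add_ints is a made-up name. */
static int add_ints(int a, int b) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"   /* AT&T order: source first, destination second */
        : "+r" (a)           /* outputs:  %0, read and written, any register */
        : "r" (b)            /* inputs:   %1, read only, any register */
        : "cc");             /* clobbers: the add changes the flags */
#else
    a += b;                  /* plain C fallback so the sketch builds elsewhere */
#endif
    return a;
}
```

No register is named in the template, so the compiler is free to pick whichever registers suit the surrounding code.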

So with all that, your write2 function looks like:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"      // SYS_write
        "movl $1, %%ebx;"      // file descriptor = stdout_fd
        "movl %0, %%ecx;"
        "movl %1, %%edx;"
        "int $0x80"
        :: "g" (str), "g" (len)       // input values we MOV from
        : "eax", "ebx", "ecx", "edx", // registers we destroy
          "memory"                    // memory has to be in sync so we can read it
     );
}

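Note the AT&T syntax: operands are written src, dest rather than dest, src, register names take a % prefix (doubled to %% in a template that has operands), and immediate constants take a $ prefix. Without the $, "movl 4, %%eax" loads from memory address 4 instead of loading the constant 4.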

Now this will work, but it's inefficient as it will contain lots of extra movs. In general, you should NEVER use mov instructions or explicit registers in asm code, as you're much better off using constraints to say where you want things and letting the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"
        "movl $1, %%ebx;"
        "int $0x80"
        :: "c" (str), /* c constraint tells the compiler to put str in ecx */
           "d" (len)  /* d constraint tells the compiler to put len in edx */
        : "eax", "ebx", "memory");
}

or even better

// UNSAFE: destroys EAX (with return value) without telling the compiler
void write2(char *str, int len) {
    __asm__ volatile ("int $0x80"
        :: "a" (4), "b" (1), "c" (str), "d" (len)
        : "memory");
}

Note also the use of volatile which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)

edit

One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:

int write2(const char *str, int len) {
    __asm__ volatile ("int $0x80" 
     : "=a" (len)
     : "a" (4), "b" (1), "c" (str), "d" (len),
       "m"( *(const char (*)[])str )       // "dummy" input instead of memory clobber
     );
    return len;
}

All system calls return in EAX. Values from -4095 to -1 (inclusive) are negative errno codes, other values are non-errors. (This applies globally to all Linux system calls).
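As a rough illustration of that range check, a caller of a raw wrapper could classify the return value like this. The helper names are hypothetical; only the -4095 to -1 range comes from the convention described above.

```c
#include <stdbool.h>

/* Hypothetical helpers: the names are illustrative, only the
   -4095..-1 range comes from the kernel's return convention. */
static bool raw_ret_is_error(long ret) {
    return ret >= -4095 && ret <= -1;
}

/* Returns the positive errno code, or 0 when ret is not an error. */
static int raw_ret_errno(long ret) {
    return raw_ret_is_error(ret) ? (int)-ret : 0;
}
```

For example, a raw return of -9 would decode to errno 9 (EBADF), while a return of 5 from write2 simply means five bytes were written.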

If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, and might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
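A generic three-argument wrapper along those lines might look like the sketch below. The name syscall3 is hypothetical (it is not a libc function), and the code assumes the 32-bit int 0x80 ABI; on any other target it deliberately returns -ENOSYS instead of executing the instruction.

```c
#include <errno.h>

/* Sketch of a generic 3-argument wrapper for the i386 int 0x80 ABI.
   syscall3 is a hypothetical name, not part of libc. */
static inline long syscall3(long nr, long a1, long a2, long a3) {
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                     /* result comes back in EAX */
        : "a" (nr), "b" (a1), "c" (a2), "d" (a3)
        : "memory");                     /* any argument may be a pointer */
    return ret;
#else
    (void)nr; (void)a1; (void)a2; (void)a3;
    return -ENOSYS;                      /* this sketch targets i386 only */
#endif
}
```

The "memory" clobber is the conservative choice here because the wrapper cannot know which of its arguments point at buffers the kernel will read or write.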

With this output operand, you need the explicit volatile -- exactly one write system call per time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that it exists only to compute its return value, and can eliminate repeated calls with the same input instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
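The same pattern carries over to the kill example from the question. The sketch below is an assumption-laden adaptation rather than part of the original answer: the name send_signal and the extra sig parameter are illustrative, the syscall number 37 is taken from the question's assembly, and non-i386 builds return -ENOSYS without issuing the call.

```c
#include <errno.h>

#define NR_KILL_I386 37   /* sys_kill number from the question (i386 ABI) */

/* Hypothetical wrapper: returns 0 on success or a negative errno code. */
static int send_signal(int pid, int sig) {
#if defined(__i386__)
    int ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                          /* result in EAX */
        : "a" (NR_KILL_I386), "b" (pid), "c" (sig)
        : "memory");                          /* conservative; kill takes no pointers */
    return ret;
#else
    (void)pid; (void)sig;
    return -ENOSYS;                           /* this sketch targets i386 only */
#endif
}
```

Calling send_signal(pid, 9) would correspond to the question's killp(pid), since 9 is SIGKILL; passing 0 as the signal only checks whether the process can be signalled.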
