Writing a Linux int 80h system-call wrapper in GNU C inline assembly

Question

I'm trying to use inline assembly. I read this page, http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx, but I can't understand how the parameters are passed to my function.

I'm writing a C example that calls write. This is my function header:

write2(char *str, int len){
}

And this is my assembly code:

global write2
write2:
    push ebp
    mov ebp, esp
    mov eax, 4      ;sys_write
    mov ebx, 1      ;stdout
    mov ecx, [ebp+8]    ;string pointer
    mov edx, [ebp+12]   ;string size
    int 0x80        ;syscall
    leave
    ret

What do I have to do to pass that code to the C function? I'm doing something like this:

write2(char *str, int len){
    asm ( "movl 4, %%eax;"
          "movl 1, %%ebx;"
          "mov %1, %%ecx;"
          //"mov %2, %%edx;"
          "int 0x80;"
           :
           : "a" (str), "b" (len)
    );
}

That's because I don't have an output variable, so how do I handle that? Also, with this code:

global main
main:
    mov ebx, 5866       ;PID
    mov ecx, 9      ;SIGKILL
    mov eax, 37     ;sys_kill
    int 0x80        ;interruption
    ret 

How can I put that code inline in my C code, so that I can ask the user for the pid? This is my draft so far:

void killp(int pid){
    asm ( "mov %1, %%ebx;"
          "mov 9, %%ecx;"
          "mov 37, %%eax;"
           :
           : "a" (pid)         /* optional */
    );
}

Solution

Well, you don't say specifically, but by your post, it appears like you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gets used with gcc.

So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one, so if you create one in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access vars with "[ebp + offset]" as you don't know how it's being laid out.

That's what the constraints are for -- you say what kind of place you want gcc to put the value (any register, memory, specific register) and then use "%X" in the asm code, where X is the operand's number (%0, %1, ...). Finally, if you use explicit registers in the asm code, you need to list them in the 3rd section (the clobber list, after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and you'd clobber that value.

You also need to tell the compiler that inline asm will or might read from or write to memory pointed-to by the input operands; that is not implied.

So with all that, your write2 function looks like:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"      // SYS_write
        "movl $1, %%ebx;"      // file descriptor = stdout_fd
        "movl %0, %%ecx;"
        "movl %1, %%edx;"
        "int $0x80"
        :: "g" (str), "g" (len)       // input values we MOV from
        : "eax", "ebx", "ecx", "edx", // registers we destroy
          "memory"                    // memory has to be in sync so we can read it
     );
}

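Note the AT&T syntax: operands are written src, dest rather than dest, src; register names take a % prefix (doubled to %% inside an asm template that has operands); and immediate constants take a $ prefix. Without the $, "movl 4, %eax" loads from memory address 4 instead of loading the constant 4, which is one of the bugs in your attempt.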
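As a small illustration of those syntax rules, separate from the system call itself, here is a sketch that adds a constant to a C variable. The function name add_five is made up for this example, and the plain-C branch is only there (as an assumption) so that the code still builds on non-x86 targets.

```c
/* Hypothetical example function, not part of the original answer. */
static int add_five(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    /* AT&T order: source first, destination last.
     * $5 is an immediate; %0 is operand 0, a register chosen by the compiler.
     * "+r" marks x as both read and written; "cc" says the flags change. */
    __asm__ ("addl $5, %0" : "+r" (x) : : "cc");
#else
    x += 5;   /* assumption: portable fallback for other architectures */
#endif
    return x;
}
```

No explicit register appears in the template, so the compiler is free to pick whichever register is convenient.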

Now this will work, but it's inefficient as it will contain lots of extra movs. In general, you should NEVER use mov instructions or explicit registers in asm code, as you're much better off using constraints to say where you want things and let the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"
        "movl $1, %%ebx;"
        "int $0x80"
        :: "c" (str), /* c constraint tells the compiler to put str in ecx */
           "d" (len)  /* d constraint tells the compiler to put len in edx */
        : "eax", "ebx", "memory");
}

or even better

// UNSAFE: destroys EAX (with return value) without telling the compiler
void write2(char *str, int len) {
    __asm__ volatile ("int $0x80"
        :: "a" (4), "b" (1), "c" (str), "d" (len)
        : "memory");
}

Note also the use of volatile which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)

edit

One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:

int write2(const char *str, int len) {
    __asm__ volatile ("int $0x80" 
     : "=a" (len)
     : "a" (4), "b" (1), "c" (str), "d" (len),
       "m"( *(const char (*)[])str )       // "dummy" input instead of memory clobber
     );
    return len;
}

All system calls return in EAX. Values from -4095 to -1 (inclusive) are negative errno codes, other values are non-errors. (This applies globally to all Linux system calls).
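A minimal sketch of how a caller might apply that rule, turning the raw EAX value into the familiar "-1 plus errno" convention. The helper name syscall_result is hypothetical and not part of the answer; only the -4095..-1 range comes from the text above.

```c
#include <errno.h>

/* Hypothetical helper: decode a raw kernel return value. */
static long syscall_result(long raw)
{
    if (raw >= -4095L && raw <= -1L) {  /* the negative-errno range */
        errno = (int)-raw;              /* e.g. -9 becomes EBADF */
        return -1;
    }
    return raw;                         /* anything else is a successful result */
}
```

With the write2 above, this could be used as syscall_result(write2(buf, n)), so that a failed write shows up as -1 with errno set.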

If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, and might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
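A generic three-argument wrapper along those lines might look like the sketch below. The name syscall3 is an assumption for illustration, and the non-i386 branch simply defers to libc's syscall() so that the example still builds on other targets; only the i386 branch uses the int 0x80 ABI discussed here.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_* numbers for the target architecture */
#include <unistd.h>        /* syscall(), used only by the fallback branch */

/* Hypothetical name: up to three arguments in EBX, ECX, EDX. */
static long syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                          /* result comes back in EAX */
        : "a" (nr), "b" (a1), "c" (a2), "d" (a3)
        : "memory");  /* arguments may point to memory the kernel reads or writes */
    return ret;       /* raw value: a negative errno code on failure */
#else
    /* Assumption: fallback for other targets. Note that libc's syscall()
     * returns -1 and sets errno instead of returning the raw value. */
    return syscall(nr, a1, a2, a3);
#endif
}
```

For example, syscall3(SYS_write, 1, (long)msg, len) would do the same job as write2 above.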

With this output operand, you need the explicit volatile to get exactly one write system call each time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that the asm exists only to compute its output, and can merge repeated calls with the same inputs instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
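The question's killp function is not covered above, but the same pattern should carry over. The following is a sketch rather than a tested drop-in: send_signal is a hypothetical helper name, the syscall number 37 is taken from the question's assembly, and the non-i386 branch falls back to libc's kill() purely so that the example builds elsewhere.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>      /* SIGKILL, and kill() for the fallback */
#include <sys/types.h>

/* Hypothetical helper: returns 0 on success or a negative errno code. */
static int send_signal(int pid, int sig)
{
#if defined(__i386__)
    int ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                        /* kernel result in EAX */
        : "a" (37), "b" (pid), "c" (sig));  /* 37 = sys_kill; no pointer
                                               arguments, so no "memory" clobber */
    return ret;
#else
    /* Assumption: non-i386 build, defer to libc and mimic the raw convention. */
    return kill((pid_t)pid, sig) == 0 ? 0 : -errno;
#endif
}

/* The question's killp: send SIGKILL to a pid supplied at run time. */
static int killp(int pid)
{
    return send_signal(pid, SIGKILL);
}
```

Because pid is an ordinary input operand, it can come from anywhere, for instance from scanf or a command-line argument. The volatile matters here for the same reason as with write2: the signal is a side effect that must happen even if the return value is ignored.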
