Syscall/sysenter on LLVM
Question
How do I write the LLVM bitcode required to emit an architecture-specific system call instruction?
More specifically, clang supports inline assembly, and clearly supports emitting system calls (otherwise libc and vdso could not be compiled). How does the translation work for this, and how can I tickle it to reproduce this behavior?
I understand LLVM itself may not understand the calling interface and register schedule used by various architectures in a sufficiently high-level manner to be expressed in LLVM bytecode (e.g. that may be filled in elsewhere). However, there's clearly a stage where this information can be added.
How do I do this, starting at whatever stage comes after "C source with inline assembly"?
A satisfactory answer would include an example of how to invoke a five-argument int 0x80 system call. I choose five since that requires spilling to the stack, and I choose int 0x80 since it's easily understood and on the most common platform.
Recommended answer
Posting an answer here since exa has put up a bounty.
After Ross Ridge's comments and some playing around with clang, I realized this was a somewhat silly question to ask.
Let's assume we have the following program, which uses inline assembly to directly call write().
#include <stdio.h>
#include <sys/types.h> /* ssize_t */

int main(void)
{
    char *buf = "test\n";
    ssize_t n;

    asm volatile (
        "movl $0x00000002, %%edi\n" /* first argument == stderr */
        "movl $0x00000006, %%edx\n" /* third argument == number of bytes */
        "movl $1, %%eax\n"          /* syscall number == write on amd64 linux */
        "syscall\n"
        : "=A"(n)    /* %rax: return value */
        : "S"(buf)); /* %rsi: second argument == address of data to write */
    return n;
}
We can compile this with either gcc or clang and get roughly the same result.
$ gcc -o syscall.gcc syscall.c
$ clang -o syscall.clang syscall.c
$ ./syscall.gcc
test
$ ./syscall.clang
test
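The program above loads %edi, %edx and %eax by hand inside the asm string without telling the compiler, and as far as I know the "A" constraint on x86-64 lets the compiler pick either %rax or %rdx for a 64-bit value. A more defensive sketch, assuming x86-64 Linux, hands every register to the compiler through constraints and lists what the syscall instruction clobbers. The names raw_write and demo_raw_write are hypothetical and not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

#if !defined(__x86_64__) || !defined(__linux__)
#error "this sketch assumes x86-64 Linux"
#endif

/* Hypothetical helper: write(2) through the raw syscall instruction.
 * Registers are assigned by constraints, not by movl inside the string. */
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)                /* %rax: return value, or -errno */
        : "0"(1L),                 /* %rax: syscall number 1 == write */
          "D"((long)fd),           /* %rdi: first argument */
          "S"(buf),                /* %rsi: second argument */
          "d"(len)                 /* %rdx: third argument */
        : "rcx", "r11", "memory"); /* syscall overwrites %rcx and %r11 */
    return ret;
}

/* Usage sketch: write "test\n" to stderr and check the byte count. */
static void demo_raw_write(void)
{
    static const char msg[] = "test\n";
    long n = raw_write(2, msg, sizeof msg - 1);

    assert(n == (long)(sizeof msg - 1));
}
```

Because the registers are now operands, the compiler is free to keep other values live around the call, and the constraint string is where the register assignment ends up once the code is lowered.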
If we wish to see the exact LLVM instructions which would be used to emit this code, we can simply use the -emit-llvm flag. As you can see, there is a call i64 asm sideeffect line which has the full inline assembly string.
$ clang -S -emit-llvm syscall.c
$ cat syscall.ll
; ModuleID = 'syscall.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [6 x i8] c"test\0A\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  %buf = alloca i8*, align 8
  %n = alloca i64, align 8
  store i32 0, i32* %1
  store i8* getelementptr inbounds ([6 x i8]* @.str, i32 0, i32 0), i8** %buf, align 8
  %2 = load i8** %buf, align 8
  %3 = call i64 asm sideeffect "movl $$0x00000002, %edi\0Amovl $$0x00000006, %edx\0Amovl $$1, %eax\0Asyscall\0A", "=A,{si},~{dirflag},~{fpsr},~{flags}"(i8* %2) #1, !srcloc !1
  store i64 %3, i64* %n, align 8
  %4 = load i64* %n, align 8
  %5 = trunc i64 %4 to i32
  ret i32 %5
}
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Ubuntu clang version 3.5-1ubuntu1 (trunk) (based on LLVM 3.5)"}
!1 = metadata !{i32 134, i32 197, i32 259, i32 312}
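The question asked for a five-argument int 0x80 call, so here is a minimal sketch of one, assuming Linux and GCC-style inline assembly. As far as I know, the i386 int 0x80 convention passes the syscall number in %eax and the first five arguments in %ebx, %ecx, %edx, %esi and %edi, so five arguments still fit in registers. The x86-64 branch uses the syscall instruction instead, so the same file builds on either target. The names syscall5 and demo_syscall5 are hypothetical, and prctl is used only because it takes five arguments.

```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>   /* PR_SET_NAME, PR_GET_NAME */
#include <sys/syscall.h> /* SYS_prctl for the target architecture */

/* Hypothetical helper: raw five-argument Linux system call. */
static long syscall5(long nr, long a1, long a2, long a3, long a4, long a5)
{
    long ret;

#if defined(__i386__)
    /* int 0x80: number in eax, arguments in ebx, ecx, edx, esi, edi. */
    __asm__ volatile (
        "int $0x80"
        : "=a"(ret)
        : "0"(nr), "b"(a1), "c"(a2), "d"(a3), "S"(a4), "D"(a5)
        : "memory");
#elif defined(__x86_64__)
    /* syscall: number in rax, arguments in rdi, rsi, rdx, r10, r8.
     * r10 and r8 have no constraint letter, so pin them with
     * register variables. */
    register long r10 __asm__("r10") = a4;
    register long r8 __asm__("r8") = a5;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)
        : "0"(nr), "D"(a1), "S"(a2), "d"(a3), "r"(r10), "r"(r8)
        : "rcx", "r11", "memory");
#else
#error "this sketch only covers i386 and x86-64 Linux"
#endif
    return ret;
}

/* Usage sketch: set and read back the thread name through prctl. */
static void demo_syscall5(void)
{
    char name[16] = {0};

    assert(syscall5(SYS_prctl, PR_SET_NAME, (long)"sys5", 0, 0, 0) == 0);
    assert(syscall5(SYS_prctl, PR_GET_NAME, (long)name, 0, 0, 0) == 0);
    assert(strcmp(name, "sys5") == 0);
}
```

Compiling this with clang -S -emit-llvm should again produce a single call ... asm sideeffect line, with the constraint string carrying the register assignment. On i386 the file would need to be built with -m32 and a 32-bit libc available.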