Having trouble wrapping functions in the Linux kernel

Question

I've written an LKM that implements Trusted Path Execution (TPE) in the kernel:

https://github.com/cormander/tpe-lkm

When I define WRAP_SYSCALLS to 1, I run into an occasional kernel OOPS (described at the end of this question), and I'm at my wit's end trying to track it down.

Some background:

Since the LSM framework doesn't export its symbols, I had to get creative with how I insert the TPE checking into the running kernel. I wrote a find_symbol_address() function that gives me the address of any function I need, and it works very well. I can call functions like this:

int (*my_printk)(const char *fmt, ...);
my_printk = find_symbol_address("printk");
(*my_printk)("Hello, world!\n");

And it works fine. I use this method to locate the security_file_mmap, security_file_mprotect, and security_bprm_check functions.
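
The question does not show find_symbol_address() itself, so the following is only a userspace illustration of the lookup idea, not the module's code. It scans text in the /proc/kallsyms format for a symbol name. The helper name lookup_symbol and the sample addresses are made up for the example; inside a module the same search would more likely go through kallsyms_on_each_symbol(), as the answer below guesses.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan text in /proc/kallsyms format
 * ("<hex address> <type> <name>[\t[module]]", one symbol per line) and
 * return the address of the first symbol called `name`, or 0 if absent.
 */
uint64_t lookup_symbol(const char *kallsyms_text, const char *name)
{
    const char *line = kallsyms_text;

    while (line && *line) {
        unsigned long long addr;
        char type;
        char sym[128];

        /* %127s stops at whitespace, so a trailing "[module]" is ignored. */
        if (sscanf(line, "%llx %c %127s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0)
            return (uint64_t)addr;

        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return 0;
}
```

Called on the contents of /proc/kallsyms, lookup_symbol(text, "security_bprm_check") would return that function's address. On newer kernels the file shows zeroed addresses to unprivileged readers, so this only works as root there.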

I then overwrite those functions with an asm jump to my function to do the TPE check. The problem is that the currently loaded LSM no longer executes its own hook code for that function, because the function has been totally hijacked.
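
For readers unfamiliar with this kind of patching, here is a minimal sketch of how the bytes of such a jump could be built on x86-64. It is an assumption for illustration: the question does not say which jump encoding the module uses, and encode_abs_jmp is a made-up name. The sketch emits "movabs $target, %rax; jmp *%rax", which works at any distance but clobbers %rax.

```c
#include <stdint.h>

#define ABS_JMP_LEN 12

/* Write "movabs $target, %rax; jmp *%rax" into out[0..11]. */
void encode_abs_jmp(uint8_t out[ABS_JMP_LEN], uint64_t target)
{
    out[0] = 0x48;      /* REX.W prefix */
    out[1] = 0xB8;      /* mov imm64 -> %rax */
    for (int i = 0; i < 8; i++)
        out[2 + i] = (uint8_t)(target >> (8 * i));  /* little-endian imm64 */
    out[10] = 0xFF;     /* jmp r/m64 */
    out[11] = 0xE0;     /* ModRM byte selecting %rax */
}
```

Note that a 12-byte sequence like this cannot be written in one atomic store, which matters once other CPUs may be executing the function while it is being patched.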

Here is an example of what I'm doing:

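int tpe_security_bprm_check(struct linux_binprm *bprm) {

    int ret = 0;

    /* Run the TPE check first (assuming tpe_allow_file() returns 0 or a negative errno). */
    if (bprm->file) {
            ret = tpe_allow_file(bprm->file);
            if (ret < 0)    /* ret is an int; IS_ERR() is meant for pointers */
                    goto out;
    }

#if WRAP_SYSCALLS
    /* Write the original bytes back so the LSM's own hook can run. */
    stop_my_code(&cs_security_bprm_check);

    ret = cs_security_bprm_check.ptr(bprm);

    /* Put the jump back in place. */
    start_my_code(&cs_security_bprm_check);
#endif

    out:

    return ret;
}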

Notice the section inside #if WRAP_SYSCALLS (it's defined as 0 by default). If it is set to 1, the LSM's hook is called, because I write the original code back over the asm jump and call that function, but I run into an occasional kernel OOPS with an "invalid opcode":

invalid opcode: 0000 [#1] SMP 
RIP: 0010:[<ffffffff8117b006>]  [<ffffffff8117b006>] security_bprm_check+0x6/0x310

I don't know what the issue is. I've tried several different types of locking methods (see the inside of start/stop_my_code for details) to no avail. To trigger the kernel OOPS, write a simple bash while loop that endlessly starts a backgrounded "ls" command. After a minute or so, it'll happen.

I'm testing this on a RHEL6 kernel; it also works on Ubuntu 10.04 LTS (2.6.32 x86_64).

While this method has been the most successful so far, I have tried another method: simply copying the kernel function to a buffer I allocated with kmalloc. When I try to execute it, I get: "kernel tried to execute NX-protected page - exploit attempt? (uid: 0)". If anyone can tell me how to kmalloc space and have it marked as executable, that would also help me solve the above problem.

Thanks for your help!

Recommended answer

1. It seems that the beginning of security_bprm_check() is not completely restored before the function is called. The oops happens at security_bprm_check+0x6, i.e. right after the jump you placed there, so some part of the jump is apparently still there at that moment. I cannot say right now why this happens.
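
One possible reading, which this answer does not confirm, is that another CPU enters the function while the bytes are being swapped and sees a mix of jump bytes and original bytes. The small userspace model below only illustrates that "torn" state. The names and the idea of stopping a restore halfway are assumptions for the example; it does not reproduce the kernel race.

```c
#include <stdint.h>
#include <string.h>

enum patch_state { STATE_ORIGINAL, STATE_PATCHED, STATE_TORN };

/* Copy only the first `done` of `len` saved bytes back, as if the
 * restore had been observed partway through. */
void restore_prefix(uint8_t *code, const uint8_t *saved, size_t len, size_t done)
{
    memcpy(code, saved, done < len ? done : len);
}

/* Report whether `code` holds the original bytes, the patch, or a mix. */
enum patch_state classify(const uint8_t *code, const uint8_t *orig,
                          const uint8_t *patch, size_t len)
{
    if (memcmp(code, orig, len) == 0)
        return STATE_ORIGINAL;
    if (memcmp(code, patch, len) == 0)
        return STATE_PATCHED;
    return STATE_TORN;      /* neither sequence: decoding it is a gamble */
}
```

A CPU that fetches instructions from a STATE_TORN buffer decodes bytes that belong to neither version of the function, which could plausibly end in an invalid opcode a few bytes in.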

Take a look at the implementation of Kernel Probes (KProbes) on x86; it may give you some hints. See also the description of KProbes for details. KProbes need to patch and restore almost arbitrary pieces of kernel code in a safe way to do their work.

2. Now to the other approach that you mentioned, copying the function. The following is a bit of a hack and would be frowned upon by the kernel developers, but if there is no other way, it might help.

You can allocate the memory for the copied functions from the same area where the memory for the kernel modules' code is allocated. That area should be executable by default. Again, KProbes use this trick to allocate their detour buffers.

Memory is allocated by the module_alloc() function and freed by module_free(). These functions are of course not exported, but you can find their addresses in the same way as you do for security_file_mmap(), etc. Just out of curiosity, you are using kallsyms_on_each_symbol(), right?

If you allocate memory this way, this could also help avoid another not so obvious problem. On x86-64, the memory address areas available for kmalloc and for the modules' code are located quite far away from each other (see Documentation/x86/x86_64/mm.txt), beyond the reach of any relative jump. If the memory is mapped to the modules' address area, you can use near relative jumps and calls to call the copied functions. A similar problem with RIP-relative addressing is also avoided this way.
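
To make the "reach" argument concrete, here is a small sketch that checks whether a 5-byte jmp or call with a 32-bit relative displacement can get from one address to another. The function name rel32_reaches is made up, and the addresses in the usage note are only illustrative values in the style of the 2.6.32-era x86-64 layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 5-byte "jmp/call rel32" at `insn` reaches `target` only if the
 * displacement, measured from the end of the instruction, fits in a
 * signed 32-bit value (roughly +/- 2 GiB). */
bool rel32_reaches(uint64_t insn, uint64_t target)
{
    int64_t disp = (int64_t)(target - (insn + 5));

    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

With a target in kernel text such as 0xffffffff8117b000, a caller in the module area (say 0xffffffffa0001000) is within reach, while a caller in a kmalloc'ed buffer in the direct mapping (say 0xffff880012345000) is not.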

EDIT: Note that on x86, if you copy a piece of code to a different memory area and want it to run there, some changes to that code may be necessary. At a minimum, you need to fix up the relative calls and jumps that transfer control outside the copied code (e.g. calls to other functions), as well as the instructions that use RIP-relative addressing.
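
The arithmetic behind such a fixup can be sketched as follows. This is a simplified illustration under stated assumptions: the caller already knows the offset of the 32-bit displacement inside the copied bytes (finding it needs an instruction decoder, which is not shown), the host is little-endian like x86, and fixup_rel32 is a made-up name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Re-aim the rel32 displacement stored at code[disp_off] after the code
 * moved from old_base to new_base, so that it still reaches the same
 * absolute target. Returns 0 on success, or -1 (leaving the bytes
 * untouched) if the target is no longer within rel32 range.
 */
int fixup_rel32(uint8_t *code, size_t disp_off,
                uint64_t old_base, uint64_t new_base)
{
    int32_t old_disp;
    int64_t new_disp;
    int32_t d;

    memcpy(&old_disp, code + disp_off, sizeof old_disp);

    /* The target is fixed, so the displacement shifts by (old - new). */
    new_disp = (int64_t)old_disp + (int64_t)(old_base - new_base);
    if (new_disp < INT32_MIN || new_disp > INT32_MAX)
        return -1;

    d = (int32_t)new_disp;
    memcpy(code + disp_off, &d, sizeof d);
    return 0;
}
```

Only displacements that point outside the copied range need this treatment; jumps that stay inside the copy keep working unchanged because both ends moved by the same amount.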

Apart from that, there may be other structures in the code that need to be fixed up. For example, the compiler might have optimized some or even all switch statements into a jump via a table. That is, the addresses of the code blocks for each case are kept in a table in memory, and the switch variable is the index into that table. This way, instead of many comparisons, the code will execute something like jmp *<table_start>(,%reg,N), where N is the size of a pointer in bytes. That is, just a jump to the address stored in the appropriate element of the table. Because such tables are created for the code before you copy it, a fixup may be necessary; otherwise such jumps will take execution back to the original piece of code rather than the copied one.
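
As a rough sketch of what such a table fixup involves, the function below rebases every entry that points into the original code range so that it points at the same offset in the copy. The name rebase_jump_table and the flat array of 64-bit addresses are assumptions for illustration; locating the table in a real function is the hard part and is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rebase each table entry that points into the original code range
 * [old_base, old_base + len) so it points at the same offset in the
 * copy at new_base. Returns the number of entries changed.
 */
size_t rebase_jump_table(uint64_t *table, size_t n,
                         uint64_t old_base, size_t len, uint64_t new_base)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (table[i] >= old_base && table[i] - old_base < len) {
            table[i] = new_base + (table[i] - old_base);
            changed++;
        }
    }
    return changed;
}
```

Since the original function keeps using the original table, a real fix would more likely apply this to a private copy of the table and then re-point the copied jmp instruction at that copy.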
