How to access a process's kernel stack in the Linux kernel?


Problem description

I am trying to monitor which functions are called by a process during its execution. My aim is to know how much time the process spends in each function. Functions are pushed onto a stack and popped when the call returns; I would like to know where in the kernel code this push and pop actually happens.

I found a void *stack field in task_struct. I am not sure if this is the field I am looking for. If it is, how can I find out how it is updated?

I have to write a module that will make use of this. Please help me with this.

Answer

Functions are pushed onto a stack and popped when the call returns; I would like to know where in the kernel code this push and pop actually happens.

It doesn't happen in kernel code; it is done by the processor. I.e. when an x86 CPU executes a call instruction, it pushes the instruction pointer (IP) onto the stack, and the matching ret instruction pops that value back into IP.

You can patch every call and ret instruction in the kernel with call my_tracing_routine, record the instruction pointer there, then pass control to the original callee/caller. There are tools for that: LTTng, SystemTap, and in-kernel interfaces such as kprobes and ftrace. This approach is called tracing.

But if you patch all instructions, e.g. with the SystemTap probe kernel.function("*"), you will kill performance and probably panic the system. So you can't measure every function call, but you can measure every Nth call and hope the results are equivalent; for that you need a large sample (i.e. run the program for a couple of minutes). That is called profiling.

Linux ships with the profiler perf:

# perf record -- dd if=/dev/zero of=/dev/null
...
^C

# perf report
9.75%  dd  [kernel.kallsyms]  [k] __clear_user
6.69%  dd  [kernel.kallsyms]  [k] __audit_syscall_exit
5.61%  dd  [kernel.kallsyms]  [k] fsnotify
4.73%  dd  [kernel.kallsyms]  [k] system_call_after_swapgs
4.37%  dd  [kernel.kallsyms]  [k] system_call
...

You may also use -g to collect call chains. By default perf uses CPU performance counters, so after N CPU cycles an interrupt is raised and the perf handler (already built into the kernel) saves the IP.

If you wish to collect stacks, you may do that with SystemTap:

# stap --all-modules -e '
    probe timer.profile { 
        if(execname() == "dd") { 
            println("----"); 
            print_backtrace(); } 
        }' -c 'dd if=/dev/zero of=/dev/null' 
...
    ----
0xffffffff813e714d : _raw_spin_unlock_irq+0x32/0x3c [kernel]
0xffffffff81047bb9 : spin_unlock_irq+0x9/0xb [kernel]
0xffffffff8104ac68 : get_signal_to_deliver+0x4f0/0x528 [kernel]
0xffffffff8100216f : do_signal+0x48/0x4b1 [kernel]
0xffffffff81002608 : do_notify_resume+0x30/0x63 [kernel]
0xffffffff813edd6a : int_signal+0x12/0x17 [kernel]

In this example SystemTap uses the timer.profile probe, which attaches to the perf cpu-clock event. To do so it generates, builds, and loads a kernel module; you can inspect that with stap -k -p 3.

