Capturing user-space assembly with ftrace and kprobes (by using virtual address translation)?

Views: 1388 | Posted: 2017/4/18 5:03:28 | Tags: linux, debugging, linux-kernel, ftrace


Question

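Apologies for the longish post; I'm having trouble formulating it in a shorter way. Also, maybe this is more appropriate for Unix & Linux Stack Exchange, but I'll try here at SO first, as there is an ftrace tag.

Anyway, I'd like to observe when the machine instructions of a user program execute, in the context of a full function_graph capture using ftrace. One problem is that I need this for an older kernel:

$ uname -a
Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux

... and in this version there is no UPROBES, which, as Uprobes in 3.5 [LWN.net] notes, should be able to do something like that. (As long as I don't have to patch the original kernel, I would be willing to try a kernel module built out of tree, as User-Space Probes (Uprobes) [chunghwan.com] seems to demonstrate; but as far as I can see from 0: Inode based uprobes [LWN.net], the 2.6 kernel would probably need a full patch.)

However, on this version there is a /sys/kernel/debug/kprobes and a /sys/kernel/debug/tracing/kprobe_events; and Documentation/trace/kprobetrace.txt implies that a kprobe can be set directly on an address, even if I cannot find an example anywhere of how this is used.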
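For reference, kprobetrace.txt gives the probe definition syntax as `p[:[GRP/]EVENT] SYMBOL[+offs]|MEMADDR [FETCHARGS]`, written as a line of text into kprobe_events. A minimal sketch of building such a line for a raw address follows; the helper name `format_kprobe_cmd` and the event name are illustrative assumptions, and MEMADDR is taken to be a kernel virtual address:

```c
#include <stdio.h>

/* Build a kprobe_events definition of the form "p:EVENT 0xADDR".
 * Returns the number of characters written, or -1 if buf is too small.
 * (Hypothetical helper; only the "p:EVENT MEMADDR" syntax comes from
 * Documentation/trace/kprobetrace.txt.) */
int format_kprobe_cmd(char *buf, size_t len, const char *event,
                      unsigned long long addr)
{
    int n = snprintf(buf, len, "p:%s 0x%08llx", event, addr);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting string would then be written (as root) to /sys/kernel/debug/tracing/kprobe_events, and the event enabled separately before it shows up in a trace.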

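In any case, I would still not be sure what addresses to use. As a small example, let's say I want to trace the start of the main function of the wtest.c program (included below). I can do this to compile it and obtain a machine instruction assembly listing:

$ gcc -g -O0 wtest.c -o wtest
$ objdump -S wtest | less
...
08048474 <main>:
int main(void) {
 8048474:       55                      push   %ebp
 8048475:       89 e5                   mov    %esp,%ebp
 8048477:       83 e4 f0                and    $0xfffffff0,%esp
 804847a:       83 ec 30                sub    $0x30,%esp
 804847d:       65 a1 14 00 00 00       mov    %gs:0x14,%eax
 8048483:       89 44 24 2c             mov    %eax,0x2c(%esp)
 8048487:       31 c0                   xor    %eax,%eax
  char filename[] = "/tmp/wtest.txt";
...
  return 0;
 804850a:       b8 00 00 00 00          mov    $0x0,%eax
}
...

I would set up ftrace logging via this script:

sudo bash -c '
KDBGPATH="/sys/kernel/debug/tracing"
echo function_graph > $KDBGPATH/current_tracer
echo funcgraph-abstime > $KDBGPATH/trace_options
echo funcgraph-proc > $KDBGPATH/trace_options
echo 0 > $KDBGPATH/tracing_on
echo > $KDBGPATH/trace
echo 1 > $KDBGPATH/tracing_on ; ./wtest ; echo 0 > $KDBGPATH/tracing_on
cat $KDBGPATH/trace > wtest.ftrace
'

You can see a portion of the (otherwise complex) resulting ftrace log in debugging - Observing a hard-disk write in kernel space (with drivers/modules) - Unix & Linux Stack Exchange (where I got the example from).

Basically, I'd want a printout in this ftrace log when the first instructions of main (say, the instructions at 0x8048474, 0x8048475, 0x8048477, 0x804847a, 0x804847d, 0x8048483 and 0x8048487) are executed by (any) CPU. The problem is, as far as I can understand from Anatomy of a Program in Memory : Gustavo Duarte, these addresses are virtual addresses, as seen from the perspective of the process itself (and I gather the same perspective is shown by /proc/PID/maps)... And apparently, for kprobe_events I'd need a physical address?

So, my idea would be: if I can find the physical addresses corresponding to the virtual addresses of the program disassembly (say by coding a kernel module, which would accept a pid and an address, and return the physical address via procfs), I could set up those addresses as a sort of "tracepoints" via /sys/kernel/debug/tracing/kprobe_events in the above script, and hopefully get them in the ftrace log. Could this work, in principle?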

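One problem with this I found in Linux(ubuntu), C language: Virtual to Physical Address Translation - Stack Overflow:

"In user code, you can't know the physical address corresponding to a virtual address. This information is simply not exported outside the kernel. It could even change at any time, especially if the kernel decides to swap out part of your process's memory.
...
Pass the virtual address to the kernel using systemcall/procfs and use vmalloc_to_pfn. Return the Physical address through procfs/registers."

However, vmalloc_to_pfn doesn't seem to be trivial either:

x86 64 - vmalloc_to_pfn returns 32 bit address on Linux 32 system. Why does it chop off higher bits of PAE physical address? - Stack Overflow

"VA: 0xf8ab87fc PA using vmalloc_to_pfn: 0x36f7f7fc. But I'm actually expecting: 0x136f7f7fc.
...
The physical address falls between 4 to 5 GB. But I can't get the exact physical address, I only get the chopped off 32-bit address. Is there another way to get true physical address?"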
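One plausible explanation for the chopped-off address in that quote (an assumption here, not something the linked post confirms) is that the page frame number was shifted inside a 32-bit type. The sketch below reproduces both quoted values; the PFN 0x136f7f is derived from the expected address and is used purely for illustration:

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4 KiB pages, as on 32-bit x86 */

/* Physical address computed in a 32-bit type: bits above 4 GiB are lost. */
uint32_t phys_from_pfn_32(uint32_t pfn, uint32_t vaddr)
{
    return (pfn << PAGE_SHIFT) | (vaddr & ((1u << PAGE_SHIFT) - 1));
}

/* Same computation, widened to 64 bits before shifting. */
uint64_t phys_from_pfn_64(uint32_t pfn, uint32_t vaddr)
{
    return ((uint64_t)pfn << PAGE_SHIFT) | (vaddr & ((1u << PAGE_SHIFT) - 1));
}
```

With PFN 0x136f7f and virtual address 0xf8ab87fc, the 32-bit variant yields 0x36f7f7fc while the 64-bit variant yields 0x136f7f7fc, which is why a PAE-aware translation needs a 64-bit result type.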

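So, I'm not sure how reliably I could extract the physical addresses so that they can be traced by kprobes, especially since "it could even change at any time". But here, I would hope that since the program is small and trivial, there is a reasonable chance that it would not be swapped out while being traced, allowing a proper capture to be obtained. (So even if I have to run the debug script above multiple times, as long as I can hope to obtain a "proper" capture once out of 10 times, or even 100 times, I'd be OK with it.)

Note that I'd want the output to go through ftrace, so that the timestamps are expressed in the same domain (see Reliable Linux kernel timestamps (or adjustment thereof) with both usbmon and ftrace? - Stack Overflow for an illustration of a problem with timestamps). Thus, even if I could come up with, say, a gdb script to run and trace the program from userspace (while an ftrace capture is obtained simultaneously), I'd like to avoid that, as the overhead from gdb itself will show in the ftrace logs.

So, in summary:

Is the approach of obtaining (possibly through a separate kernel module) physical addresses from the virtual addresses (from a disassembly of an executable), so that they are used to trigger a kprobe_event logged by ftrace, worth pursuing? If so, are there any examples of kernel modules that can be used for this address translation purpose?

Could I otherwise use a kernel module to "register" a callback/handler function that runs when a particular memory address is being executed? Then I could simply use a trace_printk in that function to have an ftrace log entry (or even without that, the handler function name itself should show in the ftrace log), and it doesn't seem there would be too much overhead with that...

Actually, in this 2007 posting, Jim Keniston - utrace-based uprobes: systemtap mailing list, there is an "11. Uprobes Example" (added to Documentation/uprobes.txt), which seems to be exactly that: a kernel module registering a handler function. Unfortunately, it uses linux/uprobes.h, and I have only kprobes.h in my /usr/src/linux-headers-2.6.38-16/include/linux/. Also, on my system, even systemtap complains about CONFIG_UTRACE not being enabled (see this comment)... So if there's any other approach I could use to obtain a debug trace like the one I want, without having to recompile the kernel to get uprobes, it would be great to know...

wtest.c:

#include <stdio.h>
#include <fcntl.h>  // O_CREAT, O_WRONLY, S_IRUSR

int main(void) {
  char filename[] = "/tmp/wtest.txt";
  char buffer[] = "abcd";
  int fd;
  mode_t perms = S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH;

  fd = open(filename, O_RDWR|O_CREAT, perms);
  write(fd,buffer,4);
  close(fd);

  return 0;
}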

Solution

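Obviously, this would be much easier with built-in uprobes on kernels 3.5+; but given that uprobes for my kernel 2.6.38 is a very deep-going patch (which I couldn't really isolate in a separate kernel module, so as to avoid patching the kernel), here is what I can note for a standalone module on 2.6.38. (Since I'm still unsure of many things, I would still like to see an answer that corrects any misunderstandings in this post.)

I think I got somewhere, but not with kprobes. I'm not sure, but it seems I managed to get the physical addresses right; however, the kprobes documentation is specific that when using "@ADDR : Fetch memory at ADDR (ADDR should be in kernel)", and the physical addresses I get are below the kernel boundary of 0xc0000000 (but then, 0xc0000000 usually refers to the virtual memory layout?).

So I used a hardware breakpoint instead. The module is below, but caveat emptor: it behaves randomly, and occasionally it can cause a kernel oops! By compiling the module, and running this in bash: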

$ sudo bash -c 'KDBGPATH="/sys/kernel/debug/tracing" ;
echo function_graph > $KDBGPATH/current_tracer ; echo funcgraph-abstime > $KDBGPATH/trace_options
echo funcgraph-proc > $KDBGPATH/trace_options ; echo 8192 > $KDBGPATH/buffer_size_kb ;
echo 0 > $KDBGPATH/tracing_on ; echo > $KDBGPATH/trace'
$ sudo insmod ./callmodule.ko && sleep 0.1 && sudo rmmod callmodule && \
tail -n25 /var/log/syslog | tee log.txt && \
sudo cat /sys/kernel/debug/tracing/trace >> log.txt

... I get a log. I want to trace the first two instructions of main() of wtest, which for me are:

$ objdump -S wtest/wtest | grep -A3 'int main'
int main(void) {
 8048474:       55                      push   %ebp
 8048475:       89 e5                   mov    %esp,%ebp
 8048477:       83 e4 f0                and    $0xfffffff0,%esp

... at virtual addresses 0x08048474 and 0x08048475. In the syslog output, I could get, say:

...
[1106.383011] callmodule: parent task a: f40a9940 c: kworker/u:1 p: [14] s: stopped
[1106.383017] callmodule: - wtest [9404]
[1106.383023] callmodule: Trying to walk page table; addr task 0xEAE90CA0 ->mm ->start_code: 0x08048000 ->end_code: 0x080485F4
[1106.383029] callmodule: walk_ 0x8048000 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @ (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[1106.383049] callmodule: walk_ 0x80483c0 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @ (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[1106.383067] callmodule: walk_ 0x8048474 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @ (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[1106.383083] callmodule: physaddr : (0x080483c0 ->) 0x639ec3c0 : (0x08048474 ->) 0x639ec474
[1106.383106] callmodule: 0x08048474 id [3]
[1106.383113] callmodule: 0x08048475 id [4]
[1106.383118] callmodule: (( 0x08048000 is_vmalloc_addr 0 virt_addr_valid 0 ))
[1106.383130] callmodule: cont pid task a: eae90ca0 c: wtest p: [9404] s: runnable
[1106.383147] initcall callmodule_init+0x0/0x1000 [callmodule] returned with preemption imbalance
[1106.518074] callmodule: < exit

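... meaning that it mapped the virtual address 0x08048474 to the physical address 0x639ec474. However, the physical address is not used for hardware breakpoints: there we can supply a virtual address directly to register_user_hw_breakpoint; however, we also need to supply the task_struct of the process. With that, I can get something like this in the ftrace output: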

...
 597.907256 |  1)  wtest-5339 |               |  handle_mm_fault() {
...
 597.907310 |  1)  wtest-5339 | + 35.627 us   |  }
 597.907311 |  1)  wtest-5339 | + 46.245 us   |  }
 597.907312 |  1)  wtest-5339 | + 56.143 us   |  }
 597.907313 |  1)  wtest-5339 |   1.039 us    |  up_read();
 597.907317 |  1)  wtest-5339 |   1.285 us    |  native_get_debugreg();
 597.907319 |  1)  wtest-5339 |   1.075 us    |  native_set_debugreg();
 597.907322 |  1)  wtest-5339 |   1.129 us    |  native_get_debugreg();
 597.907324 |  1)  wtest-5339 |   1.189 us    |  native_set_debugreg();
 597.907329 |  1)  wtest-5339 |               |  () {
 597.907333 |  1)  wtest-5339 |               |  /* callmodule: hwbp hit: id [3] */
 597.907334 |  1)  wtest-5339 |   5.567 us    |  }
 597.907336 |  1)  wtest-5339 |   1.123 us    |  native_set_debugreg();
 597.907339 |  1)  wtest-5339 |   1.130 us    |  native_get_debugreg();
 597.907341 |  1)  wtest-5339 |   1.075 us    |  native_set_debugreg();
 597.907343 |  1)  wtest-5339 |   1.075 us    |  native_get_debugreg();
 597.907345 |  1)  wtest-5339 |   1.081 us    |  native_set_debugreg();
 597.907348 |  1)  wtest-5339 |               |  () {
 597.907350 |  1)  wtest-5339 |               |  /* callmodule: hwbp hit: id [4] */
 597.907351 |  1)  wtest-5339 |   3.033 us    |  }
 597.907352 |  1)  wtest-5339 |   1.105 us    |  native_set_debugreg();
 597.907358 |  1)  wtest-5339 |   1.315 us    |  down_read_trylock();
 597.907360 |  1)  wtest-5339 |   1.123 us    |  _cond_resched();
 597.907362 |  1)  wtest-5339 |   1.027 us    |  find_vma();
 597.907364 |  1)  wtest-5339 |               |  handle_mm_fault() {
...

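... where the traces corresponding to the assembly instructions are marked by the breakpoint id. Thankfully, they come right after one another, as expected; however, ftrace has also captured some debug register calls in between. Anyway, this is what I wanted to see.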
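Because the breakpoint hits land in the same log as the kernel functions, their timestamps can be compared directly. A small user-space sketch for pulling the hits out of a saved trace follows; the function name `parse_hwbp_hit` is hypothetical, and it assumes the funcgraph-abstime layout (six fractional digits) and the "hwbp hit: id [N]" text that the module's trace_printk emits:

```c
#include <stdio.h>
#include <string.h>

/* Parse one function_graph line captured with funcgraph-abstime.
 * On a "hwbp hit" comment line, store the timestamp (in microseconds)
 * and the breakpoint id, and return 1; return 0 for any other line. */
int parse_hwbp_hit(const char *line, unsigned long long *usec,
                   unsigned long long *id)
{
    const char *tag = strstr(line, "hwbp hit: id [");
    unsigned long sec, frac;

    if (!tag)
        return 0;
    /* Leading absolute timestamp, e.g. " 597.907333" */
    if (sscanf(line, " %lu.%6lu", &sec, &frac) != 2)
        return 0;
    if (sscanf(tag, "hwbp hit: id [%llu]", id) != 1)
        return 0;
    *usec = (unsigned long long)sec * 1000000ULL + frac;
    return 1;
}
```

Applied to the two hit lines above, this gives 597907333 us for id 3 and 597907350 us for id 4, so the two instructions were logged 17 us apart (most of which is presumably breakpoint handling overhead rather than instruction execution time).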

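Here are some notes about the module:

Most of the module is from Execute/invoke user-space program, and get its pid, from a kernel module, where a user process is started and its pid obtained
- Since we have to get to the task_struct to get to the pid, here I save both (which is kind of redundant)

Where function symbols are not exported: if the symbol is in kallsyms, I use a function pointer to its address; otherwise the needed functions are copied from the kernel source

I didn't know how to start the user-space process in a stopped state, so after spawning it I issue a SIGSTOP (which on its own seems kind of unreliable at that point) and set the state to __TASK_STOPPED.

I may still sometimes get the status "runnable" where I don't expect it; however, if the init exits early with an error, I've noticed wtest hanging in the process list long after it would have terminated naturally, so I guess that works.

To get absolute/physical addresses, I used Walking page tables of a process in Linux to get to the page corresponding to a virtual address; then, digging through the kernel sources, I found page_to_phys() to get to the address (internally via the page frame number); LDD3 ch.15 helps with understanding the relationship between pfn and physical address.
- Since here I expect to have the physical address of the page, I don't use PAGE_SHIFT, but calculate offsets directly from objdump's assembly output. I am not 100% sure this is correct, though.
- Note (see also How to get a struct page from any address in the Linux kernel): the module output says that the virtual address 0x08048000 is neither is_vmalloc_addr nor virt_addr_valid; I guess this tells me that one could not have used either vmalloc_to_pfn() or virt_to_page() to get to its physical address!?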
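The offset calculation in the note above can be sketched in user-space C. The module adds `taddr - start_code` to the result of page_to_phys(); that agrees with the usual "offset within the page" form only while the address lies in the first page of the text segment, as it does for wtest. The function names, and the second page's physical address in the note that follows, are made up for illustration:

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Offset as computed in the module: distance from mm->start_code. */
uint64_t phys_via_start_code(uint64_t page_phys, uint32_t vaddr,
                             uint32_t start_code)
{
    return page_phys + (vaddr - start_code);
}

/* Offset within the page that actually contains vaddr. */
uint64_t phys_via_page_offset(uint64_t page_phys, uint32_t vaddr)
{
    return page_phys + (vaddr & (PAGE_SIZE - 1));
}
```

For main at 0x08048474 with page_to_phys 0x639ec000, both give 0x639ec474, matching the syslog output. For an address one page further on (say 0x08049010, backed by a hypothetical page at 0x12345000), only the page-offset form stays inside the page that the table walk returned.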

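Setting up kprobes for ftrace from kernel space is kind of tricky (it needs functions copied)
- Trying to set a kprobe on the physical addresses I get (e.g. 0x639ec474) always results in "Could not insert probe(-22)"
- Just to see if the format is parsed, I'm trying below with the kallsyms address of the tracing_on() function (0xc10bcf60); that seems to work, because it raises a fatal "BUG: scheduling while atomic" (apparently, we're not meant to set breakpoints in module_init?). The bug is fatal because it makes the kprobes directory disappear from the ftrace debug directory
- Just creating the kprobe would not make it appear in the ftrace log; it also needs to be enabled. The necessary code for enabling is there, but I've never tried it, because of the previous bug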
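The -22 above is -EINVAL. As far as I can tell, kprobes treat the given number as a kernel virtual address and reject anything that is not kernel text, so a physical address such as 0x639ec474 cannot pass. A rough user-space plausibility check is sketched below; it is not the kernel's actual test, and the 0xc0000000 boundary assumes the default 3G/1G split on 32-bit x86:

```c
#include <stdint.h>

/* Assumed: PAGE_OFFSET for the default 3G/1G split on 32-bit x86. */
#define PAGE_OFFSET_3G1G 0xc0000000u

/* Rough test: could this value be a kernel virtual address at all?
 * (Hypothetical helper; the kernel itself checks against kernel text.) */
int may_be_kernel_address(uint32_t addr)
{
    return addr >= PAGE_OFFSET_3G1G;
}
```

Under that assumption, 0xc10bcf60 (tracing_on) passes, while both the user virtual address 0x08048474 and the translated physical address 0x639ec474 fail, which is consistent with the two outcomes described above.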

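Finally, the breakpoint setting is from Watch a variable (memory address) change in Linux kernel, and print stack trace when it changes?
- I've never seen an example of setting an executable hardware breakpoint; it kept failing for me, until I searched through the kernel source and found that for HW_BREAKPOINT_X, attr.bp_len needs to be set to sizeof(long)
- If I try to printk the attr variable(s), from _init or from the handler, something gets seriously messed up, and whatever variable I try to print next, I get the value 0x5 (or 0x48) for it (?!)
- Since I'm trying to use a single handler function for both breakpoints, the only reliable piece of information that survives from _init to the handler and can differentiate between the two seems to be bp->id
- These ids are auto-assigned, and it seems they are not reclaimed if you unregister the breakpoints (I do not unregister them, to avoid extra ftrace printouts).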
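The same struct perf_event_attr is exposed to user space through the kernel's UAPI headers, so the attribute setup for an execute breakpoint can be illustrated outside a module. This is only a sketch of the fields involved (the function name is made up, and the in-kernel hw_breakpoint_init() sets a couple of further fields); it does not register anything:

```c
#include <string.h>
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>

/* Fill a perf_event_attr describing an execute breakpoint at addr. */
struct perf_event_attr make_exec_breakpoint_attr(unsigned long addr)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_BREAKPOINT;
    attr.size = sizeof(attr);
    attr.bp_type = HW_BREAKPOINT_X;
    attr.bp_addr = addr;
    attr.bp_len = sizeof(long);  /* other lengths are rejected for HW_BREAKPOINT_X on x86 */
    return attr;
}
```

In the module below, the equivalent fields are set after hw_breakpoint_init(&attr), and the attr is then passed to register_user_hw_breakpoint() together with the handler and the task_struct.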

As far as the randomness goes, I think it's because the process is not started in a stopped state; by the time it gets stopped, it ends up in a different state (or, quite possibly, I'm missing some locking somewhere). Anyway, you can also expect this in syslog:

[1661.815114] callmodule: Trying to walk page table; addr task 0xEAF68CA0 ->mm ->start_code: 0x08048000 ->end_code: 0x080485F4
[1661.815319] callmodule: walk_ 0x8048000 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0
[1661.815837] callmodule: walk_ 0x80483c0 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0
[1661.816846] callmodule: walk_ 0x8048474 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0

... that is, even with a proper task pointer (judging by start_code), only 0x0 is obtained as the physical address. Sometimes you get the same outcome, but with start_code: 0x00000000 ->end_code: 0x00000000. And sometimes a task_struct cannot be obtained, even if the pid can:

[833.380417] callmodule: c: pid 7663
[833.380424] callmodule: everything all right; pid 7663 (7663)
[833.380430] callmodule: p is NULL - exiting
[833.516160] callmodule: < exit

Well, hopefully someone will comment and clarify some of the behavior of this module :)

Hope this helps someone,

Cheers!

Makefile:

EXTRA_CFLAGS=-g -O0
obj-m += callmodule.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

callmodule.c:

 #include <linux/module.h>
 // Two header names were lost from this listing; slab.h and sched.h are
 // assumed here because the code below needs kzalloc and task_struct/kill_pid.
 #include <linux/slab.h>     // kzalloc
 #include <linux/sched.h>    // task_struct, kill_pid
 #include <linux/kallsyms.h> // kallsyms_lookup, print_symbol
 #include <linux/highmem.h>  // 'kmap_atomic' (via pte_offset_map)
 #include <asm/io.h>         // page_to_phys (arch/x86/include/asm/io.h)

 struct subprocess_infoB; // forward declare
 // global variables - to avoid intervening too much in the return of call_usermodehelperB:
 static int callmodule_pid;
 static struct subprocess_infoB *callmodule_infoB;
 #define TRY_USE_KPROBES 0 // 1 // enable/disable kprobes usage code
 #include <linux/kprobes.h> // enable_kprobe
 // for hardware breakpoint:
 #include <linux/perf_event.h>
 #include <linux/hw_breakpoint.h>

 // define a modified struct (with extra fields at the end):
 struct subprocess_infoB {
   struct work_struct work;
   struct completion *complete;
   char *path;
   char **argv;
   char **envp;
   int wait; // enum umh_wait wait;
   int retval;
   int (*init)(struct subprocess_info *info);
   void (*cleanup)(struct subprocess_info *info);
   void *data;
   pid_t pid;
   struct task_struct *task;
   unsigned long long last_page_physaddr;
 };

 struct subprocess_infoB *call_usermodehelper_setupB(char *path, char **argv,
               char **envp, gfp_t gfp_mask);

 static inline int
 call_usermodehelper_fnsB(char *path, char **argv, char **envp,
       int wait, // enum umh_wait wait,
       int (*init)(struct subprocess_info *info),
       void (*cleanup)(struct subprocess_info *), void *data)
 {
   struct subprocess_info *info;
   struct subprocess_infoB *infoB;
   gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
   int ret;

   populate_rootfs_wait();

   infoB = call_usermodehelper_setupB(path, argv, envp, gfp_mask);
   if (infoB == NULL) // check before the first dereference below
     return -ENOMEM;
   printk(KBUILD_MODNAME ": a: pid %d\n", infoB->pid);
   info = (struct subprocess_info *) infoB;

   call_usermodehelper_setfns(info, init, cleanup, data);
   printk(KBUILD_MODNAME ": b: pid %d\n", infoB->pid);

   // this must be called first, before infoB->pid is populated (by __call_usermodehelperB):
   ret = call_usermodehelper_exec(info, wait);

   // assign global pid (and infoB) here, so the rest of the code has it:
   callmodule_pid = infoB->pid;
   callmodule_infoB = infoB;
   printk(KBUILD_MODNAME ": c: pid %d\n", callmodule_pid);

   return ret;
 }

 static inline int
 call_usermodehelperB(char *path, char **argv, char **envp, int wait) // enum umh_wait wait)
 {
   return call_usermodehelper_fnsB(path, argv, envp, wait,
                NULL, NULL, NULL);
 }

 static void __call_usermodehelperB(struct work_struct *work)
 {
   struct subprocess_infoB *sub_infoB =
     container_of(work, struct subprocess_infoB, work);
   int wait = sub_infoB->wait; // enum umh_wait wait = sub_info->wait;
   pid_t pid;
   struct subprocess_info *sub_info;
   // hack - declare function pointers
   int (*ptrwait_for_helper)(void *data);
   int (*ptr____call_usermodehelper)(void *data);
   int killret;
   struct task_struct *spawned_task;
   // assign function pointers to verbatim addresses as obtained from /proc/kallsyms
   ptrwait_for_helper = (void *)0xc1065b60;
   ptr____call_usermodehelper = (void *)0xc1065ed0;

   sub_info = (struct subprocess_info *)sub_infoB;

   if (wait == UMH_WAIT_PROC)
     pid = kernel_thread((*ptrwait_for_helper), sub_info, //(wait_for_helper, sub_info,
             CLONE_FS | CLONE_FILES | SIGCHLD);
   else
     pid = kernel_thread((*ptr____call_usermodehelper), sub_info, //(____call_usermodehelper, sub_info,
             CLONE_VFORK | SIGCHLD);

   spawned_task = pid_task(find_vpid(pid), PIDTYPE_PID);

   // stop/suspend/pause task
   killret = kill_pid(find_vpid(pid), SIGSTOP, 1);
   if (spawned_task != NULL) {
     // does this really stop the process?
     spawned_task->state = __TASK_STOPPED;
     printk(KBUILD_MODNAME ": : exst %d exco %d exsi %d diex %d inex %d inio %d\n", spawned_task->exit_state, spawned_task->exit_code, spawned_task->exit_signal, spawned_task->did_exec, spawned_task->in_execve, spawned_task->in_iowait);
   }
   printk(KBUILD_MODNAME ": : (kr: %d)\n", killret);
   printk(KBUILD_MODNAME ": : pid %d (%p) (%s)\n", pid, spawned_task,
     (spawned_task != NULL) ? ((spawned_task->state == -1) ? "unrunnable" : ((spawned_task->state == 0) ? "runnable" : "stopped")) : "null");
   // grab and save the pid (and task_struct) here:
   sub_infoB->pid = pid;
   sub_infoB->task = spawned_task;
   switch (wait) {
   case UMH_NO_WAIT:
     call_usermodehelper_freeinfo(sub_info);
     break;
   case UMH_WAIT_PROC:
     if (pid > 0)
       break;
     /* FALLTHROUGH */
   case UMH_WAIT_EXEC:
     if (pid < 0)
       sub_info->retval = pid;
     complete(sub_info->complete);
   }
 }

 struct subprocess_infoB *call_usermodehelper_setupB(char *path, char **argv,
               char **envp, gfp_t gfp_mask)
 {
   struct subprocess_infoB *sub_infoB;
   sub_infoB = kzalloc(sizeof(struct subprocess_infoB), gfp_mask);
   if (!sub_infoB)
     goto out;

   INIT_WORK(&sub_infoB->work, __call_usermodehelperB);
   sub_infoB->path = path;
   sub_infoB->argv = argv;
   sub_infoB->envp = envp;
 out:
   return sub_infoB;
 }

 #if TRY_USE_KPROBES
 // copied from /kernel/trace/trace_probe.c (is unexported)
 int traceprobe_command(const char *buf, int (*createfn)(int, char **))
 {
   char **argv;
   int argc, ret;

   argc = 0;
   ret = 0;
   argv = argv_split(GFP_KERNEL, buf, &argc);
   if (!argv)
     return -ENOMEM;

   if (argc)
     ret = createfn(argc, argv);

   argv_free(argv);

   return ret;
 }

 // copied from kernel/trace/trace_kprobe.c?v=2.6.38 (is unexported)
 #define TP_FLAG_TRACE 1
 #define TP_FLAG_PROFILE 2
 typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
 struct fetch_param {
   fetch_func_t fn;
   void *data;
 };
 typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, void *);
 enum {
   FETCH_MTD_reg = 0,
   FETCH_MTD_stack,
   FETCH_MTD_retval,
   FETCH_MTD_memory,
   FETCH_MTD_symbol,
   FETCH_MTD_deref,
   FETCH_MTD_END,
 };
 /* Fetch type information table */
 struct fetch_type {
   const char *name; /* Name of type */
   size_t size; /* Byte size of type */
   int is_signed; /* Signed flag */
   print_type_func_t print; /* Print function */
   const char *fmt; /* Format string */
   const char *fmttype; /* Name in format file */
   /* Fetch functions */
   fetch_func_t fetch[FETCH_MTD_END];
 };
 struct probe_arg { 
   struct fetch_param      fetch; 
   struct fetch_param      fetch_size; 
   unsigned int            offset; /* Offset from argument entry */ 
   const char              *name; /* Name of this argument */ 
   const char              *comm; /* Command of this argument */ 
   const struct fetch_type *type; /* Type of this argument */ 
 }; 
 struct trace_probe { 
   struct list_head        list; 
   struct kretprobe        rp; /* Use rp.kp for kprobe use */ 
   unsigned long           nhit; 
   unsigned int            flags; /* For TP_FLAG_* */ 
   const char              *symbol; /* symbol name */ 
   struct ftrace_event_class       class; 
   struct ftrace_event_call        call; 
   ssize_t                 size; /* trace entry size */ 
   unsigned int            nr_args; 
   struct probe_arg        args[]; 
}; 
 static  int probe_is_return(struct trace_probe *tp) 
 { 
   return tp->rp.handler != NULL; 
 } 
 static int probe_event_enable(struct ftrace_event_call *call) 
 { 
   struct trace_probe *tp = (struct trace_probe *)call->data; 
  
   tp->flags |= TP_FLAG_TRACE; 
   if (probe_is_return(tp)) 
     return enable_kretprobe(&tp->rp); 
   else 
     return enable_kprobe(&tp->rp.kp); 
 } 
 #define KPROBE_EVENT_SYSTEM "kprobes"
 #endif // TRY_USE_KPROBES 
  
 // <<<<<<<<<<<<<<<<<<<<<< 
  
 static struct page *walk_page_table(unsigned long addr, struct task_struct *intask) 
 { 
   pgd_t *pgd; 
   pte_t *ptep, pte; 
   pud_t *pud; 
   pmd_t *pmd; 
  
   struct page *page = NULL; 
   struct mm_struct *mm = intask->mm; 
  
   callmodule_infoB->last_page_physaddr = 0ULL; // reset here, in case of early exit 
  
   printk(KBUILD_MODNAME ": walk_ 0x%lx ", addr);

   pgd = pgd_offset(mm, addr);
   if (pgd_none(*pgd) || pgd_bad(*pgd))
     goto out;
   printk(KBUILD_MODNAME ": Valid pgd ");

   pud = pud_offset(pgd, addr);
   if (pud_none(*pud) || pud_bad(*pud))
     goto out;
   printk( ": Valid pud");

   pmd = pmd_offset(pud, addr);
   if (pmd_none(*pmd) || pmd_bad(*pmd))
     goto out;
   printk( ": Valid pmd");
  
   ptep = pte_offset_map(pmd, addr); 
   if (!ptep) 
     goto out; 
   pte = *ptep; 
  
   page = pte_page(pte); 
   if (page) { 
     callmodule_infoB->last_page_physaddr = (unsigned long long)page_to_phys(page); 
     printk( ": page frame struct is @ %p; *virtual (page_address) @ %p (is_vmalloc_addr %d virt_addr_valid %d virt_to_phys 0x%llx) page_to_pfn %lx page_to_phys 0x%llx", page, page_address(page), is_vmalloc_addr((void*)page_address(page)), virt_addr_valid(page_address(page)), (unsigned long long)virt_to_phys(page_address(page)), page_to_pfn(page), callmodule_infoB->last_page_physaddr);
   }

   //~ pte_unmap(ptep);

 out:
   printk("\n");
   return page;
 }

 static void sample_hbp_handler(struct perf_event *bp,
              struct perf_sample_data *data,
              struct pt_regs *regs)
 {
   trace_printk(KBUILD_MODNAME ": hwbp hit: id [%llu]\n", bp->id );
   //~ unregister_hw_breakpoint(bp); 
 } 
  
 // ---------------------- 
  
 static int __init callmodule_init(void) 
 { 
   int ret = 0; 
   char userprog[] = "/path/to/wtest";
   char *argv[] = {userprog, "2", NULL };
   char *envp[] = {"HOME=/", "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };
   struct task_struct *p; 
   struct task_struct *par; 
   struct task_struct *pc; 
   struct list_head *children_list_head; 
   struct list_head *cchildren_list_head; 
   char *state_str; 
   unsigned long offset, taddr; 
   int (*ptr_create_trace_probe)(int argc, char **argv); 
   struct trace_probe* (*ptr_find_probe_event)(const char *event, const char *group); 
   //int (*ptr_probe_event_enable)(struct ftrace_event_call *call); // not exported, copy 
   #if TRY_USE_KPROBES 
   char trcmd[256] = "";
   struct trace_probe *tp; 
   #endif //TRY_USE_KPROBES 
   struct perf_event *sample_hbp, *sample_hbpb; 
   struct perf_event_attr attr, attrb; 
  
   printk(KBUILD_MODNAME ": > init %s\n", userprog);

   ptr_create_trace_probe = (void *)0xc10d5120;
   ptr_find_probe_event = (void *)0xc10d41e0;
   print_symbol(KBUILD_MODNAME ": symbol @ 0xc1065b60 is %s\n", 0xc1065b60); // shows wait_for_helper+0x0/0xb0
   print_symbol(KBUILD_MODNAME ": symbol @ 0xc1065ed0 is %s\n", 0xc1065ed0); // shows ____call_usermodehelper+0x0/0x90
   print_symbol(KBUILD_MODNAME ": symbol @ 0xc10d5120 is %s\n", 0xc10d5120); // shows create_trace_probe+0x0/0x590
   ret = call_usermodehelperB(userprog, argv, envp, UMH_WAIT_EXEC);
   if (ret != 0)
       printk(KBUILD_MODNAME ": error in call to usermodehelper: %i\n", ret);
   else
       printk(KBUILD_MODNAME ": everything all right; pid %d (%d)\n", callmodule_pid, callmodule_infoB->pid);
   tracing_on(); // earlier, so trace_printk of handler is caught! 
   // find the task: 
   rcu_read_lock(); 
   p = pid_task(find_vpid(callmodule_pid), PIDTYPE_PID); 
   rcu_read_unlock(); 
   if (p == NULL) { 
     printk(KBUILD_MODNAME ": p is NULL - exiting\n");
     return 0;
   }
   state_str = (p->state==-1)?"unrunnable":((p->state==0)?"runnable":"stopped");
   printk(KBUILD_MODNAME ": pid task a: %p c: %s p: [%d] s: %s\n",
     p, p->comm, p->pid, state_str);
   // find parent task:
   par = p->parent;
   if (par == NULL) {
     printk(KBUILD_MODNAME ": par is NULL - exiting\n");
     return 0;
   }
   state_str = (par->state==-1)?"unrunnable":((par->state==0)?"runnable":"stopped");
   printk(KBUILD_MODNAME ": parent task a: %p c: %s p: [%d] s: %s\n",
     par, par->comm, par->pid, state_str);

   // iterate through parent's (and our task's) child processes:
   rcu_read_lock(); // read_lock(&tasklist_lock);
   list_for_each(children_list_head, &par->children){
     p = list_entry(children_list_head, struct task_struct, sibling);
     printk(KBUILD_MODNAME ": - %s [%d] \n", p->comm, p->pid);
     if (p->pid == callmodule_pid) {
       list_for_each(cchildren_list_head, &p->children){
         pc = list_entry(cchildren_list_head, struct task_struct, sibling);
         printk(KBUILD_MODNAME ": - - %s [%d] \n", pc->comm, pc->pid);
       } 
     } 
   } 
   rcu_read_unlock(); //~ read_unlock(&tasklist_lock); 
  
   // NOTE: here p == callmodule_infoB->task !! 
   printk(KBUILD_MODNAME ": Trying to walk page table; addr task 0x%X ->mm ->start_code: 0x%08lX ->end_code: 0x%08lX \n", (unsigned int) callmodule_infoB->task, callmodule_infoB->task->mm->start_code, callmodule_infoB->task->mm->end_code);
   walk_page_table(0x08048000, callmodule_infoB->task); 
   // 080483c0 is start of .text; 08048474 start of main; for objdump -S wtest 
   walk_page_table(0x080483c0, callmodule_infoB->task); 
   walk_page_table(0x08048474, callmodule_infoB->task); 
  
   if (callmodule_infoB->last_page_physaddr != 0ULL) { 
     printk(KBUILD_MODNAME ": physaddr ");
     taddr = 0x080483c0; // .text
     offset = taddr - callmodule_infoB->task->mm->start_code;
     printk(": (0x%08lx ->) 0x%08llx ", taddr, callmodule_infoB->last_page_physaddr+offset);
     taddr = 0x08048474; // main
     offset = taddr - callmodule_infoB->task->mm->start_code;
     printk(": (0x%08lx ->) 0x%08llx ", taddr, callmodule_infoB->last_page_physaddr+offset);
     printk("\n");

     #if TRY_USE_KPROBES // can't use this here (BUG: scheduling while atomic, if probe inserts)
     //~ sprintf(trcmd, "p:myprobe 0x%08llx", callmodule_infoB->last_page_physaddr+offset);
     // try symbol for c10bcf60 - tracing_on
     sprintf(trcmd, "p:myprobe 0x%08llx", (unsigned long long)0xc10bcf60);
     ret = traceprobe_command(trcmd, ptr_create_trace_probe); //create_trace_probe);
     printk("%s -- ret: %d\n", trcmd, ret);
     // try find probe and enable it (compiles, but untested):
     tp = ptr_find_probe_event("myprobe", KPROBE_EVENT_SYSTEM);
     if (tp != NULL) probe_event_enable(&tp->call); 
     #endif //TRY_USE_KPROBES 
   } 
  
   hw_breakpoint_init(&attr); 
   attr.bp_len = sizeof(long); //HW_BREAKPOINT_LEN_1; 
   attr.bp_type = HW_BREAKPOINT_X ; 
   attr.bp_addr = 0x08048474; // main 
   sample_hbp = register_user_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler, p); 
   if (IS_ERR((void __force *)sample_hbp)) {
     int ret = PTR_ERR((void __force *)sample_hbp);
     printk(KBUILD_MODNAME ": Breakpoint registration failed (%d)\n", ret);
     //~ return ret;
   } else {
     // only dereference the perf_event pointer once it is known to be valid
     printk(KBUILD_MODNAME ": 0x08048474 id [%llu]\n", sample_hbp->id);
   }
  
   hw_breakpoint_init(&attrb); 
   attrb.bp_len = sizeof(long); 
   attrb.bp_type = HW_BREAKPOINT_X ; 
   attrb.bp_addr = 0x08048475; // first instruction after main 
   sample_hbpb = register_user_hw_breakpoint(&attrb, (perf_overflow_handler_t)sample_hbp_handler, p); 
   if (IS_ERR((void __force *)sample_hbpb)) {
     int ret = PTR_ERR((void __force *)sample_hbpb);
     printk(KBUILD_MODNAME ": Breakpoint registration failed (%d)\n", ret);
     //~ return ret;
   } else {
     // only dereference the perf_event pointer once it is known to be valid
     printk(KBUILD_MODNAME ": 0x08048475 id [%llu]\n", sample_hbpb->id);
   }
  
   printk(KBUILD_MODNAME ": (( 0x08048000 is_vmalloc_addr %d virt_addr_valid %d ))\n", is_vmalloc_addr((void*)0x08048000), virt_addr_valid(0x08048000));

   kill_pid(find_vpid(callmodule_pid), SIGCONT, 1); // resume/continue/restart task
   state_str = (p->state==-1)?"unrunnable":((p->state==0)?"runnable":"stopped");
   printk(KBUILD_MODNAME ": cont pid task a: %p c: %s p: [%d] s: %s\n",
     p, p->comm, p->pid, state_str);
  
   return 0; 
 } 
  
 static void __exit callmodule_exit(void) 
 { 
   tracing_off(); //corresponds to the user space /sys/kernel/debug/tracing/tracing_on file 
   printk(KBUILD_MODNAME ": < exit\n");
 } 
  
 module_init(callmodule_init); 
 module_exit(callmodule_exit); 
 MODULE_LICENSE("GPL");

Apologies for the longish post, I'm having trouble formulating it in a shorter way. Also, maybe this is more appropriate for Unix & Linux Stack Exchange, but I'll try here at SO first, as there is an ftrace tag.

Anyways - I'd like to observe do machine instructions of a user program execute in the context of a full function_graph capture using ftrace. One problem is that I need this for an older kernel:

$ uname -a
Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux

... and in this edition, there is no UPROBES - which, as Uprobes in 3.5 [LWN.net] notes, should be able to do something like that. (_{As long as I don't have to patch the original kernel, I would be willing to try a kernel module built out of tree, as User-Space Probes (Uprobes) [chunghwan.com] seems to demonstrate; but as far as I can see from 0: Inode based uprobes [LWN.net], the 2.6 would probably need a full patch})

However, on this version, there is a /sys/kernel/debug/kprobes, and /sys/kernel/debug/tracing/kprobe_events; and Documentation/trace/kprobetrace.txt implies that a kprobe can be set directly on an address; even if I cannot find an example anywhere on how this is used.

In any case, I would still not be sure what addresses to use - as a small example, let's say I want to trace the start of the main function of the wtest.c program (included below). I can do this to compile and obtain an machine instruction assembly listing:

$ gcc -g -O0 wtest.c -o wtest
$ objdump -S wtest | less
...
08048474 <main>:
int main(void) {
 8048474:       55                      push   %ebp
 8048475:       89 e5                   mov    %esp,%ebp
 8048477:       83 e4 f0                and    $0xfffffff0,%esp
 804847a:       83 ec 30                sub    $0x30,%esp
 804847d:       65 a1 14 00 00 00       mov    %gs:0x14,%eax
 8048483:       89 44 24 2c             mov    %eax,0x2c(%esp)
 8048487:       31 c0                   xor    %eax,%eax
  char filename[] = "/tmp/wtest.txt";
...
  return 0;
 804850a:       b8 00 00 00 00          mov    $0x0,%eax
}
...

I would set up ftrace logging via this script:

sudo bash -c '
KDBGPATH="/sys/kernel/debug/tracing"
echo function_graph > $KDBGPATH/current_tracer
echo funcgraph-abstime > $KDBGPATH/trace_options
echo funcgraph-proc > $KDBGPATH/trace_options
echo 0 > $KDBGPATH/tracing_on
echo > $KDBGPATH/trace
echo 1 > $KDBGPATH/tracing_on ; ./wtest ; echo 0 > $KDBGPATH/tracing_on
cat $KDBGPATH/trace > wtest.ftrace
'

You can see a portion of the (otherwise complex) resulting ftrace log in debugging - Observing a hard-disk write in kernel space (with drivers/modules) - Unix & Linux Stack Exchange (where I got the example from).

Basically, I'd want a printout in this ftrace log, when the first instructions of main - say, the instructions at 0x8048474, 0x8048475, 0x8048477, 0x804847a, 0x804847d, 0x8048483 and 0x8048487 - are executed by (any) CPU. The problem is, as far as I can understand from Anatomy of a Program in Memory : Gustavo Duarte, these addresses are the virtual addresses, as seen from the perspective of the process itself (and I gather, the same perspective is shown by /proc/PID/maps)... And apparently, for krpobe_event I'd need a physical address?

So, my idea would be: if I can find the physical addresses corresponding to the virtual addresses of the program disassembly (say by coding a kernel module, which would accept pid and address, and return the physical address via procfs), I could set up addresses as a sort of "tracepoints" via /sys/kernel/debug/tracing/kprobe_events in the above script - and hopefully get them in the ftrace log. Could this work, in principle?

One problem with this, I found on Linux(ubuntu), C language: Virtual to Physical Address Translation - Stack Overflow:

In user code, you can't know the physical address corresponding to a virtual address. This is information is simply not exported outside the kernel. It could even change at any time, especially if the kernel decides to swap out part of your process's memory.
...
Pass the virtual address to the kernel using systemcall/procfs and use vmalloc_to_pfn. Return the Physical address through procfs/registers.

However, vmalloc_to_pfn doesn't seem to be trivial either:

x86 64 - vmalloc_to_pfn returns 32 bit address on Linux 32 system. Why does it chop off higher bits of PAE physical address? - Stack Overflow

VA: 0xf8ab87fc PA using vmalloc_to_pfn: 0x36f7f7fc. But I'm actually expecting: 0x136f7f7fc.
...
The physical address falls between 4 to 5 GB. But I can't get the exact physical address, I only get the chopped off 32-bit address. Is there another way to get true physical address?

So, I'm not sure how reliably I could extract the physical addresses so they are traced by kprobes - especially since "it could even change at any time". But here, I would hope that since the program is small and trivial, there would be a reasonable chance that the program would not swap while being traced, allowing for a proper capture to be obtained. (_{So even if I have to run the debug script above multiple times, as long as I can hope to obtain a "proper" capture once out of 10 times (or even 100 times), I'd be OK with it.}).

Note that I'd want an output through ftrace, so that the timestamps are expressed in the same domain (see Reliable Linux kernel timestamps (or adjustment thereof) with both usbmon and ftrace? - Stack Overflow for an illustration of a problem with timestamps). Thus, even if I could come up with, say, a gdb script, to run and trace the program from userspace (while simultaneously an ftrace capture is obtained) - I'd like to avoid that, as the overhead from gdb itself will show in the ftrace logs.

So, in summary:

Is the approach of obtaining (possibly through a separate kernel module) physical addresses from the virtual (from a disassembly of an executable) addresses - so they are used to trigger a kprobe_event logged by ftrace - worth pursuing? If so, are there any examples of kernel modules that can be used for this address translation purpose?
Could I otherwise use a kernel module to "register" a callback/handler function when a particular memory address is being executed? Then I could simply use a trace_printk in that function to have an ftrace log (or even without that, the handler function name itself should show in the ftrace log), and it doesn't seem there will be too much overhead with that...

Actually, in this 2007 posting, Jim Keniston - utrace-based uprobes: systemtap mailing list, there is a 11. Uprobes Example (added to Documentation/uprobes.txt), which seems to be exactly that - a kernel module registering a handler function. Unfortunately, it uses linux/uprobes.h; and I have only kprobes.h in my /usr/src/linux-headers-2.6.38-16/include/linux/. Also, on my system, even systemtap complains about CONFIG_UTRACE not being enabled (see this comment)... So if there's any other approach I could use to obtain a debug trace like I want, without having to recompile the kernel to get uprobes, it would be great to know...

wtest.c:

#include <stdio.h>
#include <fcntl.h>  // O_CREAT, O_WRONLY, S_IRUSR

int main(void) {
  char filename[] = "/tmp/wtest.txt";
  char buffer[] = "abcd";
  int fd;
  mode_t perms = S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH;

  fd = open(filename, O_RDWR|O_CREAT, perms);
  write(fd,buffer,4);
  close(fd);

  return 0;
}

Solution

Obviously, this would be much easier with built-in uprobes on kernels 3.5+; but given that uprobes for my kernel 2.6.38 is a very invasive patch (which I couldn't really isolate in a separate kernel module, so as to avoid patching the kernel), here is what I can note for a standalone module on 2.6.38. (Since I'm still unsure of many things, I would still like to see an answer that corrects any misunderstandings in this post.)

I think I got somewhere, but not with kprobes. I'm not sure, but it seems I managed to get the physical addresses right; however, the kprobes documentation specifically says "@ADDR : fetch memory at ADDR (ADDR should be in kernel)", and the physical addresses I get are below the kernel boundary of 0xc0000000 (but then, isn't 0xc0000000 a boundary in the virtual memory layout rather than the physical one?).
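
For context on the "ADDR should be in kernel" wording: a kprobe address is a kernel virtual address, and on 32-bit x86 with the default 3G/1G split kernel addresses start at 0xc0000000. The sketch below shows that necessary (but not sufficient) condition, applied to the addresses that appear in the notes further down. `SKETCH_PAGE_OFFSET` and the function name are hypothetical stand-ins, and the split is an assumption about this kernel's configuration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xc0000000u /* assumption: default 3G/1G split */

/* True if addr lies in the kernel half of a 32-bit address space.
 * A real kprobe additionally needs the address to be probeable kernel text. */
bool looks_like_kernel_vaddr(uint32_t addr)
{
    return addr >= SKETCH_PAGE_OFFSET;
}
```

Under these assumptions 0xc10bcf60 (the kallsyms address of tracing_on()) passes the check, while both the user virtual address 0x08048474 and the physical address 0x639ec474 fail it.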

So I used a hardware breakpoint instead - the module is below, however caveat emptor - it behaves randomly, and occasionally can cause a kernel oops! After compiling the module and running the following in bash:

$ sudo bash -c 'KDBGPATH="/sys/kernel/debug/tracing" ;
echo function_graph > $KDBGPATH/current_tracer ; echo funcgraph-abstime > $KDBGPATH/trace_options
echo funcgraph-proc > $KDBGPATH/trace_options ; echo 8192 > $KDBGPATH/buffer_size_kb ;
echo 0 > $KDBGPATH/tracing_on ; echo > $KDBGPATH/trace'
$ sudo insmod ./callmodule.ko && sleep 0.1 && sudo rmmod callmodule && \
tail -n25 /var/log/syslog | tee log.txt && \
sudo cat /sys/kernel/debug/tracing/trace >> log.txt

... I get a log. I want to trace the first two instructions of the main() of wtest, which for me are:

$ objdump -S wtest/wtest | grep -A3 'int main'
int main(void) {
 8048474:   55                      push   %ebp
 8048475:   89 e5                   mov    %esp,%ebp
 8048477:   83 e4 f0                and    $0xfffffff0,%esp

... at virtual addresses 0x08048474 and 0x08048475. In the syslog output, I could get, say:

...
[ 1106.383011] callmodule: parent task a: f40a9940 c: kworker/u:1 p: [14] s: stopped
[ 1106.383017] callmodule: - wtest [9404]
[ 1106.383023] callmodule: Trying to walk page table; addr task 0xEAE90CA0 ->mm ->start_code: 0x08048000 ->end_code: 0x080485F4
[ 1106.383029] callmodule: walk_ 0x8048000 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @   (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[ 1106.383049] callmodule: walk_ 0x80483c0 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @   (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[ 1106.383067] callmodule: walk_ 0x8048474 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f63e5d80; *virtual (page_address) @   (null) (is_vmalloc_addr 0 virt_addr_valid 0 virt_to_phys 0x40000000) page_to_pfn 639ec page_to_phys 0x639ec000
[ 1106.383083] callmodule: physaddr : (0x080483c0 ->) 0x639ec3c0 : (0x08048474 ->) 0x639ec474
[ 1106.383106] callmodule: 0x08048474 id [3]
[ 1106.383113] callmodule: 0x08048475 id [4]
[ 1106.383118] callmodule: (( 0x08048000 is_vmalloc_addr 0 virt_addr_valid 0 ))
[ 1106.383130] callmodule: cont pid task a: eae90ca0 c: wtest p: [9404] s: runnable
[ 1106.383147] initcall callmodule_init+0x0/0x1000 [callmodule] returned with preemption imbalance
[ 1106.518074] callmodule: < exit

... meaning that it mapped the virtual address 0x08048474 to physical address 0x639ec474. However, the physical is not used for hardware breakpoints - there we can supply a virtual address directly to register_user_hw_breakpoint; however, we also need to supply the task_struct of the process too. With that, I can get something like this in the ftrace output:

...
  597.907256 |   1)   wtest-5339   |               |  handle_mm_fault() {
...
  597.907310 |   1)   wtest-5339   | + 35.627 us   |      }
  597.907311 |   1)   wtest-5339   | + 46.245 us   |    }
  597.907312 |   1)   wtest-5339   | + 56.143 us   |  }
  597.907313 |   1)   wtest-5339   |   1.039 us    |  up_read();
  597.907317 |   1)   wtest-5339   |   1.285 us    |  native_get_debugreg();
  597.907319 |   1)   wtest-5339   |   1.075 us    |  native_set_debugreg();
  597.907322 |   1)   wtest-5339   |   1.129 us    |  native_get_debugreg();
  597.907324 |   1)   wtest-5339   |   1.189 us    |  native_set_debugreg();
  597.907329 |   1)   wtest-5339   |               |  () {
  597.907333 |   1)   wtest-5339   |               |  /* callmodule: hwbp hit: id [3] */
  597.907334 |   1)   wtest-5339   |   5.567 us    |  }
  597.907336 |   1)   wtest-5339   |   1.123 us    |  native_set_debugreg();
  597.907339 |   1)   wtest-5339   |   1.130 us    |  native_get_debugreg();
  597.907341 |   1)   wtest-5339   |   1.075 us    |  native_set_debugreg();
  597.907343 |   1)   wtest-5339   |   1.075 us    |  native_get_debugreg();
  597.907345 |   1)   wtest-5339   |   1.081 us    |  native_set_debugreg();
  597.907348 |   1)   wtest-5339   |               |  () {
  597.907350 |   1)   wtest-5339   |               |  /* callmodule: hwbp hit: id [4] */
  597.907351 |   1)   wtest-5339   |   3.033 us    |  }
  597.907352 |   1)   wtest-5339   |   1.105 us    |  native_set_debugreg();
  597.907358 |   1)   wtest-5339   |   1.315 us    |  down_read_trylock();
  597.907360 |   1)   wtest-5339   |   1.123 us    |  _cond_resched();
  597.907362 |   1)   wtest-5339   |   1.027 us    |  find_vma();
  597.907364 |   1)   wtest-5339   |               |  handle_mm_fault() {
...

... where the traces corresponding to the assembly are marked by breakpoint id. Thankfully, they come right one after another, as expected; however, ftrace has also captured some debug register accesses (native_get_debugreg() / native_set_debugreg()) in-between. In any case, this is what I wanted to see.
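
To pull just the breakpoint hits out of such a capture, the marker lines can be matched on the text written by trace_printk. The following is a minimal sketch, assuming the line layout shown above (absolute timestamp in the first column, as produced with funcgraph-abstime); the function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 and fills *timestamp and *id if line is a "hwbp hit" marker
 * from the function_graph output, otherwise returns 0. */
int parse_hwbp_hit(const char *line, double *timestamp, unsigned long long *id)
{
    const char *marker = strstr(line, "hwbp hit: id [");

    if (marker == NULL)
        return 0;
    /* the first column is the absolute timestamp in seconds */
    if (sscanf(line, " %lf", timestamp) != 1)
        return 0;
    /* the breakpoint id sits between the square brackets */
    if (sscanf(marker, "hwbp hit: id [%llu]", id) != 1)
        return 0;
    return 1;
}
```

Feeding it the two marker lines from the capture above would give ids 3 and 4 with their timestamps, so the time between the two instructions can be read off in the same clock domain as the rest of the trace.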

Here are some notes about the module:

Most of the module is from Execute/invoke user-space program, and get its pid, from a kernel module ; where a user process is started and pid obtained
- Since we have to get to the task_struct to get to the pid; here I save both (which is kind of redundant)
Where function symbols are not exported: if the symbol is in kallsyms, then I use a function pointer to the address; else other needed functions are copied from source
I didn't know how to start the user-space process in a stopped state, so after spawning I issue a SIGSTOP (which, on its own, seems kind of unreliable at that point) and set the state to __TASK_STOPPED.
- I may still get status "runnable" where I don't expect it sometimes - however, if the init exits early with an error, I've noticed wtest hanging in process list long after it would have terminated naturally, so I guess that works.
To get absolute/physical addresses, I used Walking page tables of a process in Linux to get to the page corresponding to a virtual address, and then digging through kernel sources I found page_to_phys() to get to the address (internally via page frame number); LDD3 ch.15 helps with understanding relationship between pfn and physical address.
- Since here I expect to have physical address, I don't use PAGE_SHIFT, but calculate offsets directly from objdump's assembly output - I am not 100% sure this is correct, though.
- Note (see also How to get a struct page from any address in the Linux kernel): the module output says that the virtual address 0x08048000 is neither is_vmalloc_addr nor virt_addr_valid; I guess this should tell me that one could have used neither vmalloc_to_pfn() nor virt_to_page() to get to its physical address!?
Setting up kprobes for ftrace from kernel space is kinda tricky (needs functions copied)
- Trying to set a kprobe on the physical addresses I get (e.g. 0x639ec474) always results in "Could not insert probe(-22)"
- Just to see if the format is parsed, I'm trying with the kallsyms address of the tracing_on() function (0xc10bcf60) below; that seems to work - because it raises a fatal "BUG: scheduling while atomic" (apparently, we're not meant to set breakpoints in module_init?). The bug is fatal, because it makes the kprobes directory disappear from the ftrace debug directory
- Just creating the kprobe would not make it appear in the ftrace log - it also needs to be enabled; the necessary code for enabling is there - but I've never tried it, because of the previous bug
Finally, the breakpoint setting is from Watch a variable (memory address) change in Linux kernel, and print stack trace when it changes?
- I've never seen an example for setting an executable hardware breakpoint; it kept failing for me, until through a kernel source search, I found that for HW_BREAKPOINT_X, attr.bp_len needs to be set to sizeof(long)
- If I try to printk the attr variable(s) - from _init or from the handler - something gets seriously messed up, and whatever variable I try to print next, I get value 0x5 (or 0x48) for it (?!)
- Since I'm trying to use a single handler function for both breakpoints, the only reliable piece of info that survives from _init to the handler, able to differentiate between the two, seems to be bp->id
- These ids are auto-assigned, and it seems they are not reclaimed if you unregister the breakpoints (I do not unregister them, to avoid extra ftrace printouts).
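
On the offset calculation mentioned in the notes above: only the offset within a page carries over unchanged from a virtual to a physical address, so the physical address is the frame's base plus `vaddr & (PAGE_SIZE - 1)`. Subtracting `start_code` gives the same result only while the address is in the first page of the text segment (as main at 0x08048474 is here). This is a small user-space sketch of the difference, assuming 4 KiB pages; the function names are hypothetical, and 0x08049010 is an invented address used only to show the divergence.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Offset of vaddr inside its own page. */
uint32_t in_page_offset(uint32_t vaddr)
{
    return vaddr & (SKETCH_PAGE_SIZE - 1u);
}

/* page_phys must be the base of the frame backing the page containing vaddr. */
uint64_t phys_addr(uint64_t page_phys, uint32_t vaddr)
{
    return page_phys + in_page_offset(vaddr);
}

/* What the module computes: offset relative to the start of the text segment. */
uint32_t start_code_offset(uint32_t vaddr, uint32_t start_code)
{
    return vaddr - start_code;
}
```

For 0x08048474 both offsets are 0x474, giving 0x639ec474 as in the syslog output. For an address in the next page, such as 0x08049010, the start_code-relative offset is 0x1010 and would step past the end of the frame, whereas the in-page offset is 0x10 and has to be added to the (possibly unrelated) frame found by walking the page table for that address.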

As far as the randomness goes, I think this is because the process is not started in a stopped state; and by the time it gets stopped, it ends up in a different state (or, quite possibly, I'm missing some locking somewhere). Anyways, you can also expect in syslog:

[ 1661.815114] callmodule: Trying to walk page table; addr task 0xEAF68CA0 ->mm ->start_code: 0x08048000 ->end_code: 0x080485F4
[ 1661.815319] callmodule: walk_ 0x8048000 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0
[ 1661.815837] callmodule: walk_ 0x80483c0 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0
[ 1661.816846] callmodule: walk_ 0x8048474 callmodule: Valid pgd : Valid pud: Valid pmd: page frame struct is @ f5772000; *virtual (page_address) @ c0000000 (is_vmalloc_addr 0 virt_addr_valid 1 virt_to_phys 0x0) page_to_pfn 0 page_to_phys 0x0

... that is, even with a proper task pointer (judging by start_code), only 0x0 is obtained as physical address. Sometimes you get the same outcome, but with start_code: 0x00000000 ->end_code: 0x00000000. And sometimes, a task_struct cannot be obtained, even if pid can:

[  833.380417] callmodule:c: pid 7663
[  833.380424] callmodule: everything all right; pid 7663 (7663)
[  833.380430] callmodule: p is NULL - exiting
[  833.516160] callmodule: < exit
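
A remark on the `virt_to_phys` values in the syslog excerpts (0x40000000 in the first, 0x0 in the second): they look like plain arithmetic rather than real translations. On 32-bit x86, `virt_to_phys()` for a lowmem address amounts to subtracting `PAGE_OFFSET`, so a NULL `page_address()` (which is what a highmem page without a kernel mapping would presumably give) wraps around to 0x40000000, and 0xc0000000 maps to 0. The sketch below reproduces that subtraction in user space; `SKETCH_PAGE_OFFSET` and the function name are stand-ins rather than kernel symbols, and the 0xc0000000 value assumes the default 3G/1G split.

```c
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xc0000000u /* assumption: default 3G/1G split */

/* Lowmem-only translation: kernel virtual address minus PAGE_OFFSET,
 * computed in 32-bit arithmetic so that it wraps like on i386. */
uint32_t sketch_virt_to_phys(uint32_t kernel_vaddr)
{
    return kernel_vaddr - SKETCH_PAGE_OFFSET;
}
```

So the 0x40000000 in the first log says nothing about where the page really is (page_to_phys reports 0x639ec000 there), and the 0x0 in the second log simply follows from page_address() returning 0xc0000000 for pfn 0.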

Well, hopefully someone will comment and clarify some of the behavior of this module :)
Hope this helps someone,
Cheers!

Makefile:

EXTRA_CFLAGS=-g -O0
obj-m += callmodule.o
all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

callmodule.c:

#include <linux/module.h>
#include <linux/slab.h> //kzalloc
#include <linux/syscalls.h> // SIGCHLD, ... sys_wait4, ...
#include <linux/kallsyms.h> // kallsyms_lookup, print_symbol
#include <linux/highmem.h> // ‘kmap_atomic’ (via pte_offset_map)
#include <asm/io.h> // page_to_phys (arch/x86/include/asm/io.h)

struct subprocess_infoB; // forward declare
// global variable - to avoid intervening too much in the return of call_usermodehelperB:
static int callmodule_pid;
static struct subprocess_infoB* callmodule_infoB;
#define TRY_USE_KPROBES 0 // 1 // enable/disable kprobes usage code
#include <linux/kprobes.h> // enable_kprobe
// for hardware breakpoint:
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>

// define a modified struct (with extra fields) here:
struct subprocess_infoB {
  struct work_struct work;
  struct completion *complete;
  char *path;
  char **argv;
  char **envp;
  int wait; //enum umh_wait wait;
  int retval;
  int (*init)(struct subprocess_info *info);
  void (*cleanup)(struct subprocess_info *info);
  void *data;
  pid_t pid;
  struct task_struct *task;
  unsigned long long last_page_physaddr;
};

struct subprocess_infoB *call_usermodehelper_setupB(char *path, char **argv,
                          char **envp, gfp_t gfp_mask);

static inline int
call_usermodehelper_fnsB(char *path, char **argv, char **envp,
            int wait, //enum umh_wait wait,
            int (*init)(struct subprocess_info *info),
            void (*cleanup)(struct subprocess_info *), void *data)
{
  struct subprocess_info *info;
  struct subprocess_infoB *infoB;
  gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
  int ret;

  populate_rootfs_wait();

  infoB = call_usermodehelper_setupB(path, argv, envp, gfp_mask);
  // check for allocation failure before the first dereference of infoB
  if (infoB == NULL)
      return -ENOMEM;
  printk(KBUILD_MODNAME ":a: pid %d\n", infoB->pid);
  info = (struct subprocess_info *) infoB;

  call_usermodehelper_setfns(info, init, cleanup, data);
  printk(KBUILD_MODNAME ":b: pid %d\n", infoB->pid);

  // this must be called first, before infoB->pid is populated (by __call_usermodehelperB):
  ret = call_usermodehelper_exec(info, wait);

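  // assign global pid (and infoB) here, so rest of the code has it:
  // NOTE (not verified on this kernel build): mainline call_usermodehelper_exec()
  // frees its subprocess_info before returning for the waiting modes; if that holds
  // here, infoB is already dangling and later reads through it are use-after-free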
  callmodule_pid = infoB->pid;
  callmodule_infoB = infoB;    
  printk(KBUILD_MODNAME ":c: pid %d\n", callmodule_pid);

  return ret;
}

static inline int
call_usermodehelperB(char *path, char **argv, char **envp, int wait) //enum umh_wait wait)
{
  return call_usermodehelper_fnsB(path, argv, envp, wait,
                     NULL, NULL, NULL);
}

static void __call_usermodehelperB(struct work_struct *work)
{
  struct subprocess_infoB *sub_infoB =
      container_of(work, struct subprocess_infoB, work);
  int wait = sub_infoB->wait; // enum umh_wait wait = sub_info->wait;
  pid_t pid;
  struct subprocess_info *sub_info;
  // hack - declare function pointers
  int (*ptrwait_for_helper)(void *data);
  int (*ptr____call_usermodehelper)(void *data);
  // assign function pointers to verbatim addresses as obtained from /proc/kallsyms
  int killret;
  struct task_struct *spawned_task;
  ptrwait_for_helper = (void *)0xc1065b60;
  ptr____call_usermodehelper = (void *)0xc1065ed0;

  sub_info = (struct subprocess_info *)sub_infoB;

  if (wait == UMH_WAIT_PROC)
      pid = kernel_thread((*ptrwait_for_helper), sub_info, //(wait_for_helper, sub_info,
                  CLONE_FS | CLONE_FILES | SIGCHLD);
  else
      pid = kernel_thread((*ptr____call_usermodehelper), sub_info, //(____call_usermodehelper, sub_info,
                  CLONE_VFORK | SIGCHLD);

  spawned_task = pid_task(find_vpid(pid), PIDTYPE_PID);

  // stop/suspend/pause task
  killret = kill_pid(find_vpid(pid), SIGSTOP, 1); 
  if (spawned_task!=NULL) {
    // does this stop the process really?
    spawned_task->state = __TASK_STOPPED;
    printk(KBUILD_MODNAME ": : exst %d exco %d exsi %d diex %d inex %d inio %d\n", spawned_task->exit_state, spawned_task->exit_code, spawned_task->exit_signal, spawned_task->did_exec, spawned_task->in_execve, spawned_task->in_iowait);
  }
  printk(KBUILD_MODNAME ": : (kr: %d)\n", killret);
  printk(KBUILD_MODNAME ": : pid %d (%p) (%s)\n", pid, spawned_task,
    (spawned_task!=NULL)?((spawned_task->state==-1)?"unrunnable":((spawned_task->state==0)?"runnable":"stopped")):"null" );
  // grab and save the pid (and task_struct) here:
  sub_infoB->pid = pid;
  sub_infoB->task = spawned_task;
    switch (wait) {
    case UMH_NO_WAIT:
        call_usermodehelper_freeinfo(sub_info);
        break;
    case UMH_WAIT_PROC:
        if (pid > 0)
            break;
        /* FALLTHROUGH */
    case UMH_WAIT_EXEC:
        if (pid < 0)
            sub_info->retval = pid;
        complete(sub_info->complete);
    }
}

struct subprocess_infoB *call_usermodehelper_setupB(char *path, char **argv,
                          char **envp, gfp_t gfp_mask)
{
    struct subprocess_infoB *sub_infoB;
    sub_infoB = kzalloc(sizeof(struct subprocess_infoB), gfp_mask);
    if (!sub_infoB)
        goto out;

    INIT_WORK(&sub_infoB->work, __call_usermodehelperB);
    sub_infoB->path = path;
    sub_infoB->argv = argv;
    sub_infoB->envp = envp;
  out:
    return sub_infoB;
}

#if TRY_USE_KPROBES
// copy from /kernel/trace/trace_probe.c (is unexported)
int traceprobe_command(const char *buf, int (*createfn)(int, char **))
{
  char **argv;
  int argc, ret;

  argc = 0;
  ret = 0;
  argv = argv_split(GFP_KERNEL, buf, &argc);
  if (!argv)
    return -ENOMEM;

  if (argc)
    ret = createfn(argc, argv);

  argv_free(argv);

  return ret;
}

// copy from kernel/trace/trace_kprobe.c?v=2.6.38 (is unexported)
#define TP_FLAG_TRACE   1
#define TP_FLAG_PROFILE 2
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
struct fetch_param {
  fetch_func_t    fn;
  void *data;
};
typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, void *);
enum {
  FETCH_MTD_reg = 0,
  FETCH_MTD_stack,
  FETCH_MTD_retval,
  FETCH_MTD_memory,
  FETCH_MTD_symbol,
  FETCH_MTD_deref,
  FETCH_MTD_END,
};
/* Fetch type information table */
struct fetch_type {
  const char      *name;          /* Name of type */
  size_t          size;           /* Byte size of type */
  int             is_signed;      /* Signed flag */
  print_type_func_t       print;  /* Print functions */
  const char      *fmt;           /* Format string */
  const char      *fmttype;       /* Name in format file */
  /* Fetch functions */
  fetch_func_t    fetch[FETCH_MTD_END];
};
struct probe_arg {
  struct fetch_param      fetch;
  struct fetch_param      fetch_size;
  unsigned int            offset; /* Offset from argument entry */
  const char              *name;  /* Name of this argument */
  const char              *comm;  /* Command of this argument */
  const struct fetch_type *type;  /* Type of this argument */
};
struct trace_probe {
  struct list_head        list;
  struct kretprobe        rp;     /* Use rp.kp for kprobe use */
  unsigned long           nhit;
  unsigned int            flags;  /* For TP_FLAG_* */
  const char              *symbol;        /* symbol name */
  struct ftrace_event_class       class;
  struct ftrace_event_call        call;
  ssize_t                 size;           /* trace entry size */
  unsigned int            nr_args;
  struct probe_arg        args[];
};
static  int probe_is_return(struct trace_probe *tp)
{
  return tp->rp.handler != NULL;
}
static int probe_event_enable(struct ftrace_event_call *call)
{
  struct trace_probe *tp = (struct trace_probe *)call->data;

  tp->flags |= TP_FLAG_TRACE;
  if (probe_is_return(tp))
    return enable_kretprobe(&tp->rp);
  else
    return enable_kprobe(&tp->rp.kp);
}
#define KPROBE_EVENT_SYSTEM "kprobes"
#endif // TRY_USE_KPROBES

// <<<<<<<<<<<<<<<<<<<<<<

static struct page *walk_page_table(unsigned long addr, struct task_struct *intask)
{
  pgd_t *pgd;
  pte_t *ptep, pte;
  pud_t *pud;
  pmd_t *pmd;

  struct page *page = NULL;
  struct mm_struct *mm = intask->mm;

  callmodule_infoB->last_page_physaddr = 0ULL; // reset here, in case of early exit

  printk(KBUILD_MODNAME ": walk_ 0x%lx ", addr);

  pgd = pgd_offset(mm, addr);
  if (pgd_none(*pgd) || pgd_bad(*pgd))
    goto out;
  printk(KBUILD_MODNAME ": Valid pgd ");

  pud = pud_offset(pgd, addr);
  if (pud_none(*pud) || pud_bad(*pud))
    goto out;
  printk( ": Valid pud");

  pmd = pmd_offset(pud, addr);
  if (pmd_none(*pmd) || pmd_bad(*pmd))
    goto out;
  printk( ": Valid pmd");

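  ptep = pte_offset_map(pmd, addr);
  if (!ptep)
    goto out;
  pte = *ptep;
  // pte_offset_map() may kmap_atomic() the page table (with CONFIG_HIGHPTE), so it
  // has to be paired with pte_unmap(); leaving it mapped is a likely source of the
  // "returned with preemption imbalance" message in syslog
  pte_unmap(ptep);

  // a pte that is not present (page not faulted in yet, or swapped out) has no
  // valid frame behind it; pte_page() on it would yield a bogus page such as pfn 0
  if (!pte_present(pte))
    goto out;

  page = pte_page(pte);
  if (page) {
    callmodule_infoB->last_page_physaddr = (unsigned long long)page_to_phys(page);
    printk( ": page frame struct is @ %p; *virtual (page_address) @ %p (is_vmalloc_addr %d virt_addr_valid %d virt_to_phys 0x%llx) page_to_pfn %lx page_to_phys 0x%llx", page, page_address(page), is_vmalloc_addr((void*)page_address(page)), virt_addr_valid(page_address(page)), (unsigned long long)virt_to_phys(page_address(page)), page_to_pfn(page), callmodule_infoB->last_page_physaddr);
  }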

out:
  printk("\n");
  return page;
}

static void sample_hbp_handler(struct perf_event *bp,
             struct perf_sample_data *data,
             struct pt_regs *regs)
{
  trace_printk(KBUILD_MODNAME ": hwbp hit: id [%llu]\n", bp->id );
  //~ unregister_hw_breakpoint(bp);
}

// ----------------------

static int __init callmodule_init(void)
{
  int ret = 0;
  char userprog[] = "/path/to/wtest";
  char *argv[] = {userprog, "2", NULL };
  char *envp[] = {"HOME=/", "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };
  struct task_struct *p;
  struct task_struct *par;
  struct task_struct *pc;
  struct list_head *children_list_head;
  struct list_head *cchildren_list_head;
  char *state_str;
  unsigned long offset, taddr;
  int (*ptr_create_trace_probe)(int argc, char **argv); 
  struct trace_probe* (*ptr_find_probe_event)(const char *event, const char *group);
  //int (*ptr_probe_event_enable)(struct ftrace_event_call *call); // not exported, copy
  #if TRY_USE_KPROBES
  char trcmd[256] = "";
  struct trace_probe *tp;
  #endif //TRY_USE_KPROBES
  struct perf_event *sample_hbp, *sample_hbpb;
  struct perf_event_attr attr, attrb;

  printk(KBUILD_MODNAME ": > init %s\n", userprog);

  ptr_create_trace_probe = (void *)0xc10d5120;
  ptr_find_probe_event = (void *)0xc10d41e0;
  print_symbol(KBUILD_MODNAME ": symbol @ 0xc1065b60 is %s\n", 0xc1065b60); // shows wait_for_helper+0x0/0xb0
  print_symbol(KBUILD_MODNAME ": symbol @ 0xc1065ed0 is %s\n", 0xc1065ed0); // shows ____call_usermodehelper+0x0/0x90
  print_symbol(KBUILD_MODNAME ": symbol @ 0xc10d5120 is %s\n", 0xc10d5120); // shows create_trace_probe+0x0/0x590
  ret = call_usermodehelperB(userprog, argv, envp, UMH_WAIT_EXEC); 
  if (ret != 0)
      printk(KBUILD_MODNAME ": error in call to usermodehelper: %i\n", ret);
  else
      printk(KBUILD_MODNAME ": everything all right; pid %d (%d)\n", callmodule_pid, callmodule_infoB->pid);
  tracing_on(); // earlier, so trace_printk of handler is caught!
  // find the task:
  rcu_read_lock();
  p = pid_task(find_vpid(callmodule_pid), PIDTYPE_PID);
  rcu_read_unlock();
  if (p == NULL) {
    printk(KBUILD_MODNAME ": p is NULL - exiting\n");
    return 0;
  }
  state_str = (p->state==-1)?"unrunnable":((p->state==0)?"runnable":"stopped");
  printk(KBUILD_MODNAME ": pid task a: %p c: %s p: [%d] s: %s\n",
    p, p->comm, p->pid, state_str);
  // find parent task:
  par = p->parent;
  if (par == NULL) {
    printk(KBUILD_MODNAME ": par is NULL - exiting\n");
    return 0;
  }
  state_str = (par->state==-1)?"unrunnable":((par->state==0)?"runnable":"stopped");
  printk(KBUILD_MODNAME ": parent task a: %p c: %s p: [%d] s: %s\n",
    par, par->comm, par->pid, state_str);

  // iterate through parent's (and our task's) child processes:
  rcu_read_lock(); // read_lock(&tasklist_lock);
  list_for_each(children_list_head, &par->children){
    p = list_entry(children_list_head, struct task_struct, sibling);
    printk(KBUILD_MODNAME ": - %s [%d] \n", p->comm, p->pid);
    if (p->pid == callmodule_pid) {
      list_for_each(cchildren_list_head, &p->children){
        pc = list_entry(cchildren_list_head, struct task_struct, sibling);
        printk(KBUILD_MODNAME ": - - %s [%d] \n", pc->comm, pc->pid);
      }
    }
  }
  rcu_read_unlock(); //~ read_unlock(&tasklist_lock);

  // NOTE: here p == callmodule_infoB->task !!
  printk(KBUILD_MODNAME ": Trying to walk page table; addr task 0x%X ->mm ->start_code: 0x%08lX ->end_code: 0x%08lX \n", (unsigned int) callmodule_infoB->task, callmodule_infoB->task->mm->start_code, callmodule_infoB->task->mm->end_code);
  walk_page_table(0x08048000, callmodule_infoB->task);
  // 080483c0 is start of .text; 08048474 start of main; for objdump -S wtest
  walk_page_table(0x080483c0, callmodule_infoB->task);
  walk_page_table(0x08048474, callmodule_infoB->task);

  if (callmodule_infoB->last_page_physaddr != 0ULL) {
    printk(KBUILD_MODNAME ": physaddr ");
    taddr = 0x080483c0; // .text
    offset = taddr - callmodule_infoB->task->mm->start_code;
    printk(": (0x%08lx ->) 0x%08llx ", taddr, callmodule_infoB->last_page_physaddr+offset);
    taddr = 0x08048474; // main
    offset = taddr - callmodule_infoB->task->mm->start_code;
    printk(": (0x%08lx ->) 0x%08llx ", taddr, callmodule_infoB->last_page_physaddr+offset);
    printk("\n");

    #if TRY_USE_KPROBES // can't use this here (BUG: scheduling while atomic, if probe inserts)
    //~ sprintf(trcmd, "p:myprobe 0x%08llx", callmodule_infoB->last_page_physaddr+offset);
    // try symbol for c10bcf60 - tracing_on
    sprintf(trcmd, "p:myprobe 0x%08llx", (unsigned long long)0xc10bcf60);
    ret = traceprobe_command(trcmd, ptr_create_trace_probe); //create_trace_probe);
    printk("%s -- ret: %d\n", trcmd, ret);
    // try find probe and enable it (compiles, but untested):
    tp = ptr_find_probe_event("myprobe", KPROBE_EVENT_SYSTEM);
    if (tp != NULL) probe_event_enable(&tp->call);
    #endif //TRY_USE_KPROBES
  }

  hw_breakpoint_init(&attr);
  attr.bp_len = sizeof(long); //HW_BREAKPOINT_LEN_1;
  attr.bp_type = HW_BREAKPOINT_X ;
  attr.bp_addr = 0x08048474; // main
  sample_hbp = register_user_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler, p);
  if (IS_ERR((void __force *)sample_hbp)) {
    int ret = PTR_ERR((void __force *)sample_hbp);
    printk(KBUILD_MODNAME ": Breakpoint registration failed (%d)\n", ret);
    //~ return ret;
  } else {
    // only dereference the perf_event once we know it is not an error pointer
    printk(KBUILD_MODNAME ": 0x08048474 id [%llu]\n", sample_hbp->id);
  }

  hw_breakpoint_init(&attrb);
  attrb.bp_len = sizeof(long);
  attrb.bp_type = HW_BREAKPOINT_X ;
  attrb.bp_addr = 0x08048475; // second instruction of main (mov %esp,%ebp)
  sample_hbpb = register_user_hw_breakpoint(&attrb, (perf_overflow_handler_t)sample_hbp_handler, p);
  if (IS_ERR((void __force *)sample_hbpb)) {
    int ret = PTR_ERR((void __force *)sample_hbpb);
    printk(KBUILD_MODNAME ": Breakpoint registration failed (%d)\n", ret);
    //~ return ret;
  } else {
    // only dereference the perf_event once we know it is not an error pointer
    printk(KBUILD_MODNAME ": 0x08048475 id [%llu]\n", sample_hbpb->id);
  }

  printk(KBUILD_MODNAME ": (( 0x08048000 is_vmalloc_addr %d virt_addr_valid %d ))\n", is_vmalloc_addr((void*)0x08048000), virt_addr_valid(0x08048000));

  kill_pid(find_vpid(callmodule_pid), SIGCONT, 1); // resume/continue/restart task
  state_str = (p->state==-1)?"unrunnable":((p->state==0)?"runnable":"stopped");
  printk(KBUILD_MODNAME ": cont pid task a: %p c: %s p: [%d] s: %s\n",
    p, p->comm, p->pid, state_str);

  return 0;
}

static void __exit callmodule_exit(void)
{
  tracing_off(); //corresponds to the user space /sys/kernel/debug/tracing/tracing_on file
  printk(KBUILD_MODNAME ": < exit\n");
}

module_init(callmodule_init);
module_exit(callmodule_exit);
MODULE_LICENSE("GPL");
