ftrace: system crash when changing current_tracer from function_graph via echo
Question
I have been playing with ftrace recently to monitor some behavior characteristics of my system. I've been switching the trace on and off via a small script. After running the script, my system would crash and reboot itself. Initially, I believed that there might be an error in the script itself, but I have since determined that the crash and reboot is a result of echoing some tracer to /sys/kernel/debug/tracing/current_tracer when current_tracer is set to function_graph.
That is, the following sequence of commands produces the crash/reboot:
echo "function_graph" > /sys/kernel/debug/tracing/current_tracer
echo "function" > /sys/kernel/debug/tracing/current_tracer
During the reboot after the crash caused by the above echo statements, I see a lot of output that reads:
Clearing orphaned inode <inode>
I tried to reproduce this problem by changing the current_tracer value from function_graph to something else in a C program:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

/* Open current_tracer for writing; exit if it cannot be opened. */
int openCurrentTracer(void)
{
    int fd = open("/sys/kernel/debug/tracing/current_tracer", O_WRONLY);
    if(fd < 0)
        exit(1);
    return fd;
}

/* Write the tracer name (no trailing newline). Returns 1 on success, 0 on failure. */
int writeTracer(int fd, const char* tracer)
{
    if(write(fd, tracer, strlen(tracer)) != (ssize_t)strlen(tracer)) {
        printf("Failure writing %s\n", tracer);
        return 0;
    }
    return 1;
}

int main(int argc, char* argv[])
{
    /* Start from the blk tracer. */
    int fd = openCurrentTracer();
    const char* blockTracer = "blk";
    if(!writeTracer(fd, blockTracer))
        return 1;
    close(fd);

    /* Switch to function_graph. */
    fd = openCurrentTracer();
    const char* graphTracer = "function_graph";
    if(!writeTracer(fd, graphTracer))
        return 1;
    close(fd);

    /* Switch away from function_graph: the step that crashes with echo. */
    printf("Preparing to fail!\n");
    fd = openCurrentTracer();
    if(!writeTracer(fd, blockTracer))
        return 1;
    close(fd);
    return 0;
}
Oddly enough, the C program does not crash my system.
I originally encountered this problem while using Ubuntu (Unity environment) 16.04 LTS and confirmed it to be an issue on the 4.4.0 and 4.5.5 kernels. I have also tested this on a machine running Ubuntu (MATE environment) 15.10, on the 4.2.0 and 4.5.5 kernels, but was unable to reproduce the issue there. This has only confused me further.
Can anyone give me insight into what is happening? Specifically, why would I be able to write() but not echo to /sys/kernel/debug/tracing/current_tracer?
Update
As vielmetti pointed out, others have had a similar issue (as seen here).
The ftrace_disable_ftrace_graph_caller() modifies the jmp instruction at ftrace_graph_call assuming it's a 5-byte near jmp (opcode e9). However, it's a short jmp consisting of 2 bytes only (opcode eb). And ftrace_stub() is located just below ftrace_graph_caller, so the modification above breaks the instruction, resulting in a kernel oops on ftrace_stub() with an invalid opcode.
The patch (shown below) solved the echo issue, but I still do not understand why echo was breaking previously when write() was not.
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index ed48a9f465f8..e13a695c3084 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
 	jmp ftrace_stub
 #endif
-GLOBAL(ftrace_stub)
+/* This is weak to keep gas from relaxing the jumps */
+WEAK(ftrace_stub)
 	retq
 END(ftrace_caller)
via https://lkml.org/lkml/2016/5/16/493
Answer
Looks like you are not the only person to notice this behavior. I see a report of the problem, and a patch to the kernel that addresses it. Reading through that whole thread, it appears that the issue comes from some compiler optimizations.