ftrace: system crash when changing current_tracer from function_graph via echo

Views: 737 | Posted: 2020/5/2 3:42:40 | Tags: linux shell linux-kernel echo ftrace


Problem description

I have been using ftrace recently to monitor some behavior characteristics of my system, switching the trace on and off via a small script. After running the script, my system would crash and reboot itself. Initially I believed there might be an error in the script itself, but I have since determined that the crash and reboot result from echoing some tracer to /sys/kernel/debug/tracing/current_tracer while current_tracer is set to function_graph.

That is, the following sequence of commands produces the crash/reboot:

echo "function_graph" > /sys/kernel/debug/tracing/current_tracer
echo "function" > /sys/kernel/debug/tracing/current_tracer

During the reboot after the crash caused by the above echo statements, I see a lot of output that reads:

Clearing orphaned inode <inode>

I tried to reproduce this problem by changing the current_tracer value from function_graph to something else in a C program:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

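/* Open current_tracer for writing; exit if it cannot be opened. */
int openCurrentTracer(void)
{
        int fd = open("/sys/kernel/debug/tracing/current_tracer", O_WRONLY);
        if (fd < 0) {
                perror("open current_tracer");
                exit(1);
        }

        return fd;
}

/* Write the tracer name to fd; return 1 on success, 0 on failure. */
int writeTracer(int fd, const char *tracer)
{
        size_t len = strlen(tracer);

        /* write() returns ssize_t, so compare against a signed length. */
        if (write(fd, tracer, len) != (ssize_t)len) {
                printf("Failure writing %s\n", tracer);
                return 0;
        }

        return 1;
}

int main(int argc, char *argv[])
{
        const char *blockTracer = "blk";
        const char *graphTracer = "function_graph";
        int fd;

        /* Start from a known tracer. */
        fd = openCurrentTracer();
        if (!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        /* Switch to function_graph. */
        fd = openCurrentTracer();
        if (!writeTracer(fd, graphTracer))
                return 1;
        close(fd);

        printf("Preparing to fail!\n");

        /* Switch away from function_graph: the step that crashes via echo. */
        fd = openCurrentTracer();
        if (!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        return 0;
}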

Oddly enough, the C program does not crash my system.

I originally encountered this problem on Ubuntu 16.04 LTS (Unity environment) and confirmed it on the 4.4.0 and 4.5.5 kernels. I also tested on a machine running Ubuntu 15.10 (Mate environment) with the 4.2.0 and 4.5.5 kernels, but was unable to reproduce the issue there. This has only confused me further.

Can anyone give me insight into what is happening? Specifically, why would I be able to write() but not echo to /sys/kernel/debug/tracing/current_tracer?
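One way to narrow the question down is to make the C reproduction match the shell sequence more closely. The program above differs from the echo commands in at least three observable ways: it switches to blk rather than function, it writes no trailing newline, and it opens the file with O_WRONLY only, whereas a shell `>` redirection also passes O_CREAT and O_TRUNC. Nothing here establishes that any of these differences is related to the crash. The sketch below is only a hypothetical helper (the name write_like_echo is not from the original program) that imitates what `echo value > path` does at the system-call level, assuming the shell emits the string and newline in a single write.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `value` followed by a newline to `path`,
 * opening the file the way a shell `>` redirection does.
 * Returns 0 on success, -1 on failure.
 */
static int write_like_echo(const char *path, const char *value)
{
        char buf[64];
        size_t len = strlen(value);
        ssize_t written;
        int fd;
        int rc;

        /* Leave room for the trailing newline that echo appends. */
        if (len + 1 > sizeof(buf))
                return -1;
        memcpy(buf, value, len);
        buf[len] = '\n';

        /* Shell redirection opens with O_WRONLY | O_CREAT | O_TRUNC. */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
                return -1;

        /* Assumption: the name and newline go out in one write() call. */
        written = write(fd, buf, len + 1);
        rc = (written == (ssize_t)(len + 1)) ? 0 : -1;

        if (close(fd) < 0)
                rc = -1;
        return rc;
}
```

Calling write_like_echo("/sys/kernel/debug/tracing/current_tracer", "function_graph") and then the same with "function" (as root) would mirror the two echo commands from the question.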

Update

As vielmetti pointed out, others have had a similar issue (as seen here).

ftrace_disable_ftrace_graph_caller() modifies the jmp instruction at ftrace_graph_call, assuming it is a 5-byte near jmp (e9). However, it is a short jmp consisting of only 2 bytes (eb). ftrace_stub() is located just below ftrace_graph_caller, so the modification overwrites the instruction there, resulting in a kernel oops at ftrace_stub() with an invalid opcode.
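The effect can be illustrated in user space with a plain byte buffer. This is a minimal sketch, not kernel code: the helper names, the buffer layout, and the displacement value are assumptions made for illustration. Only the opcodes are real x86 encodings (e9 for a near jmp with a 32-bit displacement, eb for a short jmp with an 8-bit displacement, c3 for ret). The buffer holds a jump followed directly by a ret, and a patcher that always writes 5 bytes is applied to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SHORT_JMP_SIZE = 2, NEAR_JMP_SIZE = 5 };

/* Hypothetical patcher: always writes a 5-byte near jmp (e9 rel32). */
static void patch_near_jmp(uint8_t *site, int32_t rel)
{
        site[0] = 0xe9;
        memcpy(site + 1, &rel, sizeof(rel));
}

/* Number of bytes past the original jump that a 5-byte patch overwrites. */
static size_t bytes_clobbered(size_t original_jmp_size)
{
        if (original_jmp_size >= NEAR_JMP_SIZE)
                return 0;
        return NEAR_JMP_SIZE - original_jmp_size;
}

/*
 * Lay out "jmp; ret" in a buffer, apply the 5-byte patch at the jump,
 * and report whether the ret opcode that follows is still intact.
 * Returns 1 if it survives, 0 if it was overwritten.
 */
static int stub_survives_patch(int short_jump)
{
        uint8_t text[16];
        size_t stub_off;

        memset(text, 0x90, sizeof(text));       /* nop padding */
        if (short_jump) {
                text[0] = 0xeb;                 /* jmp rel8 */
                text[1] = 0x00;                 /* to the next instruction */
                stub_off = SHORT_JMP_SIZE;
        } else {
                text[0] = 0xe9;                 /* jmp rel32 */
                memset(text + 1, 0, 4);         /* to the next instruction */
                stub_off = NEAR_JMP_SIZE;
        }
        assert(stub_off < sizeof(text));
        text[stub_off] = 0xc3;                  /* ret: stand-in for ftrace_stub */

        patch_near_jmp(text, 0x11223344);       /* arbitrary displacement */
        return text[stub_off] == 0xc3;
}
```

With a near jmp in place, the 5-byte patch fits exactly and the ret survives. With a short jmp, the patch spills 3 bytes into whatever follows, which in this layout is the ret standing in for ftrace_stub.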

The patch (shown below) solved the echo issue, but I still do not understand why echo was breaking previously when write() was not.

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index ed48a9f465f8..e13a695c3084 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
 	jmp ftrace_stub
 #endif
 
-GLOBAL(ftrace_stub)
+/* This is weak to keep gas from relaxing the jumps */
+WEAK(ftrace_stub)
 	retq
 END(ftrace_caller)

via https://lkml.org/lkml/2016/5/16/493
