Why a segfault instead of privilege instruction error?

Question

I am trying to execute the privileged instruction rdmsr in user mode, and I expect to get some kind of privilege error, but I get a segfault instead. I have checked the asm and I am loading 0x186 into ecx, which is supposed to be PERFEVTSEL0, based on the manual, page 1171.

What is the cause of the segfault, and how can I modify the code below to fix it?

I want to resolve this before hacking a kernel module, because I don't want this segfault to blow up my kernel.

Update: I am running on an Intel(R) Xeon(R) CPU X3470.

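#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

#include <sched.h>
#include <assert.h>

/* Read the MSR selected by ecx; rdmsr returns the value in edx:eax. */
uint64_t
read_msr(uint32_t ecx)
{
    uint32_t a, d;
    __asm__ __volatile__("rdmsr" : "=a"(a), "=d"(d) : "c"(ecx));
    return ((uint64_t)a) | (((uint64_t)d) << 32);
}

int main(int ac, char **av)
{
    cpu_set_t cpuset;
    uint32_t c = 0x186; /* PERFEVTSEL0 */
    int i = 0;
    int rc;

    /* Pin the process to CPU 0. The call is kept outside assert()
       so that it still runs when NDEBUG is defined. */
    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    rc = sched_setaffinity(0, sizeof(cpuset), &cpuset);
    assert(rc == 0);

    /* rdmsr is privileged, so this is the line that faults in user mode. */
    printf("%" PRIu64 "\n", read_msr(c));
    return 0;
}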

Recommended answer

The question I will try to answer: why does the above code cause SIGSEGV instead of SIGILL, even though the code contains no memory error but rather an illegal instruction (a privileged instruction executed from non-privileged user space)?

I would expect to get a SIGILL with si_code ILL_PRVOPC instead of a segfault, too. Your question is currently 3 years old and today, I stumbled upon the same behavior. I am disappointed too :-(

What is the cause of the segfault

The cause seems to be that the Linux kernel code decides to send SIGSEGV. Here is the responsible function: http://elixir.free-electrons.com/linux/v4.9/source/arch/x86/kernel/traps.c#L487 Have a look at the last line of the function.

In your follow up question, you got a list of other assembly instructions which get propagated as SIGSEGV to userspace though they are actually general protection faults. I found your question because I triggered the behavior with cli.
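The behavior described above can be observed without crashing the calling process. The following is a minimal sketch, not a definitive test: `probe_rdmsr_signal` is a hypothetical helper name, and the code assumes Linux with GCC or Clang inline assembly. It executes rdmsr in a forked child and reports which signal terminated that child.

```c
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: executes rdmsr in a forked child.
 * Returns the number of the signal that killed the child,
 * 0 if the child exited normally, and -1 on error or on non-x86 builds. */
int probe_rdmsr_signal(uint32_t msr)
{
#if defined(__x86_64__) || defined(__i386__)
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the privileged instruction is expected to fault here. */
        uint32_t lo, hi;
        __asm__ __volatile__("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
        (void)lo;
        (void)hi;
        _exit(0); /* Only reached if rdmsr did not fault. */
    }

    /* Parent: collect the child's termination status. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
#else
    (void)msr;
    return -1; /* rdmsr only exists on x86. */
#endif
}
```

Calling `probe_rdmsr_signal(0x186)` as an unprivileged user on a system like the one in the question should, if the analysis above holds, return SIGSEGV rather than SIGILL.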

and how can I modify the code below to fix it?

As of Linux kernel 4.9, I'm not aware of any reliable way to distinguish between a memory error (what I would expect to be a SIGSEGV) and a privileged instruction error from userspace.

There may be a very hacky and unportable way to distinguish these cases. When a privileged instruction causes a SIGSEGV, the siginfo_t si_code is set to a value which is not directly listed in the SIGSEGV section of man 2 sigaction. The documented values are SEGV_MAPERR, SEGV_ACCERR, SEGV_PKUERR, but I get SI_KERNEL (0x80) on my system. According to the man page, SI_KERNEL is a code "which can be placed in si_code for any signal". In strace, you see SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0}. The responsible kernel code is here.

It would also be possible to grep dmesg for this string.

Please, never ever use those two methods to distinguish between GPF and memory error on a production system.

Specific solution for your code: Just don't run rdmsr from user space. But this answer is really unsatisfying if you are looking for a generic way to figure out why a program received a SIGSEGV.
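One way to avoid executing rdmsr in user space, not mentioned in the original answer and offered here only as a hedged sketch, is the Linux msr driver. It exposes each CPU's MSRs as /dev/cpu/N/msr, where an 8-byte read at a file offset equal to the register number returns that register. The sketch assumes the msr kernel module is loaded and the caller has sufficient privileges (typically root). `read_msr_file` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: reads 8 bytes at offset `reg` from `path`.
 * With path "/dev/cpu/0/msr" this asks the msr driver for MSR `reg`
 * on CPU 0. Returns 0 on success, -1 on any failure. */
int read_msr_file(const char *path, uint32_t reg, uint64_t *value)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1; /* Device missing or permission denied. */

    /* The file offset selects the register; a full read is 8 bytes. */
    ssize_t n = pread(fd, value, sizeof *value, (off_t)reg);
    close(fd);
    return n == (ssize_t)sizeof *value ? 0 : -1;
}
```

A call such as `read_msr_file("/dev/cpu/0/msr", 0x186, &v)` fails with -1 instead of raising a signal when the access is not permitted, which makes the error path easier to handle than a fault.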
