Calling kernel_fpu_begin twice before kernel_fpu_end

Problem description

I'm using the kernel_fpu_begin and kernel_fpu_end functions in asm/i387.h to protect the FPU register states for some simple floating point arithmetic inside of a Linux kernel module.

I'm curious about the behavior of calling the kernel_fpu_begin function twice before the kernel_fpu_end function, and vice versa. For example:

#include <asm/i387.h>

double foo(unsigned num){
    kernel_fpu_begin();

    double x = 3.14;
    x += num;

    kernel_fpu_end();

    return x;
}

...

kernel_fpu_begin();

double y = 1.23;
unsigned z = 42;
y -= foo(z);

kernel_fpu_end();

In the foo function, I call kernel_fpu_begin and kernel_fpu_end; but kernel_fpu_begin was already called before the call to foo. Would this result in undefined behavior?

Furthermore, should I even be calling kernel_fpu_end inside the foo function? I return a double after the kernel_fpu_end call, which means accessing floating point registers is unsafe, right?

My initial guess is just not to use the kernel_fpu_begin and kernel_fpu_end calls inside the foo function; but what if foo returned the double cast to unsigned instead -- the programmer wouldn't know to use kernel_fpu_begin and kernel_fpu_end outside of foo?

Solution

Short answer: no, it is incorrect to nest kernel_fpu_begin() calls, and it will lead to the userspace FPU state getting corrupted.

Medium answer: This won't work because kernel_fpu_begin() uses the current thread's struct task_struct to save off the FPU state (task_struct has an architecture-dependent member thread, and on x86, thread.fpu holds the thread's FPU state), and doing a second kernel_fpu_begin() will overwrite the original saved state. Then doing kernel_fpu_end() will end up restoring the wrong FPU state.
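To make the overwrite concrete, here is a toy userspace model of the single save slot described above. It is not kernel code: every name in it (toy_task, toy_fpu_begin, toy_fpu_end, toy_nested_demo) is hypothetical, an int stands in for the whole FPU register file, and the real kernel_fpu_begin() has additional conditions that this sketch ignores.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_task {
    int saved_fpu;               /* the single save slot, standing in for thread.fpu */
};

static int fpu_regs;             /* stands in for the live FPU registers */
static struct toy_task current_task;

/* Model of kernel_fpu_begin(): save the live registers into the one slot. */
static void toy_fpu_begin(void)
{
    current_task.saved_fpu = fpu_regs;
}

/* Model of kernel_fpu_end(): restore whatever the slot currently holds. */
static void toy_fpu_end(void)
{
    fpu_regs = current_task.saved_fpu;
}

/* Returns the register value userspace would see after a nested begin/end. */
static int toy_nested_demo(int user_value)
{
    fpu_regs = user_value;       /* userspace state before entering the kernel */

    toy_fpu_begin();             /* outer begin: slot = user_value */
    assert(current_task.saved_fpu == user_value);
    fpu_regs = 111;              /* outer kernel code uses the FPU */

    toy_fpu_begin();             /* nested begin: slot = 111, user_value is lost */
    fpu_regs = 222;              /* inner kernel code uses the FPU */
    toy_fpu_end();               /* restores 111 */

    toy_fpu_end();               /* restores 111 again, not user_value */
    return fpu_regs;
}
```

In this model, toy_nested_demo(42) hands back 111, the outer kernel code's value, rather than the 42 that userspace started with.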

Long answer: As you saw looking at the actual implementation in <asm/i387.h>, the details are a bit tricky. In older kernels (like the 3.2 source you looked at), the FPU handling is always "lazy" -- the kernel wants to avoid the overhead of reloading the FPU until it really needs it, because the thread might run and be scheduled out again without ever actually using the FPU or needing its FPU state. So kernel_fpu_end() just sets the TS flag, which causes the next access of the FPU to trap and cause the FPU state to be reloaded. The hope is that the FPU is used rarely enough that this is cheaper overall.
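The lazy scheme can be pictured with another toy userspace model. Again this is only a sketch under assumptions: a bool stands in for the TS flag, a function call stands in for the hardware trap, and all of the lazy_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of lazy FPU restore, NOT kernel code. Names are hypothetical. */
static int  lazy_regs;        /* stands in for the live FPU registers */
static int  lazy_saved;       /* stands in for the thread's saved FPU state */
static bool lazy_ts;          /* stands in for the TS flag */
static int  lazy_trap_count;  /* how many times the modelled trap fired */

/* Model of kernel_fpu_begin(): save the state and allow FPU use. */
static void lazy_fpu_begin(void)
{
    lazy_saved = lazy_regs;
    lazy_ts = false;
}

/* Model of lazy kernel_fpu_end(): no restore, just arm the trap. */
static void lazy_fpu_end(void)
{
    lazy_ts = true;
}

/* Model of any later FPU access: with TS set it traps and reloads first. */
static int lazy_fpu_read(void)
{
    if (lazy_ts) {
        lazy_trap_count++;
        lazy_regs = lazy_saved;  /* the deferred reload happens here */
        lazy_ts = false;
    }
    return lazy_regs;
}

static int lazy_demo(int user_value)
{
    lazy_regs = user_value;
    lazy_ts = false;
    lazy_trap_count = 0;

    lazy_fpu_begin();
    lazy_regs = 999;             /* kernel code clobbers the registers */
    lazy_fpu_end();
    assert(lazy_regs == 999);    /* nothing has been restored yet */

    return lazy_fpu_read();      /* first access traps and reloads */
}
```

If the thread never touches the FPU again before it is scheduled out, lazy_fpu_read() is never reached and the reload cost is never paid, which is the saving the lazy design is betting on.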

However, if you look at newer kernels (3.7 or newer, I believe), you'll see that there is actually a second code path for all of this -- "eager" FPU. This is because newer CPUs have the "optimized" XSAVEOPT instruction, and newer userspace uses the FPU more often (for SSE in memcpy, etc). The cost of XSAVEOPT / XRSTOR is less and the chance of the lazy optimization actually avoiding an FPU reload is less too, so with a new kernel on a new CPU, kernel_fpu_end() just goes ahead and restores the FPU state.

However, in both the "lazy" and "eager" FPU modes, there is still only one slot in the task_struct to save the FPU state, so nesting kernel_fpu_begin() will end up corrupting userspace's FPU state.
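One way to avoid the nesting, offered here only as a possible arrangement and not as something the answer above prescribes, is to keep the begin/end pair in the outermost caller and document that the helper must be called inside it. The sketch below compiles in userspace: kernel_fpu_begin() and kernel_fpu_end() are replaced by counting stand-ins so the nesting depth can be checked, and foo_in_fpu_section and compute_scaled are hypothetical names.

```c
#include <assert.h>

/* Userspace stand-ins so the sketch compiles outside the kernel.
 * The real functions come from the kernel headers; these only count depth. */
static int fpu_depth;       /* current begin/end nesting depth */
static int fpu_max_depth;   /* deepest nesting seen so far */

static void kernel_fpu_begin(void)
{
    if (++fpu_depth > fpu_max_depth)
        fpu_max_depth = fpu_depth;
}

static void kernel_fpu_end(void)
{
    --fpu_depth;
}

/* Hypothetical helper: does floating point math and relies on the caller
 * already being inside a kernel_fpu_begin()/kernel_fpu_end() section. */
static double foo_in_fpu_section(unsigned num)
{
    double x = 3.14;
    x += num;
    return x;
}

/* Hypothetical entry point: brackets the FPU section exactly once and
 * returns an integer, so no floating point value escapes the section. */
static int compute_scaled(unsigned z)
{
    int result;

    kernel_fpu_begin();
    {
        double y = 1.23;
        y -= foo_in_fpu_section(z);
        result = (int)(y * 100.0);   /* convert while the FPU is still ours */
    }
    kernel_fpu_end();

    return result;
}
```

With this split the begin/end depth never exceeds one, and the question of returning a double after kernel_fpu_end() does not arise because the conversion to an integer happens inside the section.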
