Why am I able to perform floating point operations inside a Linux kernel module?

Question

I'm running on an x86 CentOS 6.3 (kernel v2.6.32) system.

I compiled the following function into a bare-bones character driver module as an experiment to see how the Linux kernel reacts to floating point operations.

static unsigned floatstuff(void){
    float x = 3.14;
    x *= 2.5;
    return x;
}

...

printk(KERN_INFO "x: %u", x);

The code compiled (which I wasn't expecting), so I inserted the module and checked the log with dmesg. The log showed: x: 7.
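The logged value itself is ordinary C arithmetic: converting a float to an unsigned integer discards the fractional part. A minimal user-space sketch of the same computation (not kernel code; the name `floatstuff_user` is made up for illustration):

```c
/* User-space copy of the arithmetic from the question (hypothetical name). */
unsigned floatstuff_user(void)
{
    float x = 3.14f;     /* same constant as in the question */
    x *= 2.5f;           /* roughly 7.85 */
    return (unsigned)x;  /* float -> unsigned conversion drops the fraction */
}
```

Calling `floatstuff_user()` gives 7, matching the dmesg output, so the number is unsurprising; the open question is only whether running FP instructions in the kernel was safe.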

This seems strange; I thought you couldn't perform floating point operations in the Linux kernel -- save some exceptions such as kernel_fpu_begin(). How did the module perform the floating point operation?

Is this because I'm on an x86 processor?

Recommended answer

I thought you couldn't perform floating point operations in the Linux kernel

You can't do so safely: failing to use kernel_fpu_begin() / kernel_fpu_end() doesn't mean FPU instructions will fault (at least not on x86).

Instead it will silently corrupt user-space's FPU state. This is bad; don't do that.
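For contrast, here is a minimal sketch of what bracketing the FP work might look like. It assumes the x86 kernel_fpu_begin() / kernel_fpu_end() pair named above; the header location in the comment is from memory, and the `#else` branch holds hypothetical stand-ins (including `fpu_depth`) that exist only so the sketch can be compiled and exercised in user space.

```c
#ifdef __KERNEL__
/* Assumption: recent x86 kernels declare these in <asm/fpu/api.h>;
 * older trees such as 2.6.32 had them in <asm/i387.h>. */
#include <asm/fpu/api.h>
#else
/* Hypothetical user-space stand-ins, not the real kernel functions. */
int fpu_depth;
void kernel_fpu_begin(void) { fpu_depth++; }
void kernel_fpu_end(void)   { fpu_depth--; }
#endif

/* Same computation as in the question, with the FP work bracketed. */
unsigned floatstuff_guarded(void)
{
    unsigned result;

    kernel_fpu_begin();      /* user-space FPU state is preserved from here */
    {
        float x = 3.14f;
        x *= 2.5f;
        result = (unsigned)x; /* convert while FPU use is still allowed */
    }
    kernel_fpu_end();        /* kernel is done with the FPU */

    return result;
}
```

The result is converted to an integer before kernel_fpu_end() so that no floating-point value is live outside the bracketed region.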

The compiler doesn't know what kernel_fpu_begin() means, so it can't check / warn about code that compiles to FPU instructions outside of FPU-begin regions.

There may be a debug mode where the kernel does disable SSE, x87, and MMX instructions outside of kernel_fpu_begin / end regions, but that would be slower and isn't done by default.

It is possible, though: setting CR0::TS = 1 makes x87 instructions fault, so lazy FPU context switching is possible, and there are other bits for SSE and AVX.
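As a rough illustration of those control bits, the following is a simplified user-space model (not kernel code) of how CR0.EM, CR0.TS and CR4.OSFXSR decide whether an x87 or SSE instruction runs or traps. The bit positions follow the x86 architecture as I understand it; the enum and function names are invented for this sketch, and AVX (which additionally depends on CR4.OSXSAVE and XCR0) is left out.

```c
#include <stdint.h>

#define CR0_EM      (1u << 2)  /* emulate the FPU: no native FP */
#define CR0_TS      (1u << 3)  /* task switched: trap on first FP use */
#define CR4_OSFXSR  (1u << 9)  /* OS supports FXSAVE/FXRSTOR, enables SSE */

/* Hypothetical outcome type for this model. */
enum fp_outcome { FP_RUNS, FP_FAULT_NM, FP_FAULT_UD };

/* x87 instructions raise #NM (device not available) if EM or TS is set. */
enum fp_outcome x87_outcome(uint32_t cr0)
{
    if (cr0 & (CR0_EM | CR0_TS))
        return FP_FAULT_NM;
    return FP_RUNS;
}

/* SSE instructions raise #UD if SSE is not enabled, else #NM if TS is set. */
enum fp_outcome sse_outcome(uint32_t cr0, uint32_t cr4)
{
    if ((cr0 & CR0_EM) || !(cr4 & CR4_OSFXSR))
        return FP_FAULT_UD;
    if (cr0 & CR0_TS)
        return FP_FAULT_NM;
    return FP_RUNS;
}
```

A lazy scheme would set TS on a context switch and let the resulting #NM handler load the right FPU state; a debug mode like the one speculated about above could use the same trap to catch stray kernel FP use.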

There are many ways for buggy kernel code to cause serious problems. This is just one of many. In C, you pretty much always know when you're using floating point (unless a typo results in a 1. constant or something in a context that actually compiles).

Why is FP architectural state different from integer state?

Linux has to save/restore the integer state any time it enters/exits the kernel. All code needs to use integer registers (except for a giant straight-line block of FPU computation that ends with a jmp instead of a ret (ret modifies rsp).)

But kernel code avoids FPU generally, so Linux leaves the FPU state unsaved on entry from a system call, only saving before an actual context switch to a different user-space process or on kernel_fpu_begin. Otherwise, it's common to return to the same user-space process on the same core, so FPU state doesn't need to be restored because the kernel didn't touch it. (And this is where corruption would happen if a kernel task actually did modify the FPU state. I think this goes both ways: user-space could also corrupt your FPU state).
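The corruption scenario can be sketched with a toy user-space model. Everything here (`struct toy_cpu`, the `toy_*` functions) is hypothetical, and a single double stands in for the whole FPU register file; the real kernel_fpu_begin() / kernel_fpu_end() work differently in detail, so treat this only as an illustration of the save/restore idea.

```c
/* Toy model: one FP register shared by user space and the kernel. */
struct toy_cpu {
    double fp_reg;  /* stands in for the live FPU registers */
    double saved;   /* stands in for the kernel's save area */
};

void toy_fpu_begin(struct toy_cpu *c) { c->saved = c->fp_reg; }
void toy_fpu_end(struct toy_cpu *c)   { c->fp_reg = c->saved; }

/* Kernel code that uses the FPU and clobbers the register. */
void toy_kernel_math(struct toy_cpu *c) { c->fp_reg = 3.14 * 2.5; }

/* System call whose kernel code skips the begin/end pair: the kernel
 * returns to the same process with the register overwritten. */
double toy_syscall_unguarded(struct toy_cpu *c, double user_value)
{
    c->fp_reg = user_value;   /* user space left this in the register */
    toy_kernel_math(c);
    return c->fp_reg;         /* what user space sees afterwards */
}

/* Same system call with the FP work bracketed. */
double toy_syscall_guarded(struct toy_cpu *c, double user_value)
{
    c->fp_reg = user_value;
    toy_fpu_begin(c);
    toy_kernel_math(c);
    toy_fpu_end(c);
    return c->fp_reg;
}
```

In this model the unguarded path hands user space the kernel's 7.85 in place of its own value, while the guarded path returns the value user space left there.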

The integer state is fairly small, only 16x 64-bit registers + RFLAGS and segment regs. FPU state is more than twice as large even without AVX: 8x 80-bit x87 registers, and 16x XMM or YMM, or 32x ZMM registers (+ MXCSR, and x87 status + control words). Also the MPX bnd0-3 registers are lumped in with "FPU". At this point "FPU state" just means all non-integer registers. On my Skylake, dmesg says x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
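To make those numbers concrete, here is a small sketch that works out the raw register-file sizes and decodes the 0x1f feature mask from that dmesg line. The bit assignments follow the XCR0 layout as I understand it (x87, SSE, AVX, two MPX components, then the AVX-512 components); the enumerator and function names are illustrative and are not the kernel's own identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw register-file sizes in bytes (registers only, no control words). */
enum {
    INT_REG_BYTES = 16 * 8,   /* 16 x 64-bit integer registers */
    X87_REG_BYTES = 8 * 10,   /* 8 x 80-bit x87 registers */
    XMM_BYTES     = 16 * 16,  /* 16 x 128-bit */
    YMM_BYTES     = 16 * 32,  /* 16 x 256-bit */
    ZMM_BYTES     = 32 * 64   /* 32 x 512-bit */
};

/* Bit positions in the xstate feature mask (illustrative names). */
enum xfeature_bit {
    XF_X87 = 0,
    XF_SSE = 1,
    XF_AVX = 2,
    XF_MPX_BNDREGS = 3,
    XF_MPX_BNDCSR = 4,
    XF_AVX512_OPMASK = 5,
    XF_AVX512_ZMM_HI256 = 6,
    XF_AVX512_HI16_ZMM = 7
};

/* True if the given feature bit is set in the mask. */
bool xstate_has(uint64_t mask, enum xfeature_bit bit)
{
    return ((mask >> bit) & 1u) != 0;
}
```

With this layout, 0x1f means x87, SSE, AVX and both MPX components are enabled and AVX-512 is not. Even x87 plus XMM alone (80 + 256 bytes) is more than twice the 128 bytes of integer registers.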

See Understanding FPU usage in linux kernel; by default, modern Linux doesn't do lazy FPU switching on context switches (only for kernel/user transitions). (But that article explains what "lazy" means.)

Most processes use SSE for copying/zeroing small blocks of memory in compiler-generated code, and most library string/memcpy/memset implementations use SSE/SSE2. Also, hardware-supported optimized save/restore is a thing now (xsaveopt / xrstor), so "eager" FPU save/restore may actually do less work if some/all FP registers haven't actually been used. e.g. save just the low 128b of YMM registers if their upper halves were zeroed with vzeroupper so the CPU knows they're clean. (And mark that fact with just one bit in the save format.)

With "eager" context switching, FPU instructions stay enabled all the time, so bad kernel code can corrupt them at any time.
