ARM, VFP, floating-point, lazy context switching
Question
I am writing an operating system for an ARM processor (Cortex-A9).
I am trying to implement lazy context switching of the floating-point registers. The idea is that the floating-point extension is initially disabled for a thread, so there is no need to save floating-point context on a task switch.
When a thread attempts to execute a floating-point instruction, it triggers an exception. The operating system then enables the floating-point extension and records that floating-point context must be saved for this thread on subsequent context switches. The floating-point instruction is then re-executed.
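The scheme described above can be sketched as a small host-side simulation. This is only an illustration under stated assumptions: the names `struct thread`, `fpu_trap`, `context_switch`, and `exec_fp_instruction` are hypothetical, the FPU is modelled by plain variables, and the register-file size is arbitrary. On real hardware the enable bit and the register save/restore would go through the VFP system registers and VFP load/store instructions instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_FP_WORDS 32            /* illustrative size, not the real VFP layout */

struct thread {
    bool     fpu_used;             /* set the first time the thread traps on FP */
    unsigned fp_ctx[NUM_FP_WORDS]; /* saved FP registers for this thread */
};

/* Simulated hardware state (assumption: stands in for the real FPU). */
static bool           fpu_enabled;
static unsigned       fpu_regs[NUM_FP_WORDS];
static struct thread *current;
static int            fp_saves;    /* counts how often FP context was saved */

/* Exception path: the thread executed an FP instruction while the FPU was off. */
void fpu_trap(void)
{
    current->fpu_used = true;      /* from now on, save FP context for this thread */
    fpu_enabled = true;
    memcpy(fpu_regs, current->fp_ctx, sizeof fpu_regs);
    /* Returning from the exception re-executes the faulting instruction. */
}

/* Task switch: FP context is touched only for threads that have used the FPU. */
void context_switch(struct thread *next)
{
    if (current != NULL && current->fpu_used) {
        memcpy(current->fp_ctx, fpu_regs, sizeof fpu_regs);
        fp_saves++;
    }
    current = next;
    fpu_enabled = next->fpu_used;  /* stays disabled until the first FP trap */
    if (next->fpu_used)
        memcpy(fpu_regs, next->fp_ctx, sizeof fpu_regs);
}

/* Stand-in for a thread executing one FP instruction that writes register 0. */
void exec_fp_instruction(unsigned value)
{
    if (!fpu_enabled)
        fpu_trap();                /* simulated exception, then re-execution */
    fpu_regs[0] = value;
}
```

In this model a thread that never reaches `exec_fp_instruction` costs no FP save or restore at all, which is exactly the benefit that is lost once the compiler places `vpush`/`vmov` in integer-only functions.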
My problem is that the compiler generates floating-point instructions even when no floating-point operations are used in the C code. Here is the disassembly of a function that uses no floating point in C:
10002f5c <rmtcpy_from>:
10002f5c: e1a0c00d mov ip, sp
10002f60: e92ddff0 push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
10002f64: e24cb004 sub fp, ip, #4
10002f68: ed2d8b02 vpush {d8}
...
10002f80: ee082a10 vmov s16, r2
...
10002fe0: ee180a10 vmov r0, s16
...
1000308c: ecbc8b02 vldmia ip!, {d8}
...
When I have many such functions, lazy context switching makes no sense.
Does anybody know how to tell the compiler to generate floating-point instructions only when there is a floating-point operation in the C code?
I use gcc 9.2.0. The floating-point options are: -mhard-float -mfloat-abi=hard -mfpu=vfp
Here is an example C function (not usable, only a demo):
void func(char *a1, char *a2, char *a3);
int bar_1[1], foo_1, foo_2;
void fpu_test() {
    int oldest_idx = -1;
    while (1) {
        int *oldest = (int *)0;
        int idx = oldest_idx;
        for (int i = 0; i < 3; i++) {
            if (++idx >= 3)
                idx = 0;
            int *lec = &bar_1[idx];
            if (*lec) {
                if (*lec - *oldest < 0) {
                    oldest = lec;
                    oldest_idx = idx;
                }
            }
        }
        if (oldest) {
            foo_1++;
            if (foo_2)
                func("1", "2", "3");
        }
    }
}
gcc command line:
$HOME/devel/opt/cross-musl/bin/arm-linux-musleabihf-gcc -O2 -march=armv7-a -mtune=cortex-a9 -mhard-float -mfloat-abi=hard -mfpu=vfp -Wa,-ahlms=fpu_test.lst -mapcs-frame -c fpu_test.c -o fpu_test.o
Assembler listing:
...
35 0000 0DC0A0E1 mov ip, sp
36 0004 003000E3 movw r3, #:lower16:foo_2
37 0008 F0DF2DE9 push {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
38 000c 006000E3 movw r6, #:lower16:foo_1
39 0010 003040E3 movt r3, #:upper16:foo_2
40 0014 04B04CE2 sub fp, ip, #4
41 0018 006040E3 movt r6, #:upper16:foo_1
42 001c 004000E3 movw r4, #:lower16:bar_1
43 0020 028B2DED vpush.64 {d8} <=== this is the problem
...
Recommended answer
GCC has a command-line switch for this, -mgeneral-regs-only. When using the switch, you may need to move code that deliberately uses floating-point registers or operations into separate source files so that those files can be compiled without it.
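As a rough illustration of that split, the sketch below shows two translation units in one listing. The file names `copy_int.c` and `scale_fp.c` and both functions are hypothetical. The command lines in the comments reuse the options from the question and assume a GCC version that supports -mgeneral-regs-only for ARM.

```c
/* ---- copy_int.c (hypothetical): integer-only code ----
 * Assumed build line, using the options from the question plus the switch:
 *   arm-linux-musleabihf-gcc -O2 -march=armv7-a -mfloat-abi=hard -mfpu=vfp \
 *       -mgeneral-regs-only -c copy_int.c
 */
#include <stddef.h>

/* Copies n bytes; with the switch, the compiler may not use VFP registers here. */
void copy_bytes(char *dst, const char *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
}

/* ---- scale_fp.c (hypothetical): code that deliberately uses floating point ----
 * Assumed build line, same options but WITHOUT -mgeneral-regs-only:
 *   arm-linux-musleabihf-gcc -O2 -march=armv7-a -mfloat-abi=hard -mfpu=vfp \
 *       -c scale_fp.c
 */
float scale(float x, float k)
{
    return x * k;
}
```

Only threads that call into files like `scale_fp.c` would then take the first-use exception and start carrying floating-point context.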
As of GCC 9.3 (possibly 9?), for ARM targets, this is also available as a function attribute:
void MyFunction(char *MyParameter) __attribute__ ((general-regs-only));
Putting the attribute after the declaration is an older syntax and requires a non-definition declaration. Testing suggests GCC now accepts an attribute before the declarator, and that form may be used with a definition:
void __attribute__ ((general-regs-only)) MyFunction(char *MyParameter)
{...}
You may also be able to negate the attribute with __attribute__ ((nogeneral-regs-only)), although I do not see this documented.
This can also be done via a pragma.
There are also +nofp options within the -march and -mcpu switches, but I think -mgeneral-regs-only is what you want.