AVX feature detection using SIGILL versus CPU probing

Question

I'm trying to determine an efficient method for detecting the availability of AVX and AVX2 on Intel and AMD processors. While reading the Intel Software Developer Manual, Volume I (MANAGING STATE USING THE XSAVE FEATURE SET, p. 310), I was somewhat surprised to learn that it is closely tied to SSE and XSAVE.

Intel posts some code for detecting AVX availability at Is AVX enabled? The code is shown below and it's not too painful. The problem is, Visual Studio is a pain point because we need to move code out of C/C++ files and into ASM files for X64.

Others seem to be taking the SIGILL approach to detecting AVX availability. Or they are unwittingly using the SIGILL method. See, for example, SIGILL on AVX instruction.

My question is, is it safe to use the SIGILL method to detect AVX availability? Here, "safe" means an AVX instruction will not generate a SIGILL when the CPU and OS support AVX, and it will generate a SIGILL otherwise.

The code below is for 32-bit machines and it's from the Intel blog Is AVX enabled? The thing that worries me is manipulating the control registers. Reading and writing some X86 and ARM control registers sometimes requires super user/administrator privileges. It's the reason I prefer a SIGILL (and avoid control registers).

; int isAvxSupported();
isAvxSupported proc

  xor eax, eax
  cpuid
  cmp eax, 1           ; does CPUID support eax = 1?
  jb not_supported

  mov eax, 1
  cpuid
  and ecx, 018000000h  ; check 27 bit (OS uses XSAVE/XRSTOR)
  cmp ecx, 018000000h  ; and 28       (AVX supported by CPU)
  jne not_supported

  xor ecx, ecx         ; XFEATURE_ENABLED_MASK/XCR0 register number = 0
  xgetbv               ; XFEATURE_ENABLED_MASK register is in edx:eax
  and eax, 110b
  cmp eax, 110b        ; check the AVX registers restore at context switch
  jne not_supported

supported:
  mov eax, 1
  ret

not_supported:
  xor eax, eax
  ret

isAvxSupported endp

Recommended answer

First, a bit of theory.

To use the AVX instruction set, a few conditions must be met:

  1. CR4.OSXSAVE[bit 18] must be 1.
    This flag is set by the OS to signal the processor that it supports the xsave extensions.
    The xsave extensions are the only way to save the AVX state (fxsave doesn't save the ymm registers) and thus the OS must support them.

  2. XCR0.SSE[bit 1] and XCR0.AVX[bit 2] must be 1.
    These flags are set by the OS to signal the processor that it supports saving and restoring the SSE and AVX states (through xsave).

  3. CPUID.1:ECX.AVX[bit 28] must be 1.
    Of course, the processor must support the AVX extensions in the first place.

All of these registers are readable from user mode except CR4.
Fortunately, the CR4.OSXSAVE bit is mirrored in CPUID.1:ECX.OSXSAVE[bit 27], so all the information is accessible from user mode. No privileged instructions are involved.

To use the AVX extensions, both hardware support (CPUID.1:ECX.AVX and CPUID.1:ECX.XSAVE) and OS support (CPUID.1:ECX.OSXSAVE, XCR0.SSE and XCR0.AVX) must be present.
Since the OS signals its support for xsave only when the hardware supports it, testing CPUID.1:ECX.OSXSAVE is enough; CPUID.1:ECX.XSAVE need not be tested separately.
For the AVX extensions, testing CPUID.1:ECX.AVX is still recommended, as the OS may set XCR0.AVX even if AVX is not supported.

This leads to Intel's official, and strongly recommended, algorithm, which is exactly the one you posted.
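If moving the check into a separate ASM file is the pain point, the same three tests can probably be written in C with compiler support instead. The following is only a sketch under assumptions: it targets x86/x86-64, uses `__cpuid` and `_xgetbv` on MSVC and `<cpuid.h>` plus inline `xgetbv` on GCC/Clang, and the names `cpuid_reports_avx`, `xcr0_enables_avx` and `is_avx_supported` are made up for illustration. Note that `xgetbv` is only executed after the OSXSAVE bit has been confirmed, since it would raise #UD otherwise.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>     /* __cpuid */
#include <immintrin.h>  /* _xgetbv */
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>      /* __get_cpuid */
#endif

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has set CR4.OSXSAVE */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_AVX       0x6u        /* XCR0 bits 1 (SSE) and 2 (AVX) */

/* Hypothetical helper: both CPUID.1:ECX bits must be set. */
static int cpuid_reports_avx(uint32_t ecx)
{
    const uint32_t need = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    return (ecx & need) == need;
}

/* Hypothetical helper: the OS must save/restore both SSE and AVX state. */
static int xcr0_enables_avx(uint64_t xcr0)
{
    return (xcr0 & XCR0_SSE_AVX) == XCR0_SSE_AVX;
}

/* Returns 1 if AVX is usable by this process, 0 otherwise. */
int is_avx_supported(void)
{
    uint64_t xcr0 = 0;
#if defined(_MSC_VER)
    int regs[4];
    __cpuid(regs, 0);
    if (regs[0] < 1)                 /* does CPUID support leaf 1? */
        return 0;
    __cpuid(regs, 1);
    if (!cpuid_reports_avx((uint32_t)regs[2]))
        return 0;
    xcr0 = _xgetbv(0);               /* safe: OSXSAVE was checked above */
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax, ebx, ecx, edx, lo, hi;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))  /* 0 if leaf 1 is missing */
        return 0;
    if (!cpuid_reports_avx(ecx))
        return 0;
    /* xgetbv with ecx = 0 reads XCR0 into edx:eax */
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    xcr0 = ((uint64_t)hi << 32) | lo;
#else
    return 0;                        /* not an x86 target known to this sketch */
#endif
    return xcr0_enables_avx(xcr0);
}
```

Calling `is_avx_supported()` once at startup and caching the result should be enough, because neither the CPUID bits nor XCR0 are expected to change while the process runs.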

Catching exceptions to detect support for the AVX extensions will also work, provided you can guarantee that the exception caught is #UD.
For example, when executing vzeroall the only possible exceptions are #UD and #NM.
The first one is raised only in these cases:

If XCR0[2:1] ≠ ‘11b’.
If CR4.OSXSAVE[bit 18]=0.
If CPUID.01H.ECX.AVX[bit 28]=0.
If VEX.vvvv ≠ 1111B.

So unless you have a broken assembler/compiler (the only way VEX.vvvv could be wrong), this is exactly equivalent to the conditions stated at the beginning.

The latter (#NM) is raised as an optimisation for saving the AVX state; the OS handles it itself and, as such, does not expose it to user-mode programs.

Therefore, catching SIGILL on vzeroall or a similar instruction would also work.
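For completeness, here is a rough sketch of what a SIGILL probe could look like on a POSIX system. It is an illustration under assumptions, not a vetted implementation: it assumes GCC or Clang on x86-64, emits vzeroall as raw bytes (C5 FC 77) so that the assembler does not need AVX enabled, and the names `avx_probe_by_sigill` and `avx_probe_on_sigill` are hypothetical. Since vzeroall zeroes every ymm register, all xmm registers are listed as clobbers.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf avx_probe_env;

/* SIGILL handler: jump back into the probe instead of returning to the
 * faulting instruction, which would only fault again. */
static void avx_probe_on_sigill(int signo)
{
    (void)signo;
    siglongjmp(avx_probe_env, 1);
}

/* Returns 1 if vzeroall executed, 0 if it raised SIGILL,
 * and -1 if the probe could not be set up on this target. */
int avx_probe_by_sigill(void)
{
#if defined(__GNUC__) && defined(__x86_64__)
    struct sigaction sa, old;
    volatile int result = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = avx_probe_on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so siglongjmp also restores the signal mask */
    if (sigsetjmp(avx_probe_env, 1) == 0) {
        __asm__ volatile(".byte 0xc5, 0xfc, 0x77"  /* vzeroall */
                         :
                         :
                         : "xmm0", "xmm1", "xmm2", "xmm3",
                           "xmm4", "xmm5", "xmm6", "xmm7",
                           "xmm8", "xmm9", "xmm10", "xmm11",
                           "xmm12", "xmm13", "xmm14", "xmm15");
        result = 1;   /* reached only if no #UD was raised */
    }

    sigaction(SIGILL, &old, NULL);  /* put the previous handler back */
    return result;
#else
    return -1;
#endif
}
```

The handler is process-wide and the jump buffer is a single global, so a probe like this is best run once, early, before other threads exist. That cost is one reason the CPUID/XGETBV check may still be the simpler choice where it is available.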
