动态确定恶意AVX-512指令在哪里执行 [英] Dynamically determining where a rogue AVX-512 instruction is executing
问题描述
我有一个运行在支持AVX-512的Intel机器上的进程,但是该进程不直接使用任何AVX-512指令(asm或内在函数),而是使用-mno-avx512f
进行编译,因此编译器不插入所有AVX-512指令.
I have a process running on an Intel machine that supports AVX-512, but this process doesn't directly use any AVX-512 instructions (asm or intrinsics) and is compiled with -mno-avx512f
so that the compiler doesn't insert any AVX-512 instructions.
但是,它在降低的AVX Turbo频率下无限期运行.毫无疑问,有一个AVX-512指令通过库,(不太可能)系统调用或类似内容潜入某个地方.
Yet, it is running indefinitely at the reduced AVX turbo frequency. No doubt there is an AVX-512 instruction sneaking in somewhere, via a library, (very unlikely) system call or something like that.
除了尝试在AVX-512指令来自何处进行二进制搜索"之外,我还可以通过某种方式立即找到它,例如,捕获此类指令吗?
Rather than try to "binary search" down where the AVX-512 instruction is coming from, is there some way I can find it immediately, e.g., trapping on such an instruction?
操作系统是Ubuntu 16.04.
OS is Ubuntu 16.04.
推荐答案
如注释中所述,您可以搜索系统中的所有ELF文件并反汇编它们,以检查它们是否使用AVX-512指令:
As suggested in comments, you may search all ELF files of your system and disassemble them in order to check if they use AVX-512 instructions:
$ objdump -d /lib64/ld-linux-x86-64.so.2 | grep %zmm0
14922: 62 f1 fd 48 7f 44 24 vmovdqa64 %zmm0,0xc0(%rsp)
14a2d: 62 f1 fd 48 6f 44 24 vmovdqa64 0xc0(%rsp),%zmm0
14c2c: 62 f1 fd 48 7f 81 50 vmovdqa64 %zmm0,0x50(%rcx)
14ca0: 62 f1 fd 48 6f 84 24 vmovdqa64 0x50(%rsp),%zmm0
(顺便说一句,BTW,libc和ld.so确实包含AVX-512指令,不是您要的指令吗?)
(BTW, libc and ld.so do include AVX-512 instructions, they are not the ones you are looking for?)
但是,您可能会发现甚至无法执行的二进制文件以及动态未压缩的未命中的代码,等等.
However, you may find binary that you do not even execute and miss code dynamically uncompressed, etc...
如果您对进程有疑问(因为perf
报告CORE_POWER.LVL*_TURBO_LICENSE
事件),我建议在此过程中生成核心转储并将其反汇编(注意第一行还允许转储代码):
If you have a doubt on process (because perf
report CORE_POWER.LVL*_TURBO_LICENSE
events), I suggest to generate a core-dump if this process and disassemble it (notice first line allows to also dump code):
$ echo 0xFF > /proc/<PID>/coredump_filter
$ gdb --pid=<PID>
[...]
(gdb) gcore
Saved corefile core.19602
(gdb) quit
Detaching from program: ..., process ...
$ objdump -d core.19602 | grep %zmm0
7f73db8187cb: 62 f1 7c 48 10 06 vmovups (%rsi),%zmm0
7f73db818802: 62 f1 7c 48 11 07 vmovups %zmm0,(%rdi)
7f73db81883f: 62 f1 7c 48 10 06 vmovups (%rsi),%zmm0
[...]
接下来,您可以轻松编写一个小型python脚本,在每条AVX-512指令上添加一个断点(或跟踪点).像
Next, you can easily write a small python script to add a breakpoint (or a tracepoint) on every AVX-512 instructions. Something like
(gdb) python
>import os
>with os.popen('objdump -d core.19602 | grep %zmm0 | cut -f1 -d:') as pipe:
> for line in pipe:
> gdb.Breakpoint("*" + line)
确定会创建数百个(或数千个)断点.但是,断点的开销对于gdb来说足够小(我认为每个断点< 1kB).
Sure it will create multiple hundreds (or thousands) of breakpoints. However, overhead of a breakpoint is small enough for gdb to support that (I think <1kB for each breakpoint).
另一种方法是在虚拟机中运行代码.特别是,我建议使用libvex. libvex用于动态检测代码(内存泄漏,内存配置文件等). libvex解释机器代码,将其转换为中间表示形式,然后重新编码机器代码以供CPU执行.使用libvex的最著名的项目是valgrind(公平地说,libvex是valgrind的后端).
One another way would be to run your code in a a virtual machine. Especially, I suggest libvex. libvex is used to dynamically instrument code (memory leak, memory profiling, etc..). libvex interpret machine code, translate it to an intermediate representation and re-encode machine code for CPU execution. The most famous project using libvex is valgrind (to be fair, libvex is back-end of valgrind).
因此,您可以使用libvex运行应用程序,而无需进行任何操作:
Therefore, you can run your application with libvex without any instrumentation with:
$ valgrind --tool=none YOUR_APP
现在,您必须围绕libvex编写工具,以检测AVX-512的使用情况.但是,libVEX尚不支持AVX-512.因此,一旦必须执行AVX-512指令,它就会因一条非法指令而失败.
Now you have to write a tool around libvex in order to detect AVX-512 usage. However, libVEX does NOT (yet) support AVX-512. So, as soon as it have to execute AVX-512 instruction, it will fail with an Illegal instruction.
$ valgrind --tool=none YOUR_APP
[...]
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x48 0x28 0x84 0x24 0x8 0x1 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==20061== valgrind: Unrecognised instruction at address 0x10913e.
==20061== at 0x10913E: main (in ...)
==20061== Your program just tried to execute an instruction that Valgrind
==20061== did not recognise. There are two possible reasons for this.
==20061== 1. Your program has a bug and erroneously jumped to a non-code
==20061== location. If you are running Memcheck and you just saw a
==20061== warning about a bad jump, it's probably your program's fault.
==20061== 2. The instruction is legitimate but Valgrind doesn't handle it,
==20061== i.e. it's Valgrind's fault. If you think this is the case or
==20061== you are not sure, please let us know and we'll try to fix it.
==20061== Either way, Valgrind will now raise a SIGILL signal which will
==20061== probably kill your program.
==20061==
==20061== Process terminating with default action of signal 4 (SIGILL)
==20061== Illegal opcode at address 0x10913E
==20061== at 0x10913E: main (in ...)
==20061==
注意:此答案已经过测试:
Note: this answer has been tested with:
#include <immintrin.h>
int main(int argc, char *argv[]) {
__m512d a, b, c;
_mm512_fnmadd_pd(a, b, c);
}
这篇关于动态确定恶意AVX-512指令在哪里执行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!