Generated code not matching expectations with Extended ASM


Problem description

I have a CpuFeatures class. The requirements for the class are simple: (1) preserve EBX or RBX, and (2) record the values returned from CPUID in EAX/EBX/ECX/EDX. I'm not sure the code being generated is the code I intended.

The CpuFeatures class code uses GCC Extended ASM. Here's the relevant code:

struct CPUIDinfo
{
    word32 EAX;
    word32 EBX;
    word32 ECX;
    word32 EDX;
};

bool CpuId(word32 func, word32 subfunc, CPUIDinfo& info)
{
    uintptr_t scratch;

    __asm__ __volatile__ (

        ".att_syntax \n"

#if defined(__x86_64__)
        "\t xchgq %%rbx, %q1 \n"
#else
        "\t xchgl %%ebx, %k1 \n"
#endif

        "\t cpuid \n"

#if defined(__x86_64__)
        "\t xchgq %%rbx, %q1 \n"
#else
        "\t xchgl %%ebx, %k1 \n"
#endif

      : "=a"(info.EAX), "=&r"(scratch), "=c"(info.ECX), "=d"(info.EDX)
      : "a"(func), "c"(subfunc)
    );

    if(func == 0)
        return !!info.EAX;

    return true;
}

The code below was compiled with -g3 -Og on Cygwin i386. When I examine it under a debugger, I don't like what I am seeing.

Dump of assembler code for function CpuFeatures::DoDetectX86Features():
   ...
   0x0048f355 <+1>:     sub    $0x48,%esp
=> 0x0048f358 <+4>:     mov    $0x0,%ecx
   0x0048f35d <+9>:     mov    %ecx,%eax
   0x0048f35f <+11>:    xchg   %ebx,%ebx
   0x0048f361 <+13>:    cpuid
   0x0048f363 <+15>:    xchg   %ebx,%ebx
   0x0048f365 <+17>:    mov    %eax,0x10(%esp)
   0x0048f369 <+21>:    mov    %ecx,0x18(%esp)
   0x0048f36d <+25>:    mov    %edx,0x1c(%esp)
   0x0048f371 <+29>:    mov    %ebx,0x14(%esp)
   0x0048f375 <+33>:    test   %eax,%eax
   ...

I don't like what I am seeing because it appears EBX/RBX is not being preserved (xchg %ebx,%ebx at +11 exchanges the register with itself). Additionally, it looks like the preserved EBX/RBX is being saved as the result of CPUID, rather than the actual value of EBX returned by CPUID (xchg %ebx,%ebx at +15, before the mov %ebx,0x14(%esp) at +29).

If I change the operand to use a memory op with "=&m"(scratch), then the generated code is:

0x0048f35e <+10>:    xchg   %ebx,0x40(%esp)
0x0048f362 <+14>:    cpuid
0x0048f364 <+16>:    xchg   %ebx,0x40(%esp)

A related question: with Extended ASM, what ensures that reads and writes of the operands happen at the required times?

What am I doing wrong (besides wasting countless hours on something that should have taken 5 or 15 minutes)?

Recommended answer

The code below is a complete example that I used to compile your example code above, modified to exchange (swap) directly with the info.EBX variable.

#include <inttypes.h>
#define word32 uint32_t

struct CPUIDinfo
{
    word32 EAX;
    word32 EBX;
    word32 ECX;
    word32 EDX;
};

bool CpuId(word32 func, word32 subfunc, CPUIDinfo& info)
{
    __asm__ __volatile__ (

        ".att_syntax \n"

#if defined(__x86_64__)
        "\t xchgq %%rbx, %q1 \n"
#else
        "\t xchgl %%ebx, %k1 \n"
#endif

        "\t cpuid \n"

#if defined(__x86_64__)
        "\t xchgq %%rbx, %q1 \n"
#else
        "\t xchgl %%ebx, %k1 \n"
#endif

      : "=a"(info.EAX), "=&m"(info.EBX), "=c"(info.ECX), "=d"(info.EDX)
      : "a"(func), "c"(subfunc)
    );

    if(func == 0)
        return !!info.EAX;

    return true;
}

int main()
{
    CPUIDinfo  cpuInfo;
    CpuId(1, 0, cpuInfo);
}

The first thing to notice is that I chose to use the info.EBX memory location as the target of the swap. This eliminates the need for another temporary variable or register.

I compiled it as 32-bit code with -g3 -Og -S -m32 and got these instructions of interest:

xchgl %ebx, 4(%edi)
cpuid
xchgl %ebx, 4(%edi)

movl    %eax, (%edi)
movl    %ecx, 8(%edi)
movl    %edx, 12(%edi)

%edi happens to contain the address of the info structure, so 4(%edi) is the address of info.EBX. The second xchgl swaps %ebx and 4(%edi) after cpuid. With that instruction, ebx is restored to the value it had before cpuid, and 4(%edi) now holds the value that cpuid left in ebx. The remaining movl lines store the eax, ecx, and edx registers into the rest of the info structure through the %edi register.

The generated code above is what I would expect it to be.
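One way to double-check values like these is to read the same CPUID leaf through a compiler-provided helper and compare the results in the debugger. The following is only a sketch under assumptions: it assumes GCC or Clang, whose `<cpuid.h>` header provides the `__cpuid_count` macro, and the function name `ReferenceVendorString`, the assert, and the empty-string fallback on non-x86 targets are hypothetical choices of mine, not part of the original code.

```cpp
#include <cassert>
#include <cstring>
#include <string>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   // GCC/Clang helper header (assumed to be available)
#endif

// Hypothetical helper: returns the 12-byte vendor string from CPUID leaf 0,
// or an empty string on targets where CPUID does not exist.
std::string ReferenceVendorString()
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    __cpuid_count(0, 0, eax, ebx, ecx, edx);

    // Assumption: leaf 0 reports a non-zero highest basic leaf in EAX.
    assert(eax != 0);

    // The vendor string is stored in EBX, EDX, ECX, in that order.
    char buf[12];
    std::memcpy(buf + 0, &ebx, 4);
    std::memcpy(buf + 4, &edx, 4);
    std::memcpy(buf + 8, &ecx, 4);
    return std::string(buf, sizeof(buf));
#else
    return std::string();
#endif
}
```

If a hand-written wrapper is called with func 0, its EBX, EDX, and ECX fields should spell out the same 12 bytes. A mismatch in EBX alone would suggest that the saved register, not the CPUID result, ended up in the structure.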

In your code, the scratch variable (with the constraint "=&m"(scratch)) is never used after the assembler template, so 0x40(%esp) holds the value you want but it never gets moved anywhere useful. You'd have to copy the scratch variable into info.EBX (i.e. info.EBX = scratch;) and then look at all of the resulting instructions. Somewhere among the generated assembly instructions, the data would be copied from the scratch memory location to info.EBX.
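The copy described above can be sketched as follows. This is a minimal sketch, assuming GCC or Clang on x86. The typedef for word32, the pointer-width scratch slot, the assert, and the fallback that returns false on non-x86 targets are my assumptions and are not taken from the original code.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t word32;   // assumption: word32 is a 32-bit unsigned type

struct CPUIDinfo
{
    word32 EAX;
    word32 EBX;
    word32 ECX;
    word32 EDX;
};

// Returns false on non-x86 targets, where CPUID does not exist.
bool CpuId(word32 func, word32 subfunc, CPUIDinfo& info)
{
#if defined(__x86_64__) || defined(__i386__)
    // Pointer-width stack slot, so the 64-bit xchgq stays inside the operand.
    uintptr_t scratch = 0;

    __asm__ __volatile__ (
#if defined(__x86_64__)
        "xchgq %%rbx, %1 \n\t"   // save RBX in scratch
        "cpuid           \n\t"
        "xchgq %%rbx, %1 \n\t"   // restore RBX, scratch = CPUID's EBX
#else
        "xchgl %%ebx, %1 \n\t"   // save EBX in scratch
        "cpuid           \n\t"
        "xchgl %%ebx, %1 \n\t"   // restore EBX, scratch = CPUID's EBX
#endif
      : "=a"(info.EAX), "=&m"(scratch), "=c"(info.ECX), "=d"(info.EDX)
      : "a"(func), "c"(subfunc)
    );

    // CPUID writes 32-bit registers, so the upper half of RBX is zero.
    assert(scratch <= UINT32_MAX);

    // Without this copy, the EBX result stays in scratch and is lost.
    info.EBX = static_cast<word32>(scratch);

    if (func == 0)
        return !!info.EAX;
    return true;
#else
    (void)func; (void)subfunc; (void)info;
    return false;
#endif
}
```

Calling CpuId(0, 0, info) and then inspecting info.EBX in the debugger should show the first four bytes of the vendor string, and the disassembly should contain a move from the scratch slot into the structure.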

Update - Cygwin and MinGW

I wasn't entirely satisfied that the Cygwin code output was correct. In the middle of the night I had an Aha! moment. Windows already handles position independence itself: when the dynamic link loader loads an image (DLL etc.), it modifies the image via re-basing. There is no need for the additional PIC processing that is done for 32-bit Linux shared libraries, so there is no issue with ebx/rbx. This is why Cygwin and MinGW present warnings like this when compiling with -fPIC:

warning: -fPIC ignored for target (all code is position independent)

This is because under Windows all 32-bit code can be re-based when it is loaded by the Windows dynamic loader. More about re-basing can be found in this Dr. Dobb's article. Information on the Windows Portable Executable (PE) format can be found in this Wiki article. Cygwin and MinGW don't need to worry about preserving ebx/rbx when targeting 32-bit code because on their platforms PIC is already handled by the OS, other re-basing tools, and the linker.
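If the swap is only needed where EBX is reserved for PIC, one option is to select it at compile time. The sketch below assumes GCC or Clang, and it assumes that the toolchain defines `__PIC__` only for targets where position-independent code is actually generated through EBX. I have not verified how Cygwin or MinGW define that macro, so treat the condition as an assumption. The helper `CheckCpuId` and the fallback for non-x86 targets are hypothetical additions.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t word32;   // assumption: word32 is a 32-bit unsigned type

struct CPUIDinfo
{
    word32 EAX;
    word32 EBX;
    word32 ECX;
    word32 EDX;
};

// Returns false on non-x86 targets, where CPUID does not exist.
bool CpuId(word32 func, word32 subfunc, CPUIDinfo& info)
{
#if !defined(__i386__) && !defined(__x86_64__)
    (void)func; (void)subfunc; (void)info;
    return false;
#else
#if defined(__i386__) && defined(__PIC__)
    // 32-bit PIC build: EBX may hold the GOT pointer, so swap it around CPUID.
    __asm__ __volatile__ (
        "xchgl %%ebx, %1 \n\t"
        "cpuid           \n\t"
        "xchgl %%ebx, %1 \n\t"
      : "=a"(info.EAX), "=&r"(info.EBX), "=c"(info.ECX), "=d"(info.EDX)
      : "a"(func), "c"(subfunc)
    );
#else
    // EBX/RBX is an ordinary register here; "=b" lets the compiler save it.
    __asm__ __volatile__ (
        "cpuid"
      : "=a"(info.EAX), "=b"(info.EBX), "=c"(info.ECX), "=d"(info.EDX)
      : "a"(func), "c"(subfunc)
    );
#endif
    if (func == 0)
        return !!info.EAX;
    return true;
#endif
}

// Hypothetical self-check: on x86, leaf 0 returns the first four
// vendor-string bytes in EBX, which should not be zero.
void CheckCpuId()
{
    CPUIDinfo info = {0, 0, 0, 0};
    if (CpuId(0, 0, info))
        assert(info.EBX != 0);
}
```

With the "=b" constraint the compiler knows EBX is written and saves or restores it as the calling convention requires, so no hand-written exchange is needed on targets that do not reserve the register.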
