How to access C struct/variables from inline asm?


Question

Consider the following code:

    int bn_div(bn_t *bn1, bn_t *bn2, bn_t *bnr)
  {
    uint32 q, m;        /* Division Result */
    uint32 i;           /* Loop Counter */
    uint32 j;           /* Loop Counter */

    /* Check Input */
    if (bn1 == NULL) return(EFAULT);
    if (bn1->dat == NULL) return(EFAULT);
    if (bn2 == NULL) return(EFAULT);
    if (bn2->dat == NULL) return(EFAULT);
    if (bnr == NULL) return(EFAULT);
    if (bnr->dat == NULL) return(EFAULT);


    #if defined(__i386__) || defined(__amd64__)
    __asm__ (".intel_syntax noprefix");
    __asm__ ("pushl %eax");
    __asm__ ("pushl %edx");
    __asm__ ("pushf");
    __asm__ ("movl %eax, (bn1->dat[i])");
    __asm__ ("xorl %edx, %edx");
    __asm__ ("divl (bn2->dat[j])");
    __asm__ ("movl (q), %eax");
    __asm__ ("movl (m), %edx");
    __asm__ ("popf");
    __asm__ ("popl %edx");
    __asm__ ("popl %eax");
    #else
    q = bn1->dat[i] / bn2->dat[j];
    m = bn1->dat[i] % bn2->dat[j];
    #endif
    /* Return */
    return(0);
  }

The data type uint32 is basically an unsigned long int or a uint32_t unsigned 32-bit integer. The type bnint is either an unsigned short int (uint16_t) or a uint32_t, depending on whether 64-bit data types are available. If 64-bit is available, then bnint is a uint32, otherwise it's a uint16. This was done in order to capture carry/overflow in other parts of the code. The structure bn_t is defined as follows:

typedef struct bn_data_t bn_t;
struct bn_data_t
  {
    uint32 sz1;         /* Bit Size */
    uint32 sz8;         /* Byte Size */
    uint32 szw;         /* Word Count */
    bnint *dat;         /* Data Array */
    uint32 flags;       /* Operational Flags */
  };

The function starts on line 300 in my source code. So when I try to compile/make it, I get the following errors:

system:/home/user/c/m3/bn 1036 $$$ ->make
clang -I. -I/home/user/c/m3/bn/.. -I/home/user/c/m3/bn/../include  -std=c99 -pedantic -Wall -Wextra -Wshadow -Wpointer-arith -Wcast-align -Wstrict-prototypes  -Wmissing-prototypes -Wnested-externs -Wwrite-strings -Wfloat-equal  -Winline -Wunknown-pragmas -Wundef -Wendif-labels  -c /home/user/c/m3/bn/bn.c
/home/user/c/m3/bn/bn.c:302:12: warning: unused variable 'q' [-Wunused-variable]
    uint32 q, m;        /* Division Result */
           ^
/home/user/c/m3/bn/bn.c:302:15: warning: unused variable 'm' [-Wunused-variable]
    uint32 q, m;        /* Division Result */
              ^
/home/user/c/m3/bn/bn.c:303:12: warning: unused variable 'i' [-Wunused-variable]
    uint32 i;           /* Loop Counter */
           ^
/home/user/c/m3/bn/bn.c:304:12: warning: unused variable 'j' [-Wunused-variable]
    uint32 j;           /* Loop Counter */
           ^
/home/user/c/m3/bn/bn.c:320:14: error: unknown token in expression
    __asm__ ("movl %eax, (bn1->dat[i])");
             ^
<inline asm>:1:18: note: instantiated into assembly here
        movl %eax, (bn1->dat[i])
                        ^
/home/user/c/m3/bn/bn.c:322:14: error: unknown token in expression
    __asm__ ("divl (bn2->dat[j])");
             ^
<inline asm>:1:12: note: instantiated into assembly here
        divl (bn2->dat[j])
                  ^
4 warnings and 2 errors generated.
*** [bn.o] Error code 1

Stop in /home/user/c/m3/bn.
system:/home/user/c/m3/bn 1037 $$$ ->

What I know:

I consider myself to be fairly well versed in x86 assembler (as evidenced from the code that I wrote above). However, the last time that I mixed a high level language and assembler was using Borland Pascal about 15-20 years ago when writing graphics drivers for games (pre-Windows 95 era). My familiarity is with Intel syntax.

What I don't know:

How do I access members of bn_t (especially *dat) from asm? Since *dat is a pointer to uint32, I am accessing the elements as an array (e.g. bn1->dat[i]).

How do I access local variables that are declared on the stack?

I am using push/pop to restore clobbered registers to their previous values so as to not upset the compiler. However, do I also need to include the volatile keyword on the local variables as well?

Or, is there a better way that I am not aware of? I don't want to put this in a separate function call because of the calling overhead as this function is performance critical.

Other:

Right now, I'm just starting to write this function, so it is nowhere near complete. There are missing loops and other such support/glue code. But the main gist is accessing local variables/structure elements.

Edit 1:

The syntax that I am using seems to be the only one that clang supports. I tried the following code and clang gave me all sorts of errors:

__asm__ ("pushl %%eax",
    "pushl %%edx",
    "pushf",
    "movl (bn1->dat[i]), %%eax",
    "xorl %%edx, %%edx",
    "divl ($0x0c + bn2 + j)",
    "movl %%eax, (q)",
    "movl %%edx, (m)",
    "popf",
    "popl %%edx",
    "popl %%eax"
    );

It wants me to put a closing parenthesis on the first line, replacing the comma. I switched to using %% instead of % because I read somewhere that inline assembly requires %% to denote CPU registers, and clang was telling me that I was using an invalid escape sequence.

Recommended answer

If you only need 32b / 32b => 32bit division, let the compiler use both outputs of div, which gcc, clang and icc all do just fine, as you can see on the Godbolt compiler explorer:

uint32_t q = bn1->dat[i] / bn2->dat[j];
uint32_t m = bn1->dat[i] % bn2->dat[j];

Compilers are quite good at CSEing that into one div. Just make sure you don't store the division result somewhere that gcc can't prove won't affect the input of the remainder.

e.g. *m = dat[i] / dat[j] might overlap (alias) dat[i] or dat[j], so gcc would have to reload the operands and redo the div for the % operation. See the godbolt link for bad/good examples.
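
To make the aliasing point concrete, here is a minimal sketch in C. The function names and signatures are hypothetical and not taken from the question or the godbolt link; the only claim is that the second form gives the compiler the freedom to use one div, not that every compiler version will do so.

```c
#include <stdint.h>

/* Hypothetical helper: results are written straight through pointers.
   *q might alias dat[i] or dat[j], so after the first store the compiler
   has to reload the operands and may need a second div for the remainder. */
void divmod_may_alias(const uint32_t *dat, uint32_t i, uint32_t j,
                      uint32_t *q, uint32_t *m)
{
    *q = dat[i] / dat[j];
    *m = dat[i] % dat[j];
}

/* Same computation through locals: nothing is stored between the / and
   the %, so both results can come from a single div instruction. */
void divmod_locals(const uint32_t *dat, uint32_t i, uint32_t j,
                   uint32_t *q, uint32_t *m)
{
    uint32_t a = dat[i];
    uint32_t b = dat[j];
    uint32_t quot = a / b;
    uint32_t rem = a % b;
    *q = quot;   /* stores happen only after both results exist */
    *m = rem;
}
```

Both functions return the same values when the outputs do not overlap the inputs; the difference shows up only in the generated asm.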

Using inline asm for 32bit / 32bit = 32bit div doesn't gain you anything, and actually makes worse code with clang (see the godbolt link).

If you need 64bit / 32bit = 32bit, you probably need asm, though, if there isn't a compiler built-in for it. (GNU C doesn't have one, AFAICT). The obvious way in C (casting operands to uint64_t) generates a call to a 64bit/64bit = 64bit libgcc function, which has branches and multiple div instructions. gcc isn't good at proving that the result will fit in 32 bits, which is what it would need to know before it could use a single div instruction without risking a #DE.
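
A hedged sketch of what such a 64bit / 32bit = 32bit wrapper could look like with GNU C extended asm. The name div64_32 and its signature are assumptions made for illustration; the caller is responsible for guaranteeing hi < d, because otherwise the quotient overflows 32 bits and div faults. A plain C fallback is included for other targets.

```c
#include <stdint.h>

/* Hypothetical helper: divides the 64-bit value hi:lo by d.
   Precondition (not checked): hi < d, otherwise the quotient does not
   fit in 32 bits and the x86 div instruction raises #DE. */
static inline uint32_t div64_32(uint32_t hi, uint32_t lo, uint32_t d,
                                uint32_t *rem)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t q, r;
    __asm__ ("divl %[d]"
             : "=a" (q), "=d" (r)      /* quotient in EAX, remainder in EDX */
             : "a" (lo), "d" (hi),     /* dividend is EDX:EAX */
               [d] "rm" (d)            /* divisor: register or memory */
             : "cc");
    *rem = r;
    return q;
#else
    /* Portable fallback: same result, but not a single instruction. */
    uint64_t n = ((uint64_t)hi << 32) | lo;
    *rem = (uint32_t)(n % d);
    return (uint32_t)(n / d);
#endif
}
```

For example, div64_32(1, 0, 2, &r) divides 2^32 by 2 and yields 0x80000000 with remainder 0.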

For a lot of other instructions, you can avoid writing inline asm a lot of the time with builtin functions for things like popcount. With -mpopcnt, it compiles to the popcnt instruction (and accounts for the false-dependency on the output operand that Intel CPUs have.) Without, it compiles to a libgcc function call.
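
As a small illustration of the builtin route, the sketch below wraps GNU C's __builtin_popcount. The wrapper name bits_set is hypothetical, and the non-GNU fallback loop is an addition for portability rather than something from the answer.

```c
#include <stdint.h>

/* Hypothetical wrapper: number of set bits in a 32-bit word. */
static inline unsigned bits_set(uint32_t x)
{
#if defined(__GNUC__)
    /* With -mpopcnt this can become a single popcnt instruction;
       without it the compiler falls back to a library routine. */
    return (unsigned)__builtin_popcount(x);
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
#endif
}
```

Because the compiler understands the builtin, a call such as bits_set(0xF0F0) with a constant argument can be folded at compile time, which inline asm would prevent.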

Always prefer builtins, or pure C that compiles to good asm, so the compiler knows what the code does. When inlining makes some of the arguments known at compile-time, pure C can be optimized away or simplified, but code using inline asm will just load constants into registers and do a div at run-time. Inline asm also defeats CSE between similar computations on the same data, and of course can't auto-vectorize.

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html explains how to tell the compiler which variables you want in registers, and what the outputs are.
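
Applied to the question, this means a struct member is never named inside the asm string; it is passed as an operand and the compiler substitutes a register or an addressing mode. A minimal sketch, assuming a hypothetical struct demo_bn that mirrors only the dat member of bn_t:

```c
#include <stdint.h>

/* Hypothetical stand-in for bn_t: only the data pointer is modeled. */
struct demo_bn {
    uint32_t *dat;
};

/* Loads bn->dat[i] through inline asm. The "m" constraint asks the
   compiler to provide the memory operand; "=r" lets it pick any
   general-purpose register for the result. */
static inline uint32_t load_word(const struct demo_bn *bn, uint32_t i)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t out;
    __asm__ ("movl %[src], %[dst]"
             : [dst] "=r" (out)
             : [src] "m" (bn->dat[i]));
    return out;
#else
    return bn->dat[i];   /* fallback for other targets */
#endif
}
```

Local variables work the same way: list them as operands and refer to them by operand name, without push/pop around the asm.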

You can use Intel/MASM-like syntax and mnemonics, and non-% register names if you like, preferably by compiling with -masm=intel. The AT&T syntax bug (fsub and fsubr mnemonics are reversed) might still be present in intel-syntax mode; I forget.

Most software projects that use GNU C inline asm use AT&T syntax only.

See also the bottom of this answer (http://stackoverflow.com/questions/34520013/using-base-pointer-register-in-c-inline-asm/34522750#34522750) for more GNU C inline asm info, and the x86 tag wiki.

An asm statement takes one string arg, and 3 sets of constraints. The easiest way to make it multi-line is by making each asm line a separate string ending with \n, and let the compiler implicitly concatenate them.
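
A short sketch of that layout: adjacent string literals form the single template string, followed by the three colon-separated sets (outputs, inputs, clobbers). The function add_then_double is a made-up example, chosen only because it needs two instructions.

```c
#include <stdint.h>

/* Hypothetical example: computes (a + b) * 2 modulo 2^32. */
static inline uint32_t add_then_double(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl %[b], %[a]\n\t"     /* a += b */
             "addl %[a], %[a]\n\t"     /* a += a */
             : [a] "+r" (a)            /* outputs: a is read and written */
             : [b] "rm" (b)            /* inputs */
             : "cc");                  /* clobbers: flags */
    return a;
#else
    return (a + b) * 2u;               /* fallback for other targets */
#endif
}
```

This is also why the comma-separated strings in the question's Edit 1 were rejected: commas make them separate arguments, whereas adjacent literals without commas are concatenated into one.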

Also, you tell the compiler which registers you want things in. Then, if variables are already in registers, the compiler doesn't have to spill them to memory just so your asm can load and store them. Forcing that round trip would really be shooting yourself in the foot. The tutorial Brett Hale linked in the comments hopefully covers all this.

You can see the compiler asm output for this on godbolt.

uint32_t q, m;  // this is unsigned int on every compiler that supports x86 inline asm with this syntax, but not when writing portable code.

asm ("divl %[bn2dat_j]\n"
      : "=a" (q), "=d" (m) // results are in eax, edx registers
      : "d" (0),           // zero edx for us, please
        "a" (bn1->dat[i]), // "a" means EAX / RAX
        [bn2dat_j] "mr" (bn2->dat[j]) // register or memory, compiler chooses which is more efficient
      : // no register clobbers, and we don't read/write "memory" other than operands
    );

"divl %4" would have worked too, but named inputs/outputs don't change name when you add more input/output constraints.

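To compare the two operand styles side by side, the sketch below wraps the same div in two hypothetical functions, one using a named operand and one using the positional %4. The function names and the C fallback are assumptions for illustration; the constraints follow the snippet above.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define DEMO_HAVE_X86_ASM 1
#else
#define DEMO_HAVE_X86_ASM 0
#endif

/* Named operand: the template keeps working if operands are added. */
static inline uint32_t divmod32_named(uint32_t n, uint32_t d, uint32_t *rem)
{
#if DEMO_HAVE_X86_ASM
    uint32_t q, m;
    __asm__ ("divl %[divisor]"
             : "=a" (q), "=d" (m)
             : "d" (0), "a" (n), [divisor] "rm" (d)
             : "cc");
    *rem = m;
    return q;
#else
    *rem = n % d;
    return n / d;
#endif
}

/* Positional operand: outputs are %0 and %1, inputs are %2, %3 and %4,
   so the divisor is %4. Adding an operand before it would change that. */
static inline uint32_t divmod32_positional(uint32_t n, uint32_t d,
                                           uint32_t *rem)
{
#if DEMO_HAVE_X86_ASM
    uint32_t q, m;
    __asm__ ("divl %4"
             : "=a" (q), "=d" (m)
             : "d" (0), "a" (n), "rm" (d)
             : "cc");
    *rem = m;
    return q;
#else
    *rem = n % d;
    return n / d;
#endif
}
```

Both variants should assemble to the same instruction; the named form is simply easier to maintain.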