Executing the assembly generated by Numba


Problem description

In a bizarre turn of events, I've ended up in a predicament where I'm using the following Python code to write the assembly generated by Numba to a file:

from numba import jit

@jit(nopython=True, nogil=True)
def six():
    return 6

six()  # call once so Numba compiles it; inspect_asm() is empty before that

with open("six.asm", "w") as f:
    for k, v in six.inspect_asm().items():
        f.write(v)

The assembly code is successfully written to the file but I can't figure out how to execute it. I've tried the following:

$ as -o six.o six.asm
$ ld six.o -o six.bin
$ chmod +x six.bin
$ ./six.bin

However, the linking step fails with the following:

ld: warning: cannot find entry symbol _start; defaulting to 00000000004000f0
six.o: In function `cpython::__main__::six$241':
<string>:(.text+0x20): undefined reference to `PyArg_UnpackTuple'
<string>:(.text+0x47): undefined reference to `PyEval_SaveThread'
<string>:(.text+0x53): undefined reference to `PyEval_RestoreThread'
<string>:(.text+0x62): undefined reference to `PyLong_FromLongLong'
<string>:(.text+0x74): undefined reference to `PyExc_RuntimeError'
<string>:(.text+0x88): undefined reference to `PyErr_SetString'

I suspect that Numba and/or the Python library needs to be dynamically linked with the generated object file for this to run successfully, but I'm not sure how that can be done (if it can be done at all).

I've also tried the following where I write the intermediate LLVM code to the file instead of the assembly:

with open("six.ll", "w") as f:
    for k, v in six.inspect_llvm().items():
        f.write(v)

And then

$ lli six.ll

But this fails as well with the following error:

'main' function not found in module.

UPDATE:

It turns out that there exists a utility to find the relevant flags to pass to the ld command to dynamically link the Python standard library.

$ python3-config --ldflags

Returns

-L/Users/rayan/anaconda3/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation 

Running the following again, this time with the correct flags:

$ as -o six.o six.asm
$ ld six.o -o six.bin -L/Users/rayan/anaconda3/lib/python3.7/config-3.7m-darwin -lpython3.7m -ldl -framework CoreFoundation 
$ chmod +x six.bin
$ ./six.bin

I am now getting

ld: warning: No version-min specified on command line
ld: entry point (_main) undefined. for inferred architecture x86_64

I have tried adding a _main label in the assembly file but that doesn't seem to do anything. Any ideas on how to define the entry point?

UPDATE 2:

Here's the assembly code, in case that's useful. It seems that the target function is the one with the label _ZN8__main__7six$241E:

    .text
    .file   "<string>"
    .globl  _ZN8__main__7six$241E
    .p2align    4, 0x90
    .type   _ZN8__main__7six$241E,@function
_ZN8__main__7six$241E:
    movq    $6, (%rdi)
    xorl    %eax, %eax
    retq
.Lfunc_end0:
    .size   _ZN8__main__7six$241E, .Lfunc_end0-_ZN8__main__7six$241E

    .globl  _ZN7cpython8__main__7six$241E
    .p2align    4, 0x90
    .type   _ZN7cpython8__main__7six$241E,@function
_ZN7cpython8__main__7six$241E:
    .cfi_startproc
    pushq   %rax
    .cfi_def_cfa_offset 16
    movq    %rsi, %rdi
    movabsq $.const.six, %rsi
    movabsq $PyArg_UnpackTuple, %r8
    xorl    %edx, %edx
    xorl    %ecx, %ecx
    xorl    %eax, %eax
    callq   *%r8
    testl   %eax, %eax
    je  .LBB1_3
    movabsq $_ZN08NumbaEnv8__main__7six$241E, %rax
    cmpq    $0, (%rax)
    je  .LBB1_2
    movabsq $PyEval_SaveThread, %rax
    callq   *%rax
    movabsq $PyEval_RestoreThread, %rcx
    movq    %rax, %rdi
    callq   *%rcx
    movabsq $PyLong_FromLongLong, %rax
    movl    $6, %edi
    popq    %rcx
    .cfi_def_cfa_offset 8
    jmpq    *%rax
.LBB1_2:
    .cfi_def_cfa_offset 16
    movabsq $PyExc_RuntimeError, %rdi
    movabsq $".const.missing Environment", %rsi
    movabsq $PyErr_SetString, %rax
    callq   *%rax
.LBB1_3:
    xorl    %eax, %eax
    popq    %rcx
    .cfi_def_cfa_offset 8
    retq
.Lfunc_end1:
    .size   _ZN7cpython8__main__7six$241E, .Lfunc_end1-_ZN7cpython8__main__7six$241E
    .cfi_endproc

    .globl  cfunc._ZN8__main__7six$241E
    .p2align    4, 0x90
    .type   cfunc._ZN8__main__7six$241E,@function
cfunc._ZN8__main__7six$241E:
    movl    $6, %eax
    retq
.Lfunc_end2:
    .size   cfunc._ZN8__main__7six$241E, .Lfunc_end2-cfunc._ZN8__main__7six$241E

    .type   _ZN08NumbaEnv8__main__7six$241E,@object
    .comm   _ZN08NumbaEnv8__main__7six$241E,8,8
    .type   .const.six,@object
    .section    .rodata,"a",@progbits
.const.six:
    .asciz  "six"
    .size   .const.six, 4

    .type   ".const.missing Environment",@object
    .p2align    4
.const.missing Environment:
    .asciz  "missing Environment"
    .size   ".const.missing Environment", 20


    .section    ".note.GNU-stack","",@progbits

Solution

After browsing [PyData.Numba]: Numba docs, plus some debugging and trial and error, I reached a conclusion: it seems you're off the path in your quest (as was also pointed out in the comments).

Numba converts Python code (functions) to machine code (for the obvious reason: speed). It does everything (convert, build, insert into the running process) on the fly; the programmer only needs to decorate the function with e.g. @numba.jit ([PyData.Numba]: Just-in-Time compilation).

The behavior that you're experiencing is correct. The Dispatcher object (obtained by decorating the six function) only generates (assembly) code for the function itself. There is no main there, because the code executes in the current process (under the Python interpreter's main function). So it's normal for the linker to complain that there's no main symbol. It's like writing a C file that only contains:

int six() {
    return 6;
}

In order for things to work properly, you have to:

  1. Build the .asm file into an .o (object) file (done)

  2. Put the .o file from #1. into a library, which can be:

    • Static
    • Dynamic


    The library is to be linked in the (final) executable. This step is optional as you could use the .o file directly

  3. Build another file that defines main (and calls six, which I assume is the whole purpose) into an .o file. As I'm not very comfortable with assembly, I wrote it in C

  4. Link the 2 entities (from #2. (#1.) and #3.) together

As an alternative, you could take a look at [PyData.Numba]: Compiling code ahead of time, but bear in mind that it would generate a Python (extension) module.

Back to the current problem. I tested this on Ubuntu 18.04 64bit.

code00.py:

#!/usr/bin/env python

import sys
import math
import numba


@numba.jit(nopython=True, nogil=True)
def six():
    return 6


def main(*argv):
    six()  # Call the function(s), otherwise `inspect_asm()` would return empty dict
    speed_funcs = [
        (six, numba.int32()),
    ]
    for func, _ in speed_funcs:
        file_name_asm = "numba_{0:s}_{1:s}_{2:03d}_{3:02d}{4:02d}{5:02d}.asm".format(func.__name__, sys.platform, int(round(math.log2(sys.maxsize))) + 1, *sys.version_info[:3])
        asm = func.inspect_asm()
        print("Writing to {0:s}:".format(file_name_asm))
        with open(file_name_asm, "wb") as fout:
            for k, v in asm.items():
                print("    {0:}".format(k))
                fout.write(v.encode())


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main(*sys.argv[1:])
    print("\nDone.")

main00.c:

#include <stdio.h>
#include <dlfcn.h>

//#define SYMBOL_SIX "_ZN8__main__7six$241E"
#define SYMBOL_SIX "cfunc._ZN8__main__7six$241E"

typedef int (*SixFuncPtr)();

int main() {
    void *pMod = dlopen("./libnumba_six_linux.so", RTLD_LAZY);
    if (!pMod) {
        printf("Error (%s) loading module\n", dlerror());
        return -1;
    }
    SixFuncPtr pSixFunc = dlsym(pMod, SYMBOL_SIX);
    if (!pSixFunc) {
        printf("Error (%s) loading function\n", dlerror());
        dlclose(pMod);
        return -2;
    }
    printf("six() returned: %d\n", (*pSixFunc)());
    dlclose(pMod);
    return 0;
}

build.sh:

CC=gcc

LIB_BASE_NAME=numba_six_linux

FLAG_LD_LIB_NUMBALINUX="-Wl,-L. -Wl,-l${LIB_BASE_NAME}"
FLAG_LD_LIB_PYTHON="-Wl,-L/usr/lib/python3.7/config-3.7m-x86_64-linux-gnu -Wl,-lpython3.7m"

rm -f *.asm *.o *.a *.so *.exe

echo Generate .asm
python3 code00.py

echo Assemble
as -o ${LIB_BASE_NAME}.o ${LIB_BASE_NAME}_064_030705.asm

echo Link library
LIB_NUMBA="./lib${LIB_BASE_NAME}.so"
#ar -scr ${LIB_NUMBA} ${LIB_BASE_NAME}.o
${CC} -o ${LIB_NUMBA} -shared ${LIB_BASE_NAME}.o ${FLAG_LD_LIB_PYTHON}

echo Dump library contents
nm -S ${LIB_NUMBA}
#objdump -t ${LIB_NUMBA}

echo Compile and link executable
${CC} -o main00.exe main00.c -ldl

echo Exit script

Output:

(py_venv_pc064_03.07.05_test0) [cfati@cfati-ubtu-18-064-00:~/Work/Dev/StackOverflow/q061678226]> ~/sopr.sh
*** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

[064bit prompt]>
[064bit prompt]> ls
build.sh  code00.py  main00.c
[064bit prompt]>
[064bit prompt]> ./build.sh
Generate .asm
Python 3.7.5 (default, Nov  7 2019, 10:50:52) [GCC 8.3.0] 64bit on linux

Writing to numba_six_linux_064_030705.asm:
    ()

Done.
Assemble
Link library
Dump library contents
0000000000201020 B __bss_start
00000000000008b0 0000000000000006 T cfunc._ZN8__main__7six$241E
0000000000201020 0000000000000001 b completed.7698
00000000000008e0 0000000000000014 r .const.missing Environment
00000000000008d0 0000000000000004 r .const.six
                 w __cxa_finalize
0000000000000730 t deregister_tm_clones
00000000000007c0 t __do_global_dtors_aux
0000000000200e58 t __do_global_dtors_aux_fini_array_entry
0000000000201018 d __dso_handle
0000000000200e60 d _DYNAMIC
0000000000201020 D _edata
0000000000201030 B _end
00000000000008b8 T _fini
0000000000000800 t frame_dummy
0000000000200e50 t __frame_dummy_init_array_entry
0000000000000990 r __FRAME_END__
0000000000201000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
00000000000008f4 r __GNU_EH_FRAME_HDR
00000000000006f0 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U PyArg_UnpackTuple
                 U PyErr_SetString
                 U PyEval_RestoreThread
                 U PyEval_SaveThread
                 U PyExc_RuntimeError
                 U PyLong_FromLongLong
0000000000000770 t register_tm_clones
0000000000201020 d __TMC_END__
0000000000201028 0000000000000008 B _ZN08NumbaEnv8__main__7six$241E
0000000000000820 0000000000000086 T _ZN7cpython8__main__7six$241E
0000000000000810 000000000000000a T _ZN8__main__7six$241E
Compile and link executable
Exit script
[064bit prompt]>
[064bit prompt]> ls
build.sh  code00.py  libnumba_six_linux.so  main00.c  main00.exe  numba_six_linux_064_030705.asm  numba_six_linux.o
[064bit prompt]>
[064bit prompt]> # Run the executable
[064bit prompt]>
[064bit prompt]> ./main00.exe
six() returned: 6
[064bit prompt]>
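The symbol names in the nm dump look cryptic, but they appear to follow a length-prefixed scheme: the ld error message in the question rendered _ZN7cpython8__main__7six$241E as cpython::__main__::six$241. Here is a minimal sketch that splits such names into their parts. It assumes that every name is "_ZN", then a run of <decimal length><text> pieces, then "E"; the helper name split_numba_symbol is made up for illustration and is not part of Numba.

```python
import re


def split_numba_symbol(symbol):
    """Split a '_ZN<len><text>...E' style name into its parts.

    Hypothetical helper (not a Numba API). Returns (prefix, parts), where
    prefix is "cfunc" for names that start with "cfunc." and "" otherwise.
    """
    prefix = ""
    if symbol.startswith("cfunc."):
        prefix = "cfunc"
        symbol = symbol[len("cfunc."):]
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a _ZN...E style name: " + symbol)

    body = symbol[3:-1]  # drop the leading "_ZN" and the trailing "E"
    parts = []
    pos = 0
    while pos < len(body):
        match = re.match(r"\d+", body[pos:])
        if match is None:
            raise ValueError("expected a length prefix at offset %d" % pos)
        length = int(match.group())
        pos += match.end()
        parts.append(body[pos:pos + length])
        pos += length
    return prefix, parts


if __name__ == "__main__":
    for name in (
        "_ZN8__main__7six$241E",
        "_ZN7cpython8__main__7six$241E",
        "cfunc._ZN8__main__7six$241E",
    ):
        print(name, "->", split_numba_symbol(name))
```

Run on the three function symbols from the dump, it shows that they share the __main__ and six$241 parts and differ only in the cpython part or the cfunc. prefix. A part that itself starts with a digit would confuse this simple parser, so treat it as a reading aid only.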

Also posting (since it's important) numba_six_linux_064_030705.asm:

    .text
    .file   "<string>"
    .globl  _ZN8__main__7six$241E
    .p2align    4, 0x90
    .type   _ZN8__main__7six$241E,@function
_ZN8__main__7six$241E:
    movq    $6, (%rdi)
    xorl    %eax, %eax
    retq
.Lfunc_end0:
    .size   _ZN8__main__7six$241E, .Lfunc_end0-_ZN8__main__7six$241E

    .globl  _ZN7cpython8__main__7six$241E
    .p2align    4, 0x90
    .type   _ZN7cpython8__main__7six$241E,@function
_ZN7cpython8__main__7six$241E:
    .cfi_startproc
    pushq   %rax
    .cfi_def_cfa_offset 16
    movq    %rsi, %rdi
    movabsq $.const.six, %rsi
    movabsq $PyArg_UnpackTuple, %r8
    xorl    %edx, %edx
    xorl    %ecx, %ecx
    xorl    %eax, %eax
    callq   *%r8
    testl   %eax, %eax
    je  .LBB1_3
    movabsq $_ZN08NumbaEnv8__main__7six$241E, %rax
    cmpq    $0, (%rax)
    je  .LBB1_2
    movabsq $PyEval_SaveThread, %rax
    callq   *%rax
    movabsq $PyEval_RestoreThread, %rcx
    movq    %rax, %rdi
    callq   *%rcx
    movabsq $PyLong_FromLongLong, %rax
    movl    $6, %edi
    popq    %rcx
    .cfi_def_cfa_offset 8
    jmpq    *%rax
.LBB1_2:
    .cfi_def_cfa_offset 16
    movabsq $PyExc_RuntimeError, %rdi
    movabsq $".const.missing Environment", %rsi
    movabsq $PyErr_SetString, %rax
    callq   *%rax
.LBB1_3:
    xorl    %eax, %eax
    popq    %rcx
    .cfi_def_cfa_offset 8
    retq
.Lfunc_end1:
    .size   _ZN7cpython8__main__7six$241E, .Lfunc_end1-_ZN7cpython8__main__7six$241E
    .cfi_endproc

    .globl  cfunc._ZN8__main__7six$241E
    .p2align    4, 0x90
    .type   cfunc._ZN8__main__7six$241E,@function
cfunc._ZN8__main__7six$241E:
    movl    $6, %eax
    retq
.Lfunc_end2:
    .size   cfunc._ZN8__main__7six$241E, .Lfunc_end2-cfunc._ZN8__main__7six$241E

    .type   _ZN08NumbaEnv8__main__7six$241E,@object
    .comm   _ZN08NumbaEnv8__main__7six$241E,8,8
    .type   .const.six,@object
    .section    .rodata,"a",@progbits
.const.six:
    .asciz  "six"
    .size   .const.six, 4

    .type   ".const.missing Environment",@object
    .p2align    4
".const.missing Environment":
    .asciz  "missing Environment"
    .size   ".const.missing Environment", 20


    .section    ".note.GNU-stack","",@progbits

Notes:

  • numba_six_linux_064_030705.asm (and everything that derives from it) contains the code for the six function. Actually, it contains several symbols (on OSX, you can also use the native otool -T), such as:

    1. cfunc._ZN8__main__7six$241E - the (C) function itself

    2. _ZN7cpython8__main__7six$241E - the Python wrapper:

      1. Performs the C <=> Python conversions (via Python API functions like PyArg_UnpackTuple)
      2. Due to #1. it needs (depends on) libpython3.7m
      3. As a consequence, nopython=True does not remove this dependency: the wrapper is generated anyway

    Also, the main part of these symbols doesn't refer to an executable entry point (main function), but to a Python module's top level namespace (__main__). After all, this code is supposed to be run from Python

  • Because the plain C function contains a dot (.) in its name, I couldn't call it directly from C (it's an invalid identifier name), so I had to load the .so and the function manually (dlopen / dlsym), resulting in more code than simply calling the function.
    I didn't try it, but I think the following (manual) changes to the generated .asm file would simplify the work:

    • Renaming the plain C function in the .asm file before assembling it (to something like __six, or any other valid C identifier that doesn't clash with another explicit or internal name) would make the function directly callable from C
    • Removing the Python wrapper (#2.) would also get rid of #2.2. (the dependency on libpython3.7m)
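The renaming idea from the first sub-bullet could be sketched as a small text substitution over the .asm file. Like the idea itself, this is untested against a real build; the helper name rename_symbol and the replacement name numba_six are made up for illustration.

```python
import re


def rename_symbol(asm_text, old_name, new_name):
    """Replace every whole-symbol occurrence of old_name in assembly text.

    Hypothetical helper: it only edits text and knows nothing about the
    assembler. new_name must be a valid C identifier so that C code can
    refer to it directly.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", new_name):
        raise ValueError("not a valid C identifier: " + new_name)
    # Match old_name only when it is not embedded in a longer symbol
    # (symbols here may contain letters, digits, '_', '.' and '$').
    pattern = r"(?<![\w.$])" + re.escape(old_name) + r"(?![\w.$])"
    return re.sub(pattern, lambda match: new_name, asm_text)


if __name__ == "__main__":
    sample = (
        "\t.globl\tcfunc._ZN8__main__7six$241E\n"
        "cfunc._ZN8__main__7six$241E:\n"
        "\tmovl\t$6, %eax\n"
        "\tretq\n"
    )
    print(rename_symbol(sample, "cfunc._ZN8__main__7six$241E", "numba_six"))
```

After such a rename, and assuming the file still assembles, the C side could presumably declare the function as extern int numba_six(void); and call it without dlopen / dlsym.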


Update #0

Thanks to @PeterCordes, who shared that exact piece of info ([GNU.GCC]: Controlling Names Used in Assembler Code) that I was missing, here's a much simpler version.

main01.c:

#include <stdio.h>

extern int six() asm ("cfunc._ZN8__main__7six$241E");

int main() {
    printf("six() returned: %d\n", six());
}

Output:

[064bit prompt]> # Resume from previous point + main01.c
[064bit prompt]>
[064bit prompt]> ls
build.sh  code00.py  libnumba_six_linux.so  main00.c  main00.exe  main01.c  numba_six_linux_064_030705.asm  numba_six_linux.o
[064bit prompt]>
[064bit prompt]> ar -scr libnumba_six_linux.a numba_six_linux.o
[064bit prompt]>
[064bit prompt]> gcc -o main01.exe main01.c ./libnumba_six_linux.a -Wl,-L/usr/lib/python3.7/config-3.7m-x86_64-linux-gnu -Wl,-lpython3.7m
[064bit prompt]>
[064bit prompt]> ls
build.sh  code00.py  libnumba_six_linux.a  libnumba_six_linux.so  main00.c  main00.exe  main01.c  main01.exe  numba_six_linux_064_030705.asm  numba_six_linux.o
[064bit prompt]>
[064bit prompt]> ./main01.exe
six() returned: 6
[064bit prompt]>
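For comparison, the dlopen / dlsym lookup from main00.c can also be sketched from Python with the standard ctypes module, where the dotted symbol name is just a string. This assumes the libnumba_six_linux.so built by build.sh is in the current directory and that the cfunc symbol returns a C int (as main00.c assumes); the function name call_six is made up for illustration.

```python
import ctypes


def call_six(lib_path="./libnumba_six_linux.so",
             symbol="cfunc._ZN8__main__7six$241E"):
    """Load the shared library and call the plain C function by name.

    Hypothetical helper mirroring main00.c. Raises OSError if the library
    cannot be loaded and AttributeError if the symbol is not found.
    """
    lib = ctypes.CDLL(lib_path)   # counterpart of dlopen()
    func = lib[symbol]            # counterpart of dlsym(); any string works
    func.restype = ctypes.c_int   # assumed return type, as in main00.c
    func.argtypes = []            # the function takes no arguments
    return func()


if __name__ == "__main__":
    try:
        print("six() returned:", call_six())
    except OSError as exc:
        print("Error loading module:", exc)
```

This only illustrates the symbol lookup. From within Python, calling the jitted six() directly is of course simpler.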
