Interfacing Python and Torch7 (Lua) via a shared library

Question

I am trying to pass data (arrays) between Python and Lua, and I want to manipulate the data in Lua using the Torch7 framework. I figured this is best done through C, since both Python and Lua can interface with C. This approach is also fast and needs no data copying, because only pointers are passed.

I implemented two programs: one where Lua is embedded in C, and one where Python passes data to C. Both work when compiled to executable binaries. However, when the C-to-Lua program is built as a shared library instead, things stop working.

The details: I'm using 64-bit Ubuntu 14.04 and 12.04, with LuaJIT 2.0.2 (Lua 5.1) installed in /usr/local/. Dependency libraries are in /usr/local/lib and headers are in /usr/local/include. I'm using Python 2.7.

The code for the C-to-Lua program is:

tensor.lua

require 'torch'

function hi_tensor(t)
   print('Hi from lua')
   torch.setdefaulttensortype('torch.FloatTensor')
   print(t)
   return t*2
end

cluaf.h

void multiply (float* array, int m, int n, float *result, int m1, int n1);

cluaf.c

#include <stdio.h>
#include <string.h>
#include "lua.h"
#include "lauxlib.h"
#include "lualib.h"
#include "luaT.h"
#include "TH/TH.h"

void multiply (float* array, int m, int n, float *result, int m1, int n1)
{
    lua_State *L = luaL_newstate();
    luaL_openlibs( L );

    // loading the lua file
    if (luaL_loadfile(L, "tensor.lua") || lua_pcall(L, 0, 0, 0))
    {
        printf("error: %s \n", lua_tostring(L, -1));
    }

    // convert the c array to Torch7 specific structure representing a tensor
    THFloatStorage* storage =  THFloatStorage_newWithData(array, m*n);
    THFloatTensor* tensor = THFloatTensor_newWithStorage2d(storage, 0, m, n, n, 1);
    luaT_newmetatable(L, "torch.FloatTensor", NULL, NULL, NULL, NULL);

    // load the lua function hi_tensor
    lua_getglobal(L, "hi_tensor");
    if(!lua_isfunction(L,-1))
    {
        lua_pop(L,1);
    }

    //this pushes data to the stack to be used as a parameter
    //to the hi_tensor function call
    luaT_pushudata(L, (void *)tensor, "torch.FloatTensor");

    // call the lua function hi_tensor
    if (lua_pcall(L, 1, 1, 0) != 0)
    {
        printf("error running function `hi_tensor': %s \n", lua_tostring(L, -1));
    }

    // get results returned from the lua function hi_tensor
    THFloatTensor* z = luaT_toudata(L, -1, "torch.FloatTensor");
    lua_pop(L, 1);
    THFloatStorage *storage_res =  z->storage;
    result = storage_res->data;

    return ;
}

Then I build and test:

luajit -b tensor.lua tensor.o

gcc -w -c -Wall -Wl,-E -fpic cluaf.c -lluajit -lluaT -lTH -lm -ldl -L /usr/local/lib

gcc -shared cluaf.o tensor.o -L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl -Wl,-E -o libcluaf.so

gcc -L. -Wall -o test main.c -lcluaf

./test

Output:

Hi from lua
 1.0000  0.2000
 0.2000  5.3000
[torch.FloatTensor of dimension 2x2]

c result 2.000000 
c result 0.400000 
c result 0.400000 
c result 10.60000

So far so good. But when I try to use the shared library from Python, it breaks.

test.py

from ctypes import byref, cdll, c_int
import ctypes
import numpy as np
import cython

l = cdll.LoadLibrary('absolute_path_to_so/libcluaf.so')

a = np.arange(4, dtype=np.float64).reshape((2,2))
b = np.arange(4, dtype=np.float64).reshape((2,2))

l.multiply.argtypes = [ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int,     ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int]
a_list = []
b_list = []

for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        a_list.append(a[i][j])

for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        b_list.append(b[i][j])

arr_a = (ctypes.c_float * len(a_list))()
arr_b = (ctypes.c_float * len(b_list))()

l.multiply(arr_a, ctypes.c_int(2), ctypes.c_int(2), arr_b, ctypes.c_int(2), ctypes.c_int(2))
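A side note on the script above, separate from the loading error: arr_a and arr_b are allocated but the values collected in a_list and b_list are never copied into them, and the NumPy arrays are float64 while the C function takes float*. The following is a minimal sketch of two ways to hand 32-bit floats to a float* parameter using only the standard library. The helper names flatten_to_c_floats and share_as_c_floats are made up for illustration, and array.array stands in for a float32 NumPy array (with NumPy, ndarray.ctypes.data_as would be the usual route).

```python
import array
import ctypes

FloatPtr = ctypes.POINTER(ctypes.c_float)

def flatten_to_c_floats(rows):
    """Copy a nested row-major sequence into a new C float array."""
    flat = [float(v) for row in rows for v in row]
    return (ctypes.c_float * len(flat))(*flat)

def share_as_c_floats(buf):
    """Wrap a writable 32-bit float buffer as a C float array without copying."""
    return (ctypes.c_float * len(buf)).from_buffer(buf)

# Variant 1: copy the values (what the loops in the script were building up to).
copied = flatten_to_c_floats([[1.0, 0.5], [0.25, 4.0]])

# Variant 2: share memory, so C code writing through the pointer is visible here.
buf = array.array("f", [1.0, 0.5, 0.25, 4.0])
shared = share_as_c_floats(buf)
ptr = ctypes.cast(shared, FloatPtr)   # what a float* parameter would receive
ptr[0] = 8.0                          # simulate the C side writing a result

print(list(copied), buf[0])
```

Either `copied` or `shared` can be passed directly where argtypes declares POINTER(c_float), since ctypes converts arrays of the matching element type to pointers.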

I run:

python test.py

The output is:

error: error loading module 'libpaths' from file '/usr/local/lib/lua/5.1/libpaths.so':
    /usr/local/lib/lua/5.1/libpaths.so: undefined symbol: lua_gettop

I searched for this error here and elsewhere on the web. The suggestions are either (1) to add -Wl,-E to export symbols, or (2) to add the dependencies when linking. I have done both: (1) I pass -Wl,-E, but it does not seem to have any effect, and (2) I link the dependencies (-L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl).

The Python test fails not when the shared library is loaded, but when require 'torch' is executed inside Lua. That is what distinguishes this case from the others I found.

luajit.so defines the symbol lua_gettop (run nm /usr/local/lib/luajit.so to see it), and lua.h declares LUA_API int (lua_gettop) (lua_State *L);

My guess is that everything works when the C code is compiled to a binary because all the symbols in lua.h are found, whereas with the shared library lua_gettop is not picked up from luajit.so (I don't know why).

www.luajit.org/running.html says: 'On most ELF-based systems (e.g. Linux) you need to explicitly export the global symbols when linking your application, e.g. with: -Wl,-E. require() tries to load embedded bytecode data from exported symbols (in *.exe or lua51.dll on Windows) and from shared libraries in package.cpath.'

package.cpath and package.path are:

./?.so;/usr/local/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/loadall.so

./?.lua;/usr/local/share/luajit-2.0.2/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua

Here is what nm libcluaf.so returns:

00000000002020a0 B __bss_start
00000000002020a0 b completed.6972
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000a50 t deregister_tm_clones
0000000000000ac0 t __do_global_dtors_aux
0000000000201dd8 t __do_global_dtors_aux_fini_array_entry
0000000000202098 d __dso_handle
0000000000201de8 d _DYNAMIC
00000000002020a0 D _edata
00000000002020a8 B _end
0000000000000d28 T _fini
0000000000000b00 t frame_dummy
0000000000201dd0 t __frame_dummy_init_array_entry
0000000000000ed0 r __FRAME_END__
0000000000202000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000000918 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000201de0 d __JCR_END__
0000000000201de0 d __JCR_LIST__
                 w _Jv_RegisterClasses
                 U lua_getfield
0000000000000d99 R luaJIT_BC_tensor
                 U luaL_loadfile
                 U luaL_newstate
                 U luaL_openlibs
                 U lua_pcall
                 U lua_settop
                 U luaT_newmetatable
                 U lua_tolstring
                 U luaT_pushudata
                 U luaT_toudata
                 U lua_type
0000000000000b35 T multiply
                 U printf@@GLIBC_2.2.5
0000000000000a80 t register_tm_clones
                 U THFloatStorage_newWithData
                 U THFloatTensor_newWithStorage2d
00000000002020a0 d __TMC_END__

Thanks in advance.

Recommended answer

On Linux, Lua modules don't link against the Lua library directly; they expect the Lua API functions to be already loaded and visible. Usually the interpreter provides them by exporting its symbols via the -Wl,-E linker flag. That flag only works for symbols in executables, not in shared libraries. For shared libraries there is something similar: the RTLD_GLOBAL flag of the dlopen function. By default, all shared libraries listed on the compiler command line are loaded using RTLD_LOCAL instead, but fortunately Linux reuses already opened library handles. So you can either:

Preload the Lua(JIT) library with RTLD_GLOBAL before it gets loaded automatically (which happens when you load libcluaf.so):

from ctypes import byref, cdll, c_int
import ctypes

lualib = ctypes.CDLL("libluajit-5.1.so", mode=ctypes.RTLD_GLOBAL)
l = cdll.LoadLibrary('absolute_path_to_so/libcluaf.so')
# ...
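One way to check whether the preload took effect is to ask the dynamic loader whether a Lua API symbol is visible process-wide. This is a minimal sketch, assuming Linux (or another platform where ctypes.CDLL(None) wraps dlopen(NULL)); the helper name is_symbol_global is made up for illustration and is not part of the original answer.

```python
import ctypes

def is_symbol_global(name):
    """Return True if `name` resolves in the process-wide symbol namespace."""
    # CDLL(None) corresponds to dlopen(NULL): the main program, the libraries
    # loaded at startup, and anything loaded later with RTLD_GLOBAL.
    process = ctypes.CDLL(None)
    # ctypes raises AttributeError for unknown symbols, which hasattr catches.
    return hasattr(process, name)

# malloc comes from the C library the interpreter itself is linked against.
print(is_symbol_global("malloc"))

# The symbol from the error message; the result depends on whether a Lua
# library has been loaded with RTLD_GLOBAL in this process.
print(is_symbol_global("lua_gettop"))
```

If lua_gettop is reported as not visible after libcluaf.so has been loaded, modules such as libpaths.so would be expected to fail with the same undefined-symbol error shown in the question.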

Or change the flags of the Lua(JIT) library handle afterwards, using the RTLD_NOLOAD flag of dlopen. This flag is not in POSIX, though, and you probably have to use C to do so. See e.g. here.
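On Python 3.3 or later the second option may not need C, because the os module exposes RTLD_NOLOAD on platforms that define it (the question uses Python 2.7, where these constants are not available in os). The sketch below is an assumption-laden illustration rather than the answer's own code: promote_to_global is a hypothetical helper name, and the library name passed to it has to refer to the same file that is already loaded.

```python
import ctypes
import os

def promote_to_global(libname):
    """Re-open an already loaded library with RTLD_GLOBAL.

    Returns False if the library is not currently loaded (or cannot be found),
    because RTLD_NOLOAD makes dlopen fail instead of loading it.
    """
    flags = os.RTLD_NOLOAD | os.RTLD_GLOBAL | os.RTLD_NOW
    try:
        ctypes.CDLL(libname, mode=flags)
    except OSError:
        return False
    return True

# Hypothetical usage, after libcluaf.so has pulled in LuaJIT as a dependency:
#   l = ctypes.CDLL("absolute_path_to_so/libcluaf.so")
#   promote_to_global("libluajit-5.1.so")

# A library that was never loaded is reported as such rather than loaded.
print(promote_to_global("libnot_loaded_in_this_process.so"))
```

Compared with preloading, this keeps the load order unchanged, at the cost of relying on a non-POSIX flag.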
