Unresolved references using IFORT with nvcc and CUSP

Question

I have a program which I'm compiling like this:

(...) Some ifort *.f -c
nvcc -c src/bicgstab.cu -o bicgstab.o -I/home/ricardo/apps/cusp/cusplibrary
(...) Some more *.for -c
ifort *.o -L/usr/local/cuda-5.5/lib64 -lcudart -lcublas -lcusparse -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -openmp -o program

Everything worked fine until I added CUSP support, for which I have this wrapper (bicgstab.cu):

#include <cusp/csr_matrix.h>
#include <cusp/krylov/bicgstab.h>

#if defined(__cplusplus)
extern "C" {
#endif

void bicgstab_(int * device_I, int * device_J, float * device_V, float * device_x, float * device_b, int N, int NNZ){

    // *NOTE* raw pointers must be wrapped with thrust::device_ptr!
    thrust::device_ptr<int> wrapped_device_I(device_I);
    thrust::device_ptr<int> wrapped_device_J(device_J);
    thrust::device_ptr<float> wrapped_device_V(device_V);
    thrust::device_ptr<float> wrapped_device_x(device_x);
    thrust::device_ptr<float> wrapped_device_b(device_b);

    // use array1d_view to wrap the individual arrays
    // note: no `typename` here, since these typedefs are not inside a template
    typedef cusp::array1d_view< thrust::device_ptr<int> > DeviceIndexArrayView;
    typedef cusp::array1d_view< thrust::device_ptr<float> > DeviceValueArrayView;

    DeviceIndexArrayView row_indices (wrapped_device_I, wrapped_device_I + (N+1));
    DeviceIndexArrayView column_indices(wrapped_device_J, wrapped_device_J + NNZ);
    DeviceValueArrayView values (wrapped_device_V, wrapped_device_V + NNZ);
    DeviceValueArrayView x (wrapped_device_x, wrapped_device_x + N);
    DeviceValueArrayView b (wrapped_device_b, wrapped_device_b + N);

    // combine the three array1d_views into a csr_matrix_view
    typedef cusp::csr_matrix_view<DeviceIndexArrayView,
    DeviceIndexArrayView,
    DeviceValueArrayView> DeviceView;

    // construct a csr_matrix_view from the array1d_views
    DeviceView A(N, N, NNZ, row_indices, column_indices, values);

    // set stopping criteria:
    // iteration_limit = 100
    // relative_tolerance = 1e-5
    cusp::verbose_monitor<float> monitor(b, 100, 1e-5);

    // solve the linear system A * x = b with the BiCGSTAB method
    cusp::krylov::bicgstab(A, x, b, monitor);

}

#if defined(__cplusplus)
}
#endif

nvcc compiles the file and generates the object, but the last command, where I link everything together, produces a long list of unresolved-reference warnings:

ipo: warning #11021: unresolved __gxx_personality_v0
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTVSt9exception
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTVSt9bad_alloc
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZdlPv
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_guard_acquire
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSaIcEC1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSsC1EPKcRKSaIcE
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_guard_release
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSsD1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSaIcED1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_guard_abort
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSsC1ERKSs
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt13runtime_errorD2Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_call_unexpected
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt13runtime_errorC2ERKSs
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSsC1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNKSs5emptyEv
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNKSt13runtime_error4whatEv
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSsaSEPKc
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSspLEPKc
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSspLERKSs
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_begin_catch
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_end_catch
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNKSs5c_strEv
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNKSt9bad_alloc4whatEv
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt9bad_allocD2Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_allocate_exception
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_free_exception
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_throw
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt9exceptionD2Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZSt4cout
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSolsEf
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSolsEm
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSolsEPFRSoS_E
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZSt9terminatev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZStlsIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_St5_Setw
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSolsEPFRSt8ios_baseS0_E
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt9bad_allocD1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTISt9bad_alloc
        Referenced in bicgstab.o
ipo: warning #11021: unresolved __cxa_pure_virtual
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTVN10__cxxabiv120__si_class_type_infoE
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTISt9exception
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTISt13runtime_error
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZTVN10__cxxabiv117__class_type_infoE
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt8ios_base4InitC1Ev
        Referenced in bicgstab.o
ipo: warning #11021: unresolved _ZNSt8ios_base4InitD1Ev
        Referenced in bicgstab.o

I believe it's because ifort is adding or removing underscores, changing letter case, or something similar, because the file compiles correctly, and if I build the binary outside my program, just for testing, it works great.

Thanks a lot.

Recommended answer

ipo is fairly complicated when there are multiple files involved. It's actually rerunning the compiler on all modules at link time. I'm not an expert on this, but that sounds like something fairly difficult to wade through.

One possible option is to compile your CUDA code into a shared library (.so) and link against that. It should prevent the Intel compiler toolchain from trying to recompile and optimize against the code generated by nvcc/gcc. I think this is going to limit you to "single file optimizations". I don't know whether that will significantly affect your performance or not.

Using my example here, I would modify the compile commands as follows:

$ nvcc -Xcompiler="-fPIC" -shared bicgstab.cu -o bicgstab.so -I/home-2/robertc/misc/cusp/cusplibrary-master
$ ifort -c -fast bic.f90
$ ifort bic.o bicgstab.so -L/shared/apps/cuda/CUDA-v6.0.37/lib64 -lcudart  -o program
ipo: remark #11001: performing single-file optimizations
ipo: remark #11006: generating object file /tmp/ipo_ifortxEdpin.o
$

You don't indicate where in your compile process you are adding the -fast switch(es). If only on the ifort compile commands, I believe the above approach will work. If you also want/need it on the link command, then it appears that ifort wants to build an entirely statically linked executable (and do intermodule optimization...), which won't work using the above process.
