CUDA 5.0 dynamic parallelism error: ptxas fatal: unresolved extern function 'cudaLaunchDevice'
Problem description
I am using a Tesla K20 (compute capability 3.5) on Linux with CUDA 5. With a simple child kernel call, compilation fails with the error: unresolved extern function 'cudaLaunchDevice'.
My command line is as follows:
nvcc --compile -G -O0 -g -gencode arch=compute_35 , code=sm_35 -x cu -o fill.cu fill.o
I see cudadevrt.a in lib64. Do we need to add it, or what could be done to resolve this? Without the child kernel call everything works fine.
Recommended answer
To use dynamic parallelism, you must explicitly compile with relocatable device code enabled and link against the device runtime library. So your compilation command must include --relocatable-device-code=true (short form: -rdc=true), and the linking command (which you haven't shown us) should include -lcudadevrt.
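As a rough illustration of the answer above, here is a minimal sketch of a parent kernel launching a child kernel, with build commands in the header comment. The file name matches the question, but the kernel names (fillParent, fillChild), the sizes and the fill value are made up for this example and are not the asker's code. The -arch=sm_35 shorthand is used here in place of the asker's -gencode option.

```cuda
// fill.cu: minimal dynamic parallelism sketch (kernel names and sizes are illustrative).
//
// Two-step build (separate compile and link):
//   nvcc -arch=sm_35 --relocatable-device-code=true -c fill.cu -o fill.o
//   nvcc -arch=sm_35 fill.o -o fill -lcudadevrt
// One-step build:
//   nvcc -arch=sm_35 -rdc=true fill.cu -o fill -lcudadevrt

#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes one element.
__global__ void fillChild(int *data, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Parent kernel: a single thread launches the child grid from the device.
// This device-side launch is what requires the device runtime (cudadevrt).
__global__ void fillParent(int *data, int n, int value)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        fillChild<<<blocks, threads>>>(data, n, value);
    }
}

int main()
{
    const int n = 1024;
    const int value = 7;
    int *d_data = NULL;
    int h_data[n];

    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    fillParent<<<1, 1>>>(d_data, n, value);

    // The parent grid is not complete until its child grids have finished,
    // so this host-side synchronize also waits for fillChild.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_data);
        return 1;
    }

    cudaMemcpy(h_data, d_data, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);

    // Check that every element was written by the child kernel.
    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_data[i] != value) ++bad;
    }
    printf("%s\n", bad == 0 ? "ok" : "mismatch");
    return bad == 0 ? 0 : 1;
}
```

If -rdc=true is left out of the compile step, the device-side launch of fillChild should trigger the same unresolved extern 'cudaLaunchDevice' error from ptxas that the question reports.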
This procedure is described in detail in the "TOOLKIT SUPPORT FOR DYNAMIC PARALLELISM" section of the Dynamic Parallelism Programming Guide PDF.