Exception 'cudaError_enum' thrown in cudaGetExportTable (CUDA runtime library)?

Views: 915. Posted: 2017/3/4 15:51:15. Tags: exception, cuda, mpi, icc


Problem description

I am debugging an MPI-based CUDA program with DDT. My code aborts when the CUDA runtime library (libcudart) throws an exception in the (undocumented) function cudaGetExportTable, which is called from cudaMalloc and cudaThreadSynchronize in my code. (UPDATE: using cudaDeviceSynchronize gives the same error.)

Why is libcudart throwing an exception (I am using the C API, not the C++ API) before I can detect it in my code with its cudaError_t return value or with CHECKCUDAERROR?

(I'm using CUDA 4.2 SDK for Linux.)

Output:

    Process 9: terminate called after throwing an instance of 'cudaError_enum'
    Process 9: terminate called recursively

    Process 20: terminate called after throwing an instance of 'cudaError'
    Process 20: terminate called recursively

My code:

    cudaThreadSynchronize();
    CHECKCUDAERROR("cudaThreadSynchronize()");

Other code fragment:

    const size_t t;  // from argument to function
    void* p = NULL;
    const cudaError_t r = cudaMalloc(&p, t);
    if (r != cudaSuccess) {
        ERROR("cudaMalloc failed.");
    }

Partial backtrace:

    Process 9:
    cudaDeviceSynchronize()
    -> cudaGetExportTable()
       -> __cxa_throw

    Process 20:
    cudaMalloc()
    -> cudaGetExportTable()
       -> cudaGetExportTable()
          -> __cxa_throw

Memory debugging errors:

    Processes 0,2,4,6-9,15-17,20-21:
    Memory error detected in Malloc_cuda_gx (cudamalloc.cu:35):
    dmalloc bad admin structure list.

This line is the cudaMalloc code fragment shown above. Also:

    Processes 1,3,5,10-11,13-14,18-19,23:
    Memory error detected in vfprintf from /lib64/libc.so.6:
    dmalloc bad admin structure list.

Also, when running with 3 cores/GPUs per node instead of 4, dmalloc detects similar memory errors. Outside debug mode, however, the code runs perfectly fine with 3 GPUs per node (as far as I can tell).

Solution

Recompile with gcc.  (I was using icc to compile my code.)

When I do this, the exception still appears while debugging, but after continuing past it I get the real CUDA errors:

    Process 9: gadget_cuda_gx.cu:116: ERROR in gadget_cuda_gx.cu:919: CUDA ERROR:   cudaThreadSynchronize(): unspecified launch failure
    Process 20: cudamalloc.cu:38: ERROR all CUDA-capable devices are busy or unavailable, cudaMalloc failed to allocate 856792 bytes = 0.817101 Mb

Valgrind reveals no memory corruption or leaks in my code (either compiling with gcc or icc), but does find a few leaks in libcudart.

UPDATE: Still not fixed.  Appears to be the same problem reported in answer #2 to this thread: cudaMemset fails on __device__ variable.  The runtime isn't working like it should, it seems...
