CUDA runtime error (59): device-side assert triggered

Question

I have access to a Tesla K20c and am running ResNet50 on the CIFAR10 dataset. I then get the following error:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu line=265 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "main.py", line 109, in <module>
    train(loader_train, model, criterion, optimizer)
  File "main.py", line 54, in train
    optimizer.step()
  File "/usr/local/anaconda35/lib/python3.6/site-packages/torch/optim/sgd.py", line 93, in step
    d_p.add_(weight_decay, p.data)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:265

How can I resolve this error?

Recommended answer

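In general, when you encounter CUDA runtime errors, run your program again with the environment variable CUDA_LAUNCH_BLOCKING=1 set to obtain an accurate stack trace. CUDA kernels are launched asynchronously, so without this setting the error can surface at a later, unrelated call (here, optimizer.step()) rather than at the operation that actually triggered the assert.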
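As a minimal sketch of how the variable can be set: from a shell, the usual form is `CUDA_LAUNCH_BLOCKING=1 python main.py`. The snippet below shows an equivalent from inside the script, assuming it is placed at the very top of main.py before torch is imported, since the variable needs to be in place before CUDA is initialized.

```python
import os

# Set this before importing torch (i.e. before CUDA is initialized),
# otherwise it may have no effect. Assumption: this sits at the top of main.py.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch only after the variable has been set
# ... the rest of the training script follows unchanged
```

Synchronous launches slow training down noticeably, so this is best treated as a debugging aid and removed once the failing operation has been found.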

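In your specific case, the target labels in your data were too high (or too low) for the specified number of classes. For a loss that takes class indices as targets, such as nn.CrossEntropyLoss, each target must lie in the range 0 to num_classes - 1.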
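One way to confirm this before training is to scan the labels for values outside the valid range. The helper below is a hypothetical sketch (the name `find_out_of_range_labels` and the sample labels are illustrative, not from the original post) and works on any iterable of integer labels.

```python
def find_out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes - 1]."""
    return sorted({int(y) for y in labels if not 0 <= int(y) < num_classes})


# Hypothetical example: labels numbered 1..10 instead of 0..9 for 10 classes.
sample_labels = [1, 5, 10, 3, 10]
print(find_out_of_range_labels(sample_labels, num_classes=10))
```

With a PyTorch target tensor, the same check can be applied by passing `targets.tolist()`. If offending values turn up, the usual remedies are to remap the labels (for instance subtracting 1 when they start at 1) or to make the model's final layer match the actual number of classes.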
