What are some possible causes of a segmentation fault when using the nvcc CUDA compiler?

Views: 39 Posted: 2022/1/10 16:09:19 Tags: compiler-construction cuda segmentation-fault nvcc

Problem description

I have a CUDA class, let's call it A, defined in a header file. I have written a test kernel which creates an instance of class A, which compiles fine and produces the expected result.

In addition, I have my main CUDA kernel, which also compiles fine and produces the expected result. However, when I add code to my main kernel to instantiate an instance of class A, the nvcc compiler fails with a segmentation fault.

Update:

To clarify, the segmentation fault happens during compilation, not when running the kernel. The line I am using to compile is:

`nvcc --cubin -arch compute_20 -code sm_20 -I<My include dir> --keep kernel.cu`

where <My include dir> is the path to a local directory containing some utility header files.

My question is, before I spend a lot of time isolating a minimal example that exhibits the behaviour (not trivial, given the relatively large code base): has anyone encountered a similar issue? Could the nvcc compiler crash outright if the kernel is either too long or uses too many registers?

If an issue such as register count can affect the compiler this way, then I will need to rethink how to implement my kernel to use fewer resources. It would also mean that trimming things down to a minimal example will likely make the problem disappear. If this is not even a possibility, however, I don't want to waste time on a dead end; I would rather cut things down to a minimal example and file a bug report with NVIDIA.
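One cheap way to probe the register-count theory before restructuring the kernel might be to rebuild with different per-thread register caps and see whether the crash comes and goes. The sketch below only prints candidate command lines for review; it does not run the compiler. The `nvcc_cmd` helper, the `INCLUDE_DIR` variable, and its `./include` default are hypothetical placeholders for the include path elided above, and the cap values are illustrative (63 is the per-thread register limit on compute capability 2.x).

```shell
#!/bin/sh
# Hypothetical helper: print the original nvcc command line with a register cap added.
# INCLUDE_DIR is an assumed placeholder for the include directory elided in the post.
INCLUDE_DIR="${INCLUDE_DIR:-./include}"

nvcc_cmd() {
    max_regs="$1"
    # --maxrregcount limits the number of registers each thread may use.
    printf 'nvcc --cubin -arch compute_20 -code sm_20 -I%s --maxrregcount %s --keep kernel.cu\n' \
        "$INCLUDE_DIR" "$max_regs"
}

# Illustrative caps only; print one command per cap.
for regs in 16 32 63; do
    nvcc_cmd "$regs"
done
```

If the segmentation fault shows up under some caps and not others, that would hint that register allocation is involved; if it appears under every cap, register pressure is probably not the trigger.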

Update:

As per the suggestion of @njuffa, I reran the compilation with the -v flag enabled. The output ends with the following:

#$ ptxas  -arch=sm_20 -m64 -v  "/path/to/kernel_ptx/kernel.ptx"  -o "kernel.cubin" 
Segmentation fault
# --error 0x8b --

This suggests the problem is due to the ptxas program, which is failing to generate a CUDA binary from the ptx file.
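Since `--keep` leaves the intermediate PTX on disk, the failing step can in principle be re-run on its own, without the rest of the nvcc pipeline. A minimal sketch, assuming the kept file is named `kernel.ptx` in the current directory (the real path is elided in the output above): it prints the same `ptxas` invocation at each optimization level so the commands can be tried one by one. The `ptxas_cmd` helper and the `kernel_O<N>.cubin` output names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a standalone ptxas command for one optimization level.
# The PTX file name is a placeholder; substitute the path shown in the nvcc -v output.
ptxas_cmd() {
    opt_level="$1"
    ptx_file="$2"
    # -O<N> sets the ptxas optimization level (the default is 3).
    printf 'ptxas -arch=sm_20 -m64 -v -O%s "%s" -o "kernel_O%s.cubin"\n' \
        "$opt_level" "$ptx_file" "$opt_level"
}

# Print one command per optimization level for manual testing.
for level in 0 1 2 3; do
    ptxas_cmd "$level" "kernel.ptx"
done
```

If ptxas only crashes at the higher levels, that would point at an optimization pass rather than at the PTX being malformed, and the kept `.ptx` file together with the exact failing command would probably make a more compact bug report than the full source tree.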
