How to pass multiple duplicated arguments to a CUDA kernel

Views: 108 Published: 2020/10/13 1:42:18 Tags: performance cuda gpu gpgpu

Question

I'm looking for an elegant way to pass multiple duplicated arguments to a CUDA kernel.

As I understand it, each kernel argument is placed on the stack of every CUDA thread, so the arguments the kernel passes to each thread may end up duplicated in the memory of every per-thread stack.

I would like to minimize the number of duplicated arguments being passed.

To explain my concern, let's say my code looks like this:

   kernelFunction<<<gridSize, blockSize>>>(UINT imageWidth, UINT imageWidth, UINT imageStride, UINT numberOfElements, x, y, etc...)

The UINT imageWidth, UINT imageWidth, UINT imageStride, and UINT numberOfElements arguments are located on each thread's stack.

I'm looking for a trick to send fewer arguments and access the data from another source.

I was thinking about using constant memory, but since constant memory is located in global memory, I dropped the idea. Needless to say, the memory location should be fast.
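
For reference, the constant-memory option mentioned above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the question: the struct `ImageParams`, the symbol `c_params`, the helper `runExample`, the kernel body, and the image dimensions are all hypothetical, and `UINT` is assumed to be an alias for `unsigned int`. The values shared by every thread are copied once into a `__constant__` variable with `cudaMemcpyToSymbol`, so the kernel only takes the data pointer as an argument.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

typedef unsigned int UINT;  // assumption: UINT aliases unsigned int

// Hypothetical struct grouping the values that are identical for every thread.
struct ImageParams {
    UINT imageWidth;
    UINT imageStride;       // assumed to be the row pitch in elements
    UINT numberOfElements;
};

// A single copy in constant memory, readable by every thread.
__constant__ ImageParams c_params;

// Hypothetical kernel: only the data pointer is passed as an argument.
__global__ void kernelFunction(float *data)
{
    UINT idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= c_params.numberOfElements) return;
    UINT row = idx / c_params.imageWidth;
    UINT col = idx % c_params.imageWidth;
    data[row * c_params.imageStride + col] *= 2.0f;
}

// Fills a small image with 1.0f, doubles it on the GPU, and returns the
// last element. Error checking is omitted to keep the sketch short.
float runExample()
{
    const UINT rows = 48;  // made-up image height
    const ImageParams h_params = {64, 64, 64 * rows};
    std::vector<float> h_data(h_params.imageStride * rows, 1.0f);
    const size_t bytes = h_data.size() * sizeof(float);

    float *d_data = nullptr;
    cudaMalloc(&d_data, bytes);
    cudaMemcpy(d_data, h_data.data(), bytes, cudaMemcpyHostToDevice);

    // Copy the shared parameters into constant memory once, before the launch.
    cudaMemcpyToSymbol(c_params, &h_params, sizeof(ImageParams));

    const UINT blockSize = 256;
    const UINT gridSize = (h_params.numberOfElements + blockSize - 1) / blockSize;
    kernelFunction<<<gridSize, blockSize>>>(d_data);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_data.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return h_data.back();
}

int main()
{
    printf("last element = %f\n", runExample());
    return 0;
}
```

Whether this is actually faster than passing the values as plain kernel arguments would need to be measured on the target hardware.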
