CUDA: How to pass multiple duplicated arguments to CUDA Kernel

Views: 549 Posted: 2017/3/5 19:36:12 Tags: performance cuda gpu gpgpu


Question

I'm looking for an elegant way to pass multiple duplicated arguments to a CUDA kernel.

As we all know, each kernel argument is located on the stack of each CUDA thread. The arguments that the kernel passes to each thread may therefore be duplicated, with one copy sitting on every thread's stack.

I would like to minimize the number of duplicated arguments being passed.

To explain my concern, let's say my code looks like this:

   kernelFunction<<<gridSize, blockSize>>>(UINT imageWidth, UINT imageWidth, UINT imageStride, UINT numberOfElements, x, y, etc...)

The UINT imageWidth, UINT imageWidth, UINT imageStride, and UINT numberOfElements arguments are located on each thread's stack.

I'm looking for a trick to send fewer arguments and access the data from another source.

I was thinking about using constant memory, but since constant memory is located in global memory, I dropped the idea. Needless to say, the memory location should be fast.

Any help would be appreciated.
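For reference, the constant-memory idea mentioned in the question can be sketched as below. This is only an illustration under stated assumptions, not a recommended answer. The `ImageParams` struct, the `d_params` symbol, the `runKernel` helper, and the kernel body are all hypothetical names invented for this sketch; only `kernelFunction` and the field names come from the question. Two points from the CUDA programming guide are worth keeping in mind when reading it. Constant memory does reside in device memory, but reads go through a dedicated constant cache and are broadcast when all threads of a warp read the same address. On devices of compute capability 2.0 and higher, `__global__` function arguments are themselves passed to the device through constant memory, so plain kernel arguments may already behave much like this sketch.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical parameter bundle; the field names are borrowed from the question.
struct ImageParams {
    unsigned int imageWidth;
    unsigned int imageStride;
    unsigned int numberOfElements;
};

// A single copy in constant memory, visible to every thread of every launch.
__constant__ ImageParams d_params;

// Each thread reads the shared parameters instead of receiving them as arguments.
// The body is an arbitrary example: it writes the strided offset of element i.
__global__ void kernelFunction(unsigned int *out)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < d_params.numberOfElements) {
        unsigned int row = i / d_params.imageWidth;
        unsigned int col = i % d_params.imageWidth;
        out[i] = row * d_params.imageStride + col;
    }
}

// Host helper (hypothetical): uploads the parameters once, launches the kernel,
// and copies the result back. Error checking is omitted to keep the sketch short.
std::vector<unsigned int> runKernel(const ImageParams &h_params)
{
    std::vector<unsigned int> result(h_params.numberOfElements);
    const size_t bytes = result.size() * sizeof(unsigned int);

    unsigned int *d_out = nullptr;
    cudaMalloc(&d_out, bytes);

    // Copy the host struct into the __constant__ symbol before launching.
    cudaMemcpyToSymbol(d_params, &h_params, sizeof(ImageParams));

    const unsigned int blockSize = 128;
    const unsigned int gridSize =
        (h_params.numberOfElements + blockSize - 1) / blockSize;
    kernelFunction<<<gridSize, blockSize>>>(d_out);

    cudaMemcpy(result.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return result;
}
```

Calling `runKernel` with an `ImageParams` of `{4, 8, 12}` would fill a 12-element vector in which element `i` holds `(i / 4) * 8 + (i % 4)`. Whether this is measurably faster than passing the same values as ordinary kernel arguments is not something the sketch demonstrates; it would need profiling on the target device.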
