CUDA float2 coalescing

Views: 207 Published: 2020/10/13 1:31:41 Tag: cuda

Problem description

I'm having trouble coalescing reads when using the float2 datatype in CUDA.

I've tried to make a simple example to run in the Visual Profiler, but it always reports non-coalesced reads. If anyone could shed some light on this I would be really grateful, thanks.

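#include <stdio.h>
#include <cuda_runtime_api.h>

// Each thread loads one float2 (8 bytes), overwrites x, and stores the result.
__global__ void kernel(float2 *in, float2 *out) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float2 d = in[idx];
    d.x = 100.f;

    out[idx] = d;
}

int main() {
    const int dataSize = 32;
    float2 *in;
    cudaMalloc((void**)&in, dataSize * sizeof(float2));

    float2 *out;
    cudaMalloc((void**)&out, dataSize * sizeof(float2));

    // One block of 32 threads: a single warp, one float2 per thread.
    kernel<<<1, 32>>>(in, out);

    // Wait for the kernel to finish and report any launch or runtime error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));

    cudaFree(in);
    cudaFree(out);
    return err == cudaSuccess ? 0 : 1;
}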
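
To make the access pattern in the kernel above concrete, here is a small host-side sketch of the address arithmetic. It is not part of the original question: `Float2Like`, `byteOffset`, and `segmentsTouched` are hypothetical names, and the 128-byte segment size used in the note below is an assumption that depends on the GPU generation. It only shows which bytes a warp touches when thread `idx` reads `in[idx]`, assuming the array base is aligned to the segment size; how a given profiler classifies that pattern is a separate matter.

```cpp
#include <cstddef>

// Host-side stand-in for CUDA's float2 (two floats, 8-byte aligned).
// Hypothetical type, used only so this sketch compiles without the CUDA headers.
struct alignas(8) Float2Like {
    float x;
    float y;
};

// Byte offset, relative to the array base, of the element read by thread `idx`
// when the kernel executes `in[idx]`.
inline std::size_t byteOffset(std::size_t idx) {
    return idx * sizeof(Float2Like);
}

// Number of aligned segments of `segmentBytes` bytes touched by threads
// 0 .. threads-1, assuming the array base itself is aligned to `segmentBytes`.
inline std::size_t segmentsTouched(std::size_t threads, std::size_t segmentBytes) {
    if (threads == 0) {
        return 0;
    }
    // Last byte read by the last thread in the group.
    std::size_t lastByte = byteOffset(threads - 1) + sizeof(Float2Like) - 1;
    return lastByte / segmentBytes + 1;
}
```

With these definitions, `segmentsTouched(32, 128)` evaluates to 2: the 32 threads read 256 contiguous bytes, which span two 128-byte segments, while a half-warp of 16 threads stays within one.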
