Errors that occur when I launch a kernel function in CUDA 5.5


Question

I installed CUDA 5.5 and am using Visual Studio 2010 Professional as my development environment. When I try to build the source code below, Visual Studio draws a red underline under the "<<<" of the kernel launch and reports the error "expression required". If anyone has seen the same behavior, please tell me how to resolve it.

Development environment--------------------------------------------------------------------------

         OS:Windows7 64bit
         Visual Studio 2010 Professional SP1
         CUDA 5.5

Phenomenon-----------------------------------------------------------------------------------------

↓ In the source code below, the "<<<" is underlined in red, although the underline appears only under the third "<". Hovering the mouse pointer over the red underline shows the error "expression required".

Source code---------------------------------------------------------------------------

#include <cuda_runtime.h>
#include <stdio.h> 
#include <math.h> 
#include <cuda.h> 

#define N 256


__global__ void matrix_vector_multi_gpu_1_1(float *A_d, float *B_d, float *C_d){
    int i,j;

    for(j=0;j<N;j++){
        A_d[j]=0.0F;
        for(i=0;i<N;i++){
            A_d[j]=A_d[j]+B_d[j*N+i]*C_d[i];
        }
    }
  }

int main(){
    int i,j;
    float A[N], B[N*N], C[N];
    float *A_d, *B_d, *C_d;

    dim3 blocks(1,1,1);
    dim3 threads(1,1,1);

    for(j=0;j<N;j++){
        for(i=0;i<N;i++){
            B[j*N+i]=((float)j)/256.0;
        }
    }

    for(j=0;j<N;j++){
        C[j]=1.0F;
    }

    cudaMalloc((void**)&A_d, N*sizeof(float));
    cudaMalloc((void**)&B_d, N*N*sizeof(float));
    cudaMalloc((void**)&C_d, N*sizeof(float));

    cudaMemcpy(A_d,A,N*sizeof(float),cudaMemcpyHostToDevice);
    cudaMemcpy(B_d,B,N*N*sizeof(float),cudaMemcpyHostToDevice);
    cudaMemcpy(C_d,C,N*sizeof(float),cudaMemcpyHostToDevice);

    matrix_vector_multi_gpu_1_1<<<blocks,threads>>>(A_d,B_d,C_d);

    cudaMemcpy(A,A_d,N*sizeof(float),cudaMemcpyDeviceToDevice);

    for(j=0;j<N;j++){
        printf("A[ %d ]=%f \n",j,A[j]);
    }
    getchar();
    cudaFree(A_d);
    cudaFree(B_d);
    cudaFree(C_d);
    return 0;
}


Recommended answer

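At the very least, change

cudaMemcpy(A,A_d,N*sizeof(float),cudaMemcpyDeviceToDevice);

to

cudaMemcpy(A,A_d,N*sizeof(float),cudaMemcpyDeviceToHost);

A is a host array, so the result has to be copied back from the device with cudaMemcpyDeviceToHost.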
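A wrong copy direction like this is easy to miss because the original code ignores the return value of every runtime call. The following is a minimal sketch of how such mistakes might be surfaced earlier. The CHECK macro and the scale kernel are hypothetical helpers introduced only for illustration and are not part of the original post; cudaGetErrorString, cudaGetLastError and cudaDeviceSynchronize are existing CUDA runtime API calls.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper macro (not in the original post): stops the program
// with a readable message when a CUDA runtime call does not succeed.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

#define N 256

// Hypothetical kernel used only to have something to launch.
__global__ void scale(float *v, float s) {
    // One thread per element; guard against running past the array.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) v[i] *= s;
}

int main() {
    float h[N];
    float *d = NULL;
    for (int i = 0; i < N; i++) h[i] = 1.0f;

    CHECK(cudaMalloc((void**)&d, N * sizeof(float)));
    CHECK(cudaMemcpy(d, h, N * sizeof(float), cudaMemcpyHostToDevice));

    scale<<<1, N>>>(d, 2.0f);
    CHECK(cudaGetLastError());        // reports errors detected at launch
    CHECK(cudaDeviceSynchronize());   // reports errors raised while the kernel ran

    // The destination is host memory, so the direction is DeviceToHost.
    CHECK(cudaMemcpy(h, d, N * sizeof(float), cudaMemcpyDeviceToHost));
    printf("h[0] = %f\n", h[0]);

    CHECK(cudaFree(d));
    return 0;
}
```

Whether a mismatched direction flag is actually reported as an error can depend on the platform and driver, so this is a diagnostic aid and does not replace passing the correct flag.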

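A few other suggestions:


  • Run some of the CUDA sample code to check that your CUDA installation is set up correctly.

  • Make sure your source file has the extension .cu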
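As a rough illustration of both suggestions, a small self-contained program like the one below could serve as a sanity check. It is a sketch, not one of the official CUDA samples: the file name check.cu and the noop kernel are assumptions made for this example. It lists the devices the runtime can see and performs one trivial kernel launch, which only compiles if the file is actually handled by nvcc.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical empty kernel: its only purpose is to exercise the <<<...>>>
// launch syntax, which is understood by nvcc but not by a plain C++ compiler.
__global__ void noop() {}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices found: %d\n", count);

    // Print the name and compute capability of each visible device.
    for (int i = 0; i < count; i++) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            printf("  device %d: %s (compute capability %d.%d)\n",
                   i, prop.name, prop.major, prop.minor);
        }
    }

    // Launch the empty kernel and wait for it to finish.
    noop<<<1, 1>>>();
    err = cudaDeviceSynchronize();
    printf("kernel launch: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

Saved as check.cu, it can be built from a command prompt with `nvcc check.cu -o check`. If it builds and runs while the editor still underlines "<<<", the underline most likely comes from Visual Studio's IntelliSense, which does not parse the CUDA launch syntax, and not from the compiler.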

After fixing the cudaMemcpyDeviceToDevice issue, I can compile and run your code, and the result is correct. Your code should have no problems that prevent it from compiling.
