How to enable CUDA 7.0+ per-thread default stream in Visual Studio 2013?

Question

I followed the method described in "GPU Pro Tip: CUDA 7 Streams Simplify Concurrency" and tested it in VS2013 with CUDA 7.5. The multi-stream example worked, but the multi-threading one did not give the expected result. The code is below:

#include <pthread.h>
#include <cstdio>
#include <cmath>

#define CUDA_API_PER_THREAD_DEFAULT_STREAM

#include "cuda.h"

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrt(pow(3.14159, i));
    }
}

void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));

    kernel<<<1, 64>>>(data, N);

    cudaStreamSynchronize(0);

    return NULL;
}

int main()
{
    const int num_threads = 8;

    pthread_t threads[num_threads];

    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, launch_kernel, 0)) {
            fprintf(stderr, "Error creating thread\n");
            return 1;
        }
    }

    for (int i = 0; i < num_threads; i++) {
        if (pthread_join(threads[i], NULL)) {
            fprintf(stderr, "Error joining thread\n");
            return 2;
        }
    }

    cudaDeviceReset();

    return 0;
}

I also tried adding the macro CUDA_API_PER_THREAD_DEFAULT_STREAM to CUDA C/C++ -> Host -> Preprocessor Definitions, but the result was the same. The timeline generated by the profiler was as follows (image not preserved):

Do you have any idea what is happening here? Many thanks in advance.

Solution

Update: the kernels may run concurrently once I add a "cudaFree" call, as shown below. Is this because of a lack of synchronization?

void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));

    kernel<<<1, 64>>>(data, N);
    cudaFree(data); // Concurrency may happen when I add this line
    cudaStreamSynchronize(0);

    return NULL;
}

compiled as follows:

nvcc -arch=sm_30 --default-stream per-thread -lpthreadVC2 kernel.cu -o kernel.exe
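
For comparison, here is a minimal sketch that does not depend on the macro or on the --default-stream flag at all: it names the per-thread stream explicitly through the runtime's cudaStreamPerThread handle. As far as I know, a #define inside a .cu file comes too late under nvcc, because nvcc includes cuda_runtime.h before the first line of the file, so passing the stream explicitly avoids that question. Assumptions in this sketch that are not in the original post: std::thread replaces pthreads (on Linux this would need -std=c++11), the float math functions sqrtf/powf are used, the helper run_in_threads is hypothetical, and the buffer is freed only after the stream has been synchronized.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrtf(powf(3.14159f, (float)i));
    }
}

// Returns true if every CUDA call made by this host thread succeeded.
bool launch_kernel()
{
    float *data = nullptr;
    if (cudaMalloc(&data, N * sizeof(float)) != cudaSuccess) return false;

    // Name the per-thread stream explicitly instead of relying on stream 0.
    kernel<<<1, 64, 0, cudaStreamPerThread>>>(data, N);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // Wait for this thread's own work, then release the buffer.
    ok = (cudaStreamSynchronize(cudaStreamPerThread) == cudaSuccess) && ok;
    ok = (cudaFree(data) == cudaSuccess) && ok;
    return ok;
}

// Hypothetical helper (not in the original post): runs launch_kernel in
// num_threads host threads and returns how many of them reported a failure.
int run_in_threads(int num_threads)
{
    std::vector<char> ok(num_threads, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; i++) {
        threads.emplace_back([&ok, i] { ok[i] = launch_kernel() ? 1 : 0; });
    }
    for (auto &t : threads) t.join();

    int failures = 0;
    for (char c : ok) failures += (c == 0);
    return failures;
}

int main()
{
    int failures = run_in_threads(8);
    std::printf("%d thread(s) failed\n", failures);
    cudaDeviceReset();
    return failures == 0 ? 0 : 1;
}
```

This should build with a plain "nvcc -arch=sm_30 kernel.cu -o kernel.exe". Whether the kernels actually overlap still has to be checked in the profiler timeline; the sketch only guarantees that each host thread submits to its own stream.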
