How to enable CUDA 7.0+ per-thread default stream in Visual Studio 2013?
Problem description
I followed the method described in "GPU Pro Tip: CUDA 7 Streams Simplify Concurrency" and tested it in VS2013 with CUDA 7.5. The multi-stream example worked, but the multi-threading example did not give the expected result. The code is below:
#include <pthread.h>
#include <cstdio>
#include <cmath>
#define CUDA_API_PER_THREAD_DEFAULT_STREAM
#include "cuda.h"

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrt(pow(3.14159, i));
    }
}

void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));
    kernel<<<1, 64>>>(data, N);
    cudaStreamSynchronize(0);
    return NULL;
}

int main()
{
    const int num_threads = 8;
    pthread_t threads[num_threads];
    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, launch_kernel, 0)) {
            fprintf(stderr, "Error creating thread\n");
            return 1;
        }
    }
    for (int i = 0; i < num_threads; i++) {
        if (pthread_join(threads[i], NULL)) {
            fprintf(stderr, "Error joining thread\n");
            return 2;
        }
    }
    cudaDeviceReset();
    return 0;
}
I also tried adding the macro CUDA_API_PER_THREAD_DEFAULT_STREAM under CUDA C/C++ -> Host -> Preprocessor Definitions, but the result was the same. The timeline generated by the profiler is as below:
Do you have any idea what is happening here? Many thanks in advance.
Update: concurrency may happen when I add a cudaFree call, as shown below. Is this because of a lack of synchronization?
void *launch_kernel(void *dummy)
{
    float *data;
    cudaMalloc(&data, N * sizeof(float));
    kernel<<<1, 64>>>(data, N);
    cudaFree(data); // Concurrency may happen when I add this line
    cudaStreamSynchronize(0);
    return NULL;
}
compiled with:
nvcc -arch=sm_30 --default-stream per-thread -lpthreadVC2 kernel.cu -o kernel.exe
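For comparison, here is a minimal sketch that does not rely on the macro or the compiler flag at all. It launches into the runtime's `cudaStreamPerThread` handle explicitly, so each host thread uses its own stream regardless of how the file was compiled. The referenced blog post also notes that `#define CUDA_API_PER_THREAD_DEFAULT_STREAM` has no effect inside a .cu file compiled by nvcc, because nvcc implicitly includes cuda_runtime.h at the top of the translation unit before the define is seen. The `CHECK` macro is a hypothetical helper added for illustration, and `std::thread` is swapped in for pthreads only to keep the sketch self-contained. This is not a confirmed fix for the behavior described above; whether kernels actually overlap still depends on the driver and hardware.

```cuda
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): abort on any CUDA error
// so that a failed call is not mistaken for a lack of concurrency.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

const int N = 1 << 20;

__global__ void kernel(float *x, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x) {
        x[i] = sqrt(pow(3.14159, i));
    }
}

void launch_kernel()
{
    float *data = NULL;
    CHECK(cudaMalloc(&data, N * sizeof(float)));
    // Launch into the calling thread's own stream explicitly,
    // instead of relying on the default stream being per-thread.
    kernel<<<1, 64, 0, cudaStreamPerThread>>>(data, N);
    CHECK(cudaGetLastError());
    // Wait only for this thread's stream before freeing the buffer.
    CHECK(cudaStreamSynchronize(cudaStreamPerThread));
    CHECK(cudaFree(data));
}

int main()
{
    const int num_threads = 8;
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; i++) {
        threads.emplace_back(launch_kernel);
    }
    for (auto &t : threads) {
        t.join();
    }
    CHECK(cudaDeviceReset());
    return 0;
}
```

Synchronizing on `cudaStreamPerThread` before `cudaFree` keeps the ordering explicit. In the original code, `cudaStreamSynchronize(0)` refers to the legacy default stream unless per-thread default streams are actually enabled at compile time.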