Difference on creating a CUDA context


Question

I have a program that uses three kernels. In order to get the speedups, I was doing a dummy memory copy to create a context as follows:

__global__ void warmStart(int* f)
{
    *f = 0;
}

which is launched before the kernels I want to time, as follows:

int *dFlag = NULL;
cudaMalloc( (void**)&dFlag, sizeof(int) );
warmStart<<<1, 1>>>(dFlag);
Check_CUDA_Error("warmStart kernel");
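Put together, a minimal self-contained sketch of this warm-up pattern might look as follows. The cudaEvent-based timing and the use of cudaDeviceSynchronize() in place of the Check_CUDA_Error helper are assumptions for illustration, not part of the original program:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Dummy kernel used only to force full context initialization.
__global__ void warmStart(int* f)
{
    *f = 0;
}

int main()
{
    int *dFlag = NULL;
    cudaMalloc((void**)&dFlag, sizeof(int));

    // Warm-up launch: triggers the deferred per-context allocations.
    warmStart<<<1, 1>>>(dFlag);
    cudaDeviceSynchronize();

    // Time a subsequent launch with CUDA events (here the same dummy
    // kernel stands in for the three real kernels being measured).
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    warmStart<<<1, 1>>>(dFlag);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dFlag);
    return 0;
}
```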

I also read about other, simpler ways to create a context, such as cudaFree(0) or cudaDeviceSynchronize(). But using these API calls gives worse times than using the dummy kernel.

The execution times of the program, after forcing the context, are 0.000031 seconds for the dummy kernel and 0.000064 seconds for both cudaDeviceSynchronize() and cudaFree(0). The times were obtained as the mean of 10 individual executions of the program.

Therefore, the conclusion I've reached is that launching a kernel initializes something that is not initialized when creating a context in the canonical way.

So, what's the difference between these two ways of creating a context, using a kernel and using an API call?

I ran the test on a GTX480, using CUDA 4.0 under Linux.

Answer

Each CUDA context has memory allocations that are required to execute a kernel but that are not required in order to synchronize, allocate memory, or free memory. The initial allocation of the context memory and the resizing of these allocations are deferred until a kernel requires these resources. Examples of these allocations include the local memory buffer, the device heap, and the printf heap.
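Some of these deferred per-context resources have user-visible size limits. A hedged sketch of how one might observe them, assuming the cudaDeviceGetLimit API introduced in CUDA 4.0 (the actual buffers are still only materialized on the first kernel launch, which is why a dummy kernel is a more complete warm-up than cudaFree(0) alone):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel whose first launch triggers the deferred allocations.
__global__ void empty() {}

int main()
{
    // The context itself is created lazily by the first runtime call.
    cudaFree(0);

    // Query the per-context limits backing the allocations the answer
    // mentions: local memory (stack), printf FIFO, and device malloc heap.
    size_t stack = 0, printfFifo = 0, mallocHeap = 0;
    cudaDeviceGetLimit(&stack,      cudaLimitStackSize);
    cudaDeviceGetLimit(&printfFifo, cudaLimitPrintfFifoSize);
    cudaDeviceGetLimit(&mallocHeap, cudaLimitMallocHeapSize);
    printf("stack=%zu printf=%zu heap=%zu\n", stack, printfFifo, mallocHeap);

    // First launch: the context-level buffers are allocated here.
    empty<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```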
