cudaMemcpy() vs cudaMemcpyFromSymbol()


Question

I'm trying to figure out why cudaMemcpyFromSymbol() exists. It seems that everything the 'symbol' function can do, the non-symbol functions can do as well.

The symbol function seems to make it easy to move part of an array, or a single indexed element, but that could just as easily be done with the non-symbol function. I suspect the non-symbol approach runs faster because no symbol lookup is needed. (It is not clear whether the symbol lookup is done at compile time or at run time.)

Why use cudaMemcpyFromSymbol() rather than cudaMemcpy()?

Recommended answer

cudaMemcpyFromSymbol is the canonical way to copy from any statically defined variable in device memory.

cudaMemcpy can't be used directly to copy to or from a statically defined device variable because it requires a device pointer, and that pointer isn't known to host code at runtime. Therefore, an API call that can interrogate the device context's symbol table is required. There are two choices: cudaMemcpyFromSymbol, which does the symbol lookup and the copy in one operation, or cudaGetSymbolAddress, which returns an address that can be passed to cudaMemcpy. The former is probably more efficient if you only want to do one copy, the latter if you want to use the address multiple times in host code.
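To make the two choices concrete, here is a minimal sketch that reads the same statically defined device array both ways. The array d_values, the kernel fill_values, and the two helper functions are hypothetical names invented for this illustration; only the CUDA runtime calls come from the answer above.

```cuda
#include <cuda_runtime.h>

// Hypothetical statically defined device array (illustration only).
__device__ int d_values[4];

// Hypothetical kernel that fills the array so there is data to copy back.
__global__ void fill_values() {
    int i = threadIdx.x;
    if (i < 4) d_values[i] = 10 * (i + 1);
}

// Choice 1: symbol lookup and copy in a single runtime call.
bool copy_via_symbol(int *host_out) {
    fill_values<<<1, 4>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return false;
    return cudaMemcpyFromSymbol(host_out, d_values, 4 * sizeof(int),
                                0, cudaMemcpyDeviceToHost) == cudaSuccess;
}

// Choice 2: look up the device address once, then use plain cudaMemcpy.
// The pointer can be kept and reused for later copies.
bool copy_via_address(int *host_out) {
    fill_values<<<1, 4>>>();
    if (cudaDeviceSynchronize() != cudaSuccess) return false;
    void *dev_ptr = nullptr;
    if (cudaGetSymbolAddress(&dev_ptr, d_values) != cudaSuccess) return false;
    return cudaMemcpy(host_out, dev_ptr, 4 * sizeof(int),
                      cudaMemcpyDeviceToHost) == cudaSuccess;
}
```

Both helpers can be called from an ordinary main() with a four-element int buffer. The sketch assumes the file is compiled with nvcc and run on a machine with a CUDA-capable GPU. The fourth argument of cudaMemcpyFromSymbol is a byte offset into the symbol, which is how part of an array can be copied without computing a pointer by hand.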
