Is there any way to optimize sincos calls in CUDA?
Question
I'm writing a program in CUDA that makes a huge number of calls to the sincos() function, using double precision. I'm afraid this is one of the biggest bottlenecks in the code, and I cannot reduce the number of calls to the function.
Is there any decent approximation to sincos in CUDA, or in a library I can import? I am also quite concerned with accuracy, so the better the approximation is, the happier my code will be.
I've also thought about building a lookup table or approximating the values with their Taylor series, but I want some opinions before going down that road.
Recommended answer
A pretty fast and accurate sincos function is available in the CUDA math API. Just include math.h. Or use sincosf if single precision will work for you. (I'm aware that you said double precision in your question; I'm just pointing some things out.)
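As a rough illustration of the built-in call, the sketch below runs the double-precision sincos() from the CUDA math API in a kernel and compares the results with the host's sin() and cos(). The kernel name sincos_kernel, the helper max_abs_diff_vs_host, and the choice of test angles are assumptions made for this example only, and CUDA error checking is left out for brevity.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Each thread computes sin and cos of one angle with a single sincos() call.
__global__ void sincos_kernel(const double* angles, double* s, double* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        sincos(angles[i], &s[i], &c[i]);
    }
}

// Hypothetical host helper: runs the kernel on n sample angles and returns
// the largest absolute difference from the host's std::sin / std::cos.
// CUDA error checking is omitted for brevity.
double max_abs_diff_vs_host(int n)
{
    std::vector<double> h_angles(n), h_s(n), h_c(n);
    for (int i = 0; i < n; ++i) {
        h_angles[i] = 0.001 * i;  // arbitrary sample angles in radians
    }

    const size_t bytes = n * sizeof(double);
    double *d_angles = nullptr, *d_s = nullptr, *d_c = nullptr;
    cudaMalloc(&d_angles, bytes);
    cudaMalloc(&d_s, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_angles, h_angles.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    sincos_kernel<<<blocks, threads>>>(d_angles, d_s, d_c, n);

    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(h_s.data(), d_s, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_c.data(), d_c, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_angles);
    cudaFree(d_s);
    cudaFree(d_c);

    double max_diff = 0.0;
    for (int i = 0; i < n; ++i) {
        max_diff = std::fmax(max_diff, std::fabs(h_s[i] - std::sin(h_angles[i])));
        max_diff = std::fmax(max_diff, std::fabs(h_c[i] - std::cos(h_angles[i])));
    }
    return max_diff;
}
```

Calling max_abs_diff_vs_host with a few thousand angles gives a quick sanity check of the accuracy on a given GPU before deciding whether a hand-written approximation is worth the effort.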
If you can use sincospif instead of sincosf, @njuffa has worked his magic on it in another answer, which may interest you.
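To show where sincospif fits, here is a minimal sketch for the case where the angle is naturally a multiple of pi. sincospif(x, &s, &c) returns sin(pi*x) and cos(pi*x), so the multiplication by pi never has to be done in floating point. The twiddle-factor use case, the kernel name twiddle_kernel, and the helper compute_twiddles are assumptions chosen for illustration; the CUDA math API also provides a double-precision sincospi with the same calling pattern. Error checking is again omitted.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Fills re[k] = cos(-2*pi*k/n) and im[k] = sin(-2*pi*k/n) for k in [0, n).
__global__ void twiddle_kernel(float* re, float* im, int n)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k < n) {
        // The argument is in units of pi: sincospif(x, &s, &c)
        // stores sin(pi*x) in s and cos(pi*x) in c.
        sincospif(-2.0f * k / n, &im[k], &re[k]);
    }
}

// Hypothetical host helper: computes n twiddle factors on the GPU.
// CUDA error checking is omitted for brevity.
void compute_twiddles(int n, std::vector<float>& re, std::vector<float>& im)
{
    re.assign(n, 0.0f);
    im.assign(n, 0.0f);

    const size_t bytes = n * sizeof(float);
    float *d_re = nullptr, *d_im = nullptr;
    cudaMalloc(&d_re, bytes);
    cudaMalloc(&d_im, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    twiddle_kernel<<<blocks, threads>>>(d_re, d_im, n);

    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(re.data(), d_re, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(im.data(), d_im, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_re);
    cudaFree(d_im);
}
```

Whether this helps depends on the program: it only applies when the angles can be expressed as a fraction of pi without first being multiplied out.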
This question and this question may also interest you.