CUDA math API: difference between functions and intrinsics

Question

According to the CUDA math API, many mathematical functions, like sine and cosine, are implemented both in software (functions) and in hardware (intrinsics). These intrinsics probably use the Special Function Units of the GPU, so what is the point of the software implementation? Isn't it slower than the hardware implementation?

Recommended answer

A better question to ask is "what is the point of the intrinsics?".

The answer lies in Appendix D of the programming guide. The intrinsics for the transcendental, trigonometric, and special functions are faster, but they have more domain restrictions and generally lower accuracy than their software counterparts. For the primary purpose of the hardware (i.e. graphics), fast approximate functions for sin, cos, square root, reciprocal, etc. improve shader performance when ultimate mathematical accuracy is not critical. For some compute tasks the less accurate versions are also fine; for other applications the intrinsics may not be sufficient.
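To make the accuracy trade-off concrete, here is a minimal sketch that evaluates both `sinf` (the software function) and `__sinf` (the intrinsic) on the device and prints each result next to a double-precision host reference. The kernel name `sine_both`, the helper `compare_sines`, and the sample inputs are hypothetical and chosen only for illustration; the size of the differences you see will depend on your GPU and toolkit version. Build it without nvcc's `-use_fast_math` option, since that option makes `sinf` compile to `__sinf` and would hide the difference.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Evaluate the software function and the hardware intrinsic for each input.
__global__ void sine_both(const float* x, float* accurate, float* fast, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        accurate[i] = sinf(x[i]);    // software implementation
        fast[i]     = __sinf(x[i]);  // intrinsic: faster, lower accuracy
    }
}

// Hypothetical host helper: copies inputs to the device, runs the kernel,
// and copies both result arrays back to the host.
cudaError_t compare_sines(const float* h_x, float* h_accurate, float* h_fast, int n)
{
    float *d_x = nullptr, *d_accurate = nullptr, *d_fast = nullptr;
    size_t bytes = n * sizeof(float);

    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_accurate, bytes);
    cudaMalloc(&d_fast, bytes);
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    sine_both<<<blocks, threads>>>(d_x, d_accurate, d_fast, n);

    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(h_accurate, d_accurate, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_fast, d_fast, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_accurate);
    cudaFree(d_fast);

    // Reports the most recent error recorded by the runtime calls above, if any.
    return cudaGetLastError();
}

int main()
{
    // Illustrative inputs: small arguments and large ones, where the
    // intrinsic is expected to lose accuracy.
    const int n = 4;
    const float h_x[n] = {0.5f, 3.0f, 100.0f, 10000.0f};
    float h_accurate[n], h_fast[n];

    cudaError_t err = compare_sines(h_x, h_accurate, h_fast, n);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        double ref = std::sin(static_cast<double>(h_x[i]));  // host reference
        std::printf("x=%10.1f  sinf err=%.3e  __sinf err=%.3e\n",
                    h_x[i],
                    std::fabs(h_accurate[i] - ref),
                    std::fabs(h_fast[i] - ref));
    }
    return 0;
}
```

Comparing the two error columns for small and large arguments is a quick way to judge whether the intrinsic's accuracy and input range are acceptable for a given workload.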

Having both gives the informed programmer a choice: speed or accuracy.
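One way to express that choice in code is to make it a compile-time parameter, so each kernel launch states which variant it wants. The sketch below is only one possible pattern, not something prescribed by the CUDA documentation; the names `sine`, `apply_sine`, and `run_sine` and the boolean template parameter `Fast` are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Pick the intrinsic or the software function at compile time.
// Fast is a template constant, so the compiler can drop the unused branch.
template <bool Fast>
__device__ float sine(float x)
{
    return Fast ? __sinf(x) : sinf(x);
}

template <bool Fast>
__global__ void apply_sine(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = sine<Fast>(in[i]);
    }
}

// Hypothetical host helper: runs the chosen variant over a host array.
template <bool Fast>
cudaError_t run_sine(const float* h_in, float* h_out, int n)
{
    float *d_in = nullptr, *d_out = nullptr;
    size_t bytes = n * sizeof(float);

    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    apply_sine<Fast><<<blocks, threads>>>(d_in, d_out, n);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return cudaGetLastError();
}

int main()
{
    const int n = 3;
    const float h_in[n] = {0.1f, 1.0f, 2.5f};  // illustrative inputs
    float h_accurate[n], h_fast[n];

    if (run_sine<false>(h_in, h_accurate, n) != cudaSuccess ||
        run_sine<true>(h_in, h_fast, n) != cudaSuccess) {
        std::fprintf(stderr, "CUDA call failed\n");
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        std::printf("x=%.2f  accurate=%.8f  fast=%.8f\n",
                    h_in[i], h_accurate[i], h_fast[i]);
    }
    return 0;
}
```

Choosing per call site like this keeps accuracy-critical kernels on the software functions while letting tolerant ones use the intrinsics, in contrast to a build-wide switch.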
