Precision in Sum reduction kernel with floats
Problem Description
I am creating a routine that calls Nvidia's Sum Reduction kernel (reduce6), but when I compare the results between the CPU and the GPU I get an error that grows as the vector size increases:
Both the CPU and GPU reductions use floats.
Size: 1024 (Blocks : 1, Threads : 512)
Reduction on CPU: 508.1255188
Reduction on GPU: 508.1254883
Error: 6.0059137e-06
Size: 16384 (Blocks : 8, Threads : 1024)
Reduction on CPU: 4971.3193359
Reduction on GPU: 4971.3217773
Error: 4.9109825e-05
Size: 131072 (Blocks : 64, Threads : 1024)
Reduction on CPU: 49986.6718750
Reduction on GPU: 49986.8203125
Error: 2.9695415e-04
Size: 1048576 (Blocks : 512, Threads : 1024)
Reduction on CPU: 500003.7500000
Reduction on GPU: 500006.8125000
Error: 6.1249541e-04
Any idea about this error? Thanks.
Recommended Answer
Floating point addition is not necessarily associative.
This means that when you change the order of operations in a floating-point summation, you may get different results. Parallelizing a summation, by definition, changes the order of operations of the summation.
There are many ways to sum floating-point numbers, and each has accuracy benefits for different input distributions. Here's a decent survey.
Sequential summation in the given order is rarely the most accurate way to sum, so if that is what you are comparing against, don't expect it to match the tree-based summation used in a typical parallel reduction.