float vs. floatN


Question

Is there any advantage to using floatN instead of float in OpenCL?

For example:

float3 position;

float posX, posY, posZ;

Thanks.

Recommended answer

It depends on the hardware.

NVidia GPUs have a scalar architecture, so on them vectors offer little advantage over purely scalar code. Quoting the NVidia OpenCL Best Practices Guide:

"The CUDA architecture is a scalar architecture. Therefore, there is no performance benefit from using vector types and instructions. These should only be used for convenience. It is also in general better to have more work-items than fewer using large vectors."

With CPUs and ATI GPUs, you gain more from using vectors, because these architectures have vector instructions (though I've heard this may be different on the latest Radeons; I wish I still had a link to the article where I read that).

Quoting the ATI Stream OpenCL Programming Guide, for CPUs:

"The SIMD floating point resources in a CPU (SSE) require the use of vectorized types (float4) to enable packed SSE code generation and extract good performance from the SIMD hardware."

This article provides a performance comparison on ATI GPUs of a kernel written with vectors versus pure scalar types.
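To make the two styles concrete, here is a rough sketch in plain C rather than real OpenCL C, so that it compiles on its own. The `float4` struct is a stand-in for OpenCL's built-in type, the loop index `gid` stands in for `get_global_id(0)`, and the function names `saxpy_scalar` and `saxpy_float4` are made up for illustration. It says nothing about actual performance on any device; it only shows how the same work is split across work-items in each style.

```c
#include <stddef.h>

/* Stand-in for OpenCL's built-in float4 (an assumption for this sketch);
   in a real kernel the type and its component-wise operators are built in. */
typedef struct { float x, y, z, w; } float4;

/* Scalar style: one "work-item" per float element, n work-items in total. */
void saxpy_scalar(size_t n, float a, const float *x, float *y)
{
    for (size_t gid = 0; gid < n; ++gid)      /* gid ~ get_global_id(0) */
        y[gid] = a * x[gid] + y[gid];
}

/* Vector style: one "work-item" per float4, so n4 = n / 4 work-items.
   In OpenCL C the body would simply read: y[gid] = a * x[gid] + y[gid]; */
void saxpy_float4(size_t n4, float a, const float4 *x, float4 *y)
{
    for (size_t gid = 0; gid < n4; ++gid) {
        y[gid].x = a * x[gid].x + y[gid].x;
        y[gid].y = a * x[gid].y + y[gid].y;
        y[gid].z = a * x[gid].z + y[gid].z;
        y[gid].w = a * x[gid].w + y[gid].w;
    }
}
```

Both functions compute the same result. The scalar form yields more, smaller work-items, which is what the NVidia guide quoted above prefers, while the float4 form hands the compiler explicitly packed data, which is what the ATI guide asks for on SSE CPUs.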
