fp16 support in cuda thrust


Question

I am not able to find anything about fp16 support in the thrust cuda template library. Even the roadmap page has nothing about it: https://github.com/thrust/thrust/wiki/Roadmap

But I assume somebody has probably figured out how to overcome this problem, since fp16 support in cuda has been around for more than 6 months.

As of today, I rely heavily on thrust in my code and have templated nearly every class I use in order to ease fp16 integration. Unfortunately, absolutely nothing works out of the box for the half type, not even this simple sample code:

//STL
#include <iostream>
#include <cstdlib>

//Cuda
#include <cuda_runtime_api.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cuda_fp16.h>
#define T half // works when float is used instead

int main(int argc, char* argv[])
{
        thrust::device_vector<T> a(10,1.0f);
        float t = thrust::reduce( a.cbegin(),a.cend(),(float)0);
        std::cout<<"test = "<<t<<std::endl;
        return EXIT_SUCCESS;
}

This code does not compile, because it seems there is no implicit conversion from float to half or from half to float. However, cuda does seem to provide intrinsics that allow for an explicit conversion.

Why can't I simply overload the half and float constructors in some cuda header file, adding the previous intrinsics like this:

// (not legal C++: constructors cannot be added to an existing type,
//  and a constructor has no return value – shown only to illustrate the idea)
float::float( half a )
{
  return  __half2float( a ) ;
}

half::half( float a )
{
  return  __float2half( a ) ;
}

My question may seem basic, but I don't understand why I haven't found much documentation about it.

Thanks in advance.

Answer

The very short answer is that what you are looking for doesn't exist.

The slightly longer answer is that thrust is intended to work on fundamental and POD types only, and the CUDA fp16 half is not a POD type. It might be possible to write two custom classes (one for the host and one for the device) which implement all the required object semantics and arithmetic operators to work correctly with thrust, but it would be a significant effort to do so (and it would require writing or adapting an existing fp16 host library).

Note also that the current fp16 support exists only in device code, and only on compute capability 5.3 and newer devices. So unless you have a Tegra TX1, you can't use the fp16 library in device code anyway.

