Template __host__ __device__ calling host defined functions


Question


During implementation of CUDA code I often need some utility functions, which will be called from device and also from host code. So I declare these functions as __host__ __device__. This is OK, and possible device/host incompatibilities can be handled by #ifdef __CUDA_ARCH__.
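As an illustration of that pattern (a minimal sketch, not code from the question; the function name and behavior are made up for the example), `__CUDA_ARCH__` is defined only during the device-side compilation pass, so each side of the `#ifdef` can use facilities that exist only on that side:

```cuda
#include <cstdio>

// A utility callable from both host and device code. The __CUDA_ARCH__
// guard selects per-side behavior: the macro is defined only while the
// device compilation pass processes this function.
__host__ __device__
int clamp_to_warp(int x) {
#ifdef __CUDA_ARCH__
    // Device pass: keep it to plain arithmetic.
    return x < 32 ? x : 31;
#else
    // Host pass: host-only facilities such as stdio are allowed here.
    if (x >= 32) std::fprintf(stderr, "clamped %d to 31\n", x);
    return x < 32 ? x : 31;
#endif
}
```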


Problems come when the utility function is templated, e.g. by some functor type. If the template instance calls a __host__ function, I get this warning:

calling a __host__ function from a __host__ __device__ function is not allowed
      detected during instantiation of "int foo(const T &) [with T=HostObject]" 


The only solution I know is to define the function twice: once for device and once for host code, with different names (I cannot overload on __host__ __device__). But this means code duplication, and every other __host__ __device__ function that calls it must also be defined twice (even more code duplication).

A simplified example:

#include <cuda.h>
#include <iostream>

struct HostObject {
    __host__ 
    int value() const { return 42; }
};

struct DeviceObject {
    __device__ 
    int value() const { return 3; }
};

template <typename T> 
__host__ __device__ 
int foo(const T &obj) {
    return obj.value();
}

/*
template <typename T> 
__host__ 
int foo_host(const T &obj) {
    return obj.value();
}

template <typename T> 
__device__ 
int foo_device(const T &obj) {
    return obj.value();
}
*/

__global__ void kernel(int *data) {
    data[threadIdx.x] = foo(DeviceObject());
}

int main() {
    foo(HostObject());

    int *data;
    cudaMalloc((void**)&data, sizeof(int) * 64);
    kernel<<<1, 64>>>(data);
    cudaDeviceSynchronize();  // cudaThreadSynchronize() is deprecated
    cudaFree(data);
}


Warning is caused by the foo(HostObject()); call inside the main() function.


foo_host<> and foo_device<> are possible replacements for the problematic foo<>.


Is there a better solution? Can I prevent instantiation of foo() on the device side?

Answer


You cannot prevent instantiation of either half of a __host__ __device__ function template instantiation. If you instantiate the function by calling it on the host (device), the compiler will also instantiate the device (host) half.


The best you can do for your use case as of CUDA 7.0 is to suppress the warning using #pragma hd_warning_disable, as in the following example, and ensure that the function is not called incorrectly.

#include <iostream>
#include <cstdio>

#pragma hd_warning_disable
template<class Function>
__host__ __device__
void invoke(Function f)
{
  f();
}

struct host_only
{
  __host__
  void operator()()
  {
    std::cout << "host_only()" << std::endl;
  }
};

struct device_only
{
  __device__
  void operator()()
  {
    printf("device_only(): thread %d\n", threadIdx.x);
  }
};

__global__
void kernel()
{
  // use from device with device functor
  invoke(device_only());

  // XXX error
  // invoke(host_only());
}

int main()
{
  // use from host with host functor
  invoke(host_only());

  kernel<<<1,1>>>();
  cudaDeviceSynchronize();

  // XXX error
  // invoke(device_only());

  return 0;
}
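For reference, a minimal sketch of building and running the answer's example, assuming `nvcc` from the CUDA toolkit is on the PATH and the source is saved as `invoke.cu` (the file name is an assumption):

```shell
# Compile the mixed host/device example and run the resulting binary.
nvcc -o invoke invoke.cu
./invoke
```

Uncommenting either of the `XXX error` calls should reproduce the cross-execution-space diagnostic discussed above, since the pragma only suppresses the warning at the template's definition, not misuse at a call site.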
