How to allocate arrays of arrays in a structure with CUDA Fortran?

Views: 361. Posted: 2018/3/16 17:30:29. Tags: arrays, cuda, fortran, pgi

Problem description

With CUDA, I'm trying to allocate arrays in a structure, but I'm running into a problem and I don't know why. Here is a short program (stored in a file called struct.cuf) that reproduces it. I'm compiling with PGI version 16.10, using the following options: -O3 -Mcuda=cc60 -tp=x64 struct.cuf -o struct_out

    module structure

      type mytype
        integer :: alpha,beta,gamma
        real,dimension(:),pointer :: a
      end type mytype

      type mytypeDevice
        integer :: alpha,beta,gamma
        real,dimension(:),pointer,device :: a
      end type mytypeDevice

    end module structure

    program main
      use cudafor
      use structure

      type(mytype) :: T(3)
      type(mytypeDevice),device :: T_Device(3)

      ! For the host
      do i=1,3
        allocate(T(i)%a(10))
      end do
      T(1)%a=1; T(2)%a=2; T(3)%a=3

      ! For the device
      print *, 'Everything from now is ok'
      do i=1,3
        allocate(T_Device(i)%a(10))
      end do
      !do i=1,3
      !  T_Device(i)%a=T(i)%a
      !end do

    end program main

The output is:

    Everything from now is ok
    Segmentation fault

What am I doing wrong here?

The only working solution I have found is to store the values in separate arrays and transfer those to the GPU, but that is very heavy, especially if I use a lot of structures like mytype.

EDIT: The code has been modified to use Vladimir F's solution. If I remove the device attribute from the T_Device(3) declaration, the allocation seems to work, and so does assigning values (the commented lines below the allocation). But I need the device attribute on T_Device(3), because I'm going to use it in kernels.

Thanks!

Solution

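The problem here is how you have declared T_Device. With the device attribute on T_Device itself, the array of structures lives in device memory, so the host-side allocate has to write the pointer component into memory the host cannot access directly, which is consistent with the segmentation fault. To use host-side allocation, you first populate a host-memory copy of the device structure and then copy it to device memory. This:

    type(mytypeDevice) :: T_Device(3)

    do i=1,3
      allocate(T_Device(i)%a(10))
    end do

will work correctly. This is a very standard design pattern in C++-based CUDA code, and the principle here is identical.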
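
Putting the two steps of the answer together, the whole pattern might look roughly like the sketch below. It is untested and rests on a few assumptions that are not in the original post: the name T_Staging is hypothetical, the scalar fields are given placeholder values, and the final whole-array assignment T_Device = T_Staging is assumed to perform a host-to-device copy of the structures (scalars plus the descriptors of the device pointers). The per-component assignment T_Staging(i)%a = T(i)%a is the one the question reports as working once the device attribute is removed from the structure array.

```fortran
module structure
  implicit none

  type mytype
    integer :: alpha, beta, gamma
    real, dimension(:), pointer :: a
  end type mytype

  type mytypeDevice
    integer :: alpha, beta, gamma
    real, dimension(:), pointer, device :: a
  end type mytypeDevice

end module structure

program main
  use cudafor
  use structure
  implicit none

  type(mytype)               :: T(3)          ! host data
  type(mytypeDevice)         :: T_Staging(3)  ! hypothetical name: host-resident copy of the device structure
  type(mytypeDevice), device :: T_Device(3)   ! device-resident array, intended for use in kernels
  integer :: i

  ! Fill the host structures (placeholder values for the scalars).
  do i = 1, 3
    allocate(T(i)%a(10))
    T(i)%a     = real(i)
    T(i)%alpha = 0
    T(i)%beta  = 0
    T(i)%gamma = 0
  end do

  ! Step 1: populate the host-resident copy of the device structure.
  ! The component a has the device attribute, so allocate reserves
  ! device memory, while the structure itself stays in host memory.
  do i = 1, 3
    allocate(T_Staging(i)%a(10))
    T_Staging(i)%a     = T(i)%a      ! host-to-device copy of the data
    T_Staging(i)%alpha = T(i)%alpha
    T_Staging(i)%beta  = T(i)%beta
    T_Staging(i)%gamma = T(i)%gamma
  end do

  ! Step 2: copy the structures to device memory.
  ! Assumption: derived-type assignment from a host array to a device
  ! array is accepted by the compiler and copies the pointer descriptors.
  T_Device = T_Staging

  ! ... kernels taking T_Device as an argument would be launched here ...

  ! Release the device arrays through the host-resident copy,
  ! then the host arrays.
  do i = 1, 3
    deallocate(T_Staging(i)%a)
    deallocate(T(i)%a)
  end do

end program main
```

The staging array has to stay alive for as long as the device data is needed, since in this sketch it is the only host-side handle through which the member arrays can be deallocated or copied back.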
