Passing an array within a structure in CUDAfy

Problem description

Using VS 2012, .NET 4.5, 64-bit and CUDAfy 1.12, I have the following proof of concept:

using System;
using System.Runtime.InteropServices;
using Cudafy;
using Cudafy.Host;
using Cudafy.Translator;

namespace Test
{
[Cudafy(eCudafyType.Struct)]
[StructLayout(LayoutKind.Sequential)]
public struct ChildStruct
{
    [MarshalAs(UnmanagedType.LPArray)]
    public float[] FArray;
    public long FArrayLength;
}

[Cudafy(eCudafyType.Struct)]
[StructLayout(LayoutKind.Sequential)]
public struct ParentStruct
{
    public ChildStruct Child;
}

public class Program
{
    [Cudafy]
    public static void KernelFunction(GThread gThread, ParentStruct parent)
    {
        long length = parent.Child.FArrayLength;
    }

    public static void Main(string[] args)
    {
        var module = CudafyTranslator.Cudafy(
          ePlatform.x64, eArchitecture.sm_35,
          new[] {typeof(ChildStruct), typeof(ParentStruct), typeof(Program)});
        var dev = CudafyHost.GetDevice();
        dev.LoadModule(module);

        float[] hostFloat = new float[10];
        for (int i = 0; i < hostFloat.Length; i++) { hostFloat[i] = i; }

        ParentStruct parent = new ParentStruct
        {
            Child = new ChildStruct
            {
                FArray = dev.Allocate(hostFloat),
                FArrayLength = hostFloat.Length
            }
        };

        dev.Launch(1, 1, KernelFunction, parent);

        Console.ReadLine();
    }
}
}

When the program runs, I get the following error at the dev.Launch call:

Type 'Test.ParentStruct' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.

If I remove the float array from the ChildStruct, it works as expected.

Having worked with C, C++/CLI and CUDA C in the past, I am aware of the nature of the error. Some solutions to this error suggest setting the struct size manually using the Size parameter of MarshalAs, but that is not possible here because of the variety of types within the struct.
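To make the nature of the error concrete, here is a minimal sketch that needs neither CUDAfy nor a GPU. It assumes the exception comes from the CLR marshaler being asked for the unmanaged size of the struct (for example via Marshal.SizeOf); whether CUDAfy calls exactly that method is an assumption. The type names ArrayChild, ArrayParent, PointerChild, PointerParent and the helper TrySizeOf are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Same shape as ChildStruct in the question: a managed array field.
[StructLayout(LayoutKind.Sequential)]
public struct ArrayChild
{
    [MarshalAs(UnmanagedType.LPArray)]
    public float[] FArray;
    public long FArrayLength;
}

[StructLayout(LayoutKind.Sequential)]
public struct ArrayParent
{
    public ArrayChild Child;
}

// Hypothetical alternative: carry a raw pointer instead of a managed array.
[StructLayout(LayoutKind.Sequential)]
public struct PointerChild
{
    public IntPtr FArray;      // would correspond to float* on the native side
    public long FArrayLength;
}

[StructLayout(LayoutKind.Sequential)]
public struct PointerParent
{
    public PointerChild Child;
}

public static class MarshalCheck
{
    // Returns the unmanaged size of the type, or -1 if the marshaler
    // cannot compute a layout for it.
    public static int TrySizeOf(Type type)
    {
        try
        {
            return Marshal.SizeOf(type);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message);
            return -1;
        }
    }

    public static void Main()
    {
        // The array-bearing struct is expected to be rejected by the marshaler.
        Console.WriteLine(TrySizeOf(typeof(ArrayParent)));

        // The pointer-bearing struct is blittable, so a size can be computed.
        Console.WriteLine(TrySizeOf(typeof(PointerParent)));
    }
}
```

The point of the sketch is that a struct holding an IntPtr has a fixed unmanaged layout, while one holding a float[] field marked LPArray does not. That is why the generated .cu file can use float * while the CLR side refuses to marshal the struct.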

I looked at the generated .cu file and it is generating the float array as a float * which is what I expected.

Is there a way to pass an array within a struct to the kernel? If there isn't, what is the next-best alternative? This problem doesn't exist in CUDA C; it exists only because we are marshaling from the CLR.

Solution

I spent a good deal of time reading the CUDAfy source code to see whether there is a solution to this problem.

CUDAfy tries to make things very simple for .NET developers and shields them from IntPtr and other pointer concepts. However, that level of abstraction makes it very hard to think of an answer to this problem without a major refactor of the way the library works.

Not being able to send a float array within a struct is a showstopper. I ended up using P/Invoke against the CUDA Runtime and not using CUDAfy.
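For readers wondering what the P/Invoke route can look like, here is a rough sketch, not the code actually used. It covers only allocating a device buffer and storing its address in a blittable struct. It assumes the CUDA runtime DLL is named cudart64_55 (the name depends on the installed toolkit version) and that a CUDA-capable device is present. The names DeviceChildStruct, CudaRuntime and Example are made up for illustration. Launching the kernel is not shown, since that typically needs a small native wrapper or the driver API.

```csharp
using System;
using System.Runtime.InteropServices;

// Blittable counterpart of ChildStruct: the array is carried as a device address.
[StructLayout(LayoutKind.Sequential)]
public struct DeviceChildStruct
{
    public IntPtr FArray;      // float* in device memory
    public long FArrayLength;
}

public static class CudaRuntime
{
    // Assumption: runtime DLL name for one particular toolkit version.
    private const string Dll = "cudart64_55";

    public const int cudaMemcpyHostToDevice = 1;

    [DllImport(Dll)]
    public static extern int cudaMalloc(out IntPtr devPtr, UIntPtr size);

    [DllImport(Dll)]
    public static extern int cudaMemcpy(IntPtr dst, float[] src, UIntPtr count, int kind);

    [DllImport(Dll)]
    public static extern int cudaFree(IntPtr devPtr);
}

public static class Example
{
    // cudaSuccess is 0; any other value is treated as a failure.
    private static void Check(int status, string call)
    {
        if (status != 0)
            throw new InvalidOperationException(call + " failed with cudaError " + status);
    }

    // Copies a host array to the device and returns a struct describing it.
    public static DeviceChildStruct Upload(float[] host)
    {
        UIntPtr bytes = (UIntPtr)(ulong)(host.Length * sizeof(float));
        IntPtr devPtr;
        Check(CudaRuntime.cudaMalloc(out devPtr, bytes), "cudaMalloc");
        Check(CudaRuntime.cudaMemcpy(devPtr, host, bytes,
            CudaRuntime.cudaMemcpyHostToDevice), "cudaMemcpy");
        return new DeviceChildStruct { FArray = devPtr, FArrayLength = host.Length };
    }

    public static void Main()
    {
        float[] hostFloat = new float[10];
        for (int i = 0; i < hostFloat.Length; i++) { hostFloat[i] = i; }

        DeviceChildStruct child = Upload(hostFloat);

        // The struct now has a fixed unmanaged layout and could be passed
        // by value to a native kernel launcher (not shown here).
        Console.WriteLine(Marshal.SizeOf(typeof(DeviceChildStruct)));

        Check(CudaRuntime.cudaFree(child.FArray), "cudaFree");
    }
}
```

Keeping the device address as an IntPtr is the design choice that sidesteps the marshaling error: the CLR only ever sees a pointer-sized integer and a long, never a managed array inside the struct.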
