JIT in JCuda, loading multiple ptx modules


Problem description


In this question I described a problem loading ptx modules in JCuda. Following @talonmies's idea, I implemented a JCuda version of his solution, which links multiple ptx files and loads them as a single module. Here is the relevant part of the code:

import static jcuda.driver.JCudaDriver.cuLinkAddFile;
import static jcuda.driver.JCudaDriver.cuLinkComplete;
import static jcuda.driver.JCudaDriver.cuLinkCreate;
import static jcuda.driver.JCudaDriver.cuLinkDestroy;
import static jcuda.driver.JCudaDriver.cuModuleGetFunction;
import static jcuda.driver.JCudaDriver.cuModuleLoadData;

import jcuda.Pointer;
import jcuda.driver.CUfunction;
import jcuda.driver.CUjitInputType;
import jcuda.driver.CUlinkState;
import jcuda.driver.CUmodule;
import jcuda.driver.JITOptions;

public class JCudaTestJIT{

    private CUmodule module;
    private CUfunction functionKernel;

    public void prepareModule(){
        String ptxFileName4 = "file4.ptx";
        String ptxFileName3 = "file3.ptx";
        String ptxFileName2 = "file2.ptx";
        String ptxFileName1 = "file1.ptx";

        CUlinkState linkState = new CUlinkState();
        JITOptions jitOptions = new JITOptions();
        cuLinkCreate(jitOptions, linkState);

        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName4, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName3, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName2, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName1, jitOptions);

        long sizeOut = 32768;
        byte[] image = new byte[32768];

        Pointer cubinOut = Pointer.to(image);

        cuLinkComplete(linkState, cubinOut, (new long[]{sizeOut}));

        module = new CUmodule();

        // Load the module from the image buffer
        cuModuleLoadData(module, cubinOut.getByteBuffer(0, 32768).array());

        cuLinkDestroy(linkState);

        functionKernel = new CUfunction();
        cuModuleGetFunction(functionKernel, module, "kernel");
    }

    // Other methods 
}


But I get a CUDA_ERROR_INVALID_IMAGE error when calling the cuModuleLoadData method. While debugging, I saw that after calling cuLinkComplete with the image array as the output, the array is unchanged and still empty. Am I passing the output parameter correctly? Is this how one passes a variable by reference in JCuda?

Recommended answer


I had never written a single line of Java code until 30 minutes ago, let alone used JCUDA before, but an almost literal line-by-line translation of the native C++ code I gave you here seems to work perfectly:

import static jcuda.driver.JCudaDriver.*;
import java.io.*;
import jcuda.*;
import jcuda.driver.*;

public class JCudaRuntimeTest
{
    public static void main(String args[])
    {
        JCudaDriver.setExceptionsEnabled(true);

        cuInit(0);
        CUdevice device = new CUdevice();
        cuDeviceGet(device, 0);
        CUcontext context = new CUcontext();
        cuCtxCreate(context, 0, device);

        CUlinkState linkState = new CUlinkState();
        JITOptions jitOptions = new JITOptions();
        cuLinkCreate(jitOptions, linkState);

        String ptxFileName2 = "test_function.ptx";
        String ptxFileName1 = "test_kernel.ptx";

        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName2, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName1, jitOptions);

        long sz[] = new long[1];
        Pointer image = new Pointer();
        cuLinkComplete(linkState, image, sz);
        System.out.println("Pointer: " + image);
        System.out.println("CUBIN size: " + sz[0]);

        CUmodule module = new CUmodule();
        cuModuleLoadDataEx(module, image, 0, new int[0], Pointer.to(new int[0]));   
        cuLinkDestroy(linkState);

        CUfunction functionKernel = new CUfunction();
        String kernelname = "_Z6kernelPfS_S_S_";
        cuModuleGetFunction(functionKernel, module, kernelname);
        System.out.println("Function: " + functionKernel);
    }
}


which works like this:

> nvcc -ptx -arch=sm_21 test_function.cu
test_function.cu

> nvcc -ptx -arch=sm_21 test_kernel.cu
test_kernel.cu

> javac -cp ".;jcuda-0.7.0a.jar" JCudaRuntimeTest.java
> java -cp ".;jcuda-0.7.0a.jar" JCudaRuntimeTest
Pointer: Pointer[nativePointer=0xa5a13a8,byteOffset=0]
CUBIN size: 5924
Function: CUfunction[nativePointer=0xa588160]


The key here seems to be to use cuModuleLoadDataEx, noting that the return values from cuLinkComplete are a system pointer to the linked CUBIN and the size of the image returned as a long[]. As per the C++ code, the pointer is just passed directly to the module data load.
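To make the output-parameter behaviour concrete, here is a plain-Java analogy that does not use JCuda at all. The names `Handle` and `fakeLinkComplete` are hypothetical stand-ins for `Pointer` and `cuLinkComplete`, and the sketch only assumes what the run above suggests: the callee redirects the pointer to memory it owns and reports the size through a one-element `long[]`, rather than copying data into a buffer allocated by the caller.

```java
import java.util.Arrays;

public class OutParamSketch {

    /** Hypothetical stand-in for a pointer object: a handle the callee may redirect. */
    public static final class Handle {
        public byte[] target; // plays the role of a native address

        public Handle(byte[] target) {
            this.target = target;
        }
    }

    /**
     * Hypothetical stand-in for a link-complete call (an assumption, not JCuda code).
     * It redirects the handle to memory owned by the "linker" and writes the image
     * size into sizeOut[0]. It never writes into the buffer the caller allocated.
     */
    public static void fakeLinkComplete(Handle imageOut, long[] sizeOut) {
        byte[] linkerOwned = {1, 2, 3, 4}; // pretend this is the linked image
        imageOut.target = linkerOwned;
        sizeOut[0] = linkerOwned.length;
    }

    public static void main(String[] args) {
        byte[] callerBuffer = new byte[8];
        Handle image = new Handle(callerBuffer);
        long[] size = new long[1]; // keep a reference so the result can be read back

        fakeLinkComplete(image, size);

        // The caller's buffer is untouched; the handle now refers to other memory.
        System.out.println("caller buffer: " + Arrays.toString(callerBuffer));
        System.out.println("handle redirected: " + (image.target != callerBuffer));
        System.out.println("size: " + size[0]);
    }
}
```

Under that assumption, the question's code sees an unchanged `image` array because nothing is ever copied into it, and the anonymous `new long[]{sizeOut}` discards the reported size because no reference to that array is kept.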


As a final comment, it would have been much simpler and easier if you had posted a proper repro case that could have been directly hacked on, rather than making me learn the rudiments of JCUDA and Java before I could create a useful repro case and get it to work. The documentation for JCUDA is basic but complete, and with the working C++ example already provided, it only took a couple of minutes of reading to see how to do this.
