JNI - Passing large amounts of data between Java and native code


Question

I am trying to achieve the following:

1) I have a byte array on the java side that represents an image.

2) I need to give my native code access to it.

3) The native code decodes this image using GraphicsMagick and creates a bunch of thumbnails by calling resize. It also calculates a perceptual hash of the image, which is either a vector or a uint8_t array.

4) Once I return this data back to the Java side different threads will read it. The thumbnails will be uploaded to some external storage service via HTTP.

My questions are:

1) What would be the most efficient way to pass the bytes from Java to my native code? I have access to it as a byte array. I don't see any particular advantage to passing it as a byte buffer (wrapping this byte array) vs a byte array here.

2) What would be the best way to return these thumbnails and perceptual hash back to the java code? I thought of a few options:

(i) I could allocate a byte buffer in Java and then pass it along to my native method. The native method could then write to it and set a limit after it is done and return the number of bytes written or some boolean indicating success. I could then slice and dice the byte buffer to extract the distinct thumbnails and perceptual hash and pass it along to the different threads that will upload the thumbnails. The problem with this approach is I don't know what size to allocate. The needed size will depend on the size of the thumbnails generated which I don't know in advance and the number of thumbnails (I do know this in advance).

(ii) I could also allocate the byte buffer in native code once I know the size needed. I could memcpy my blobs to the right region based on my custom packing protocol and return this byte buffer. Both (i) and (ii) seem complicated because the custom packing protocol would have to indicate the length of each thumbnail and of the perceptual hash.

(iii) Define a Java class that has fields for thumbnails: array of byte buffers and perceptual hash: byte array. I could allocate the byte buffers in native code when I know the exact sizes needed. I can then memcpy the bytes from my GraphicsMagick blob to the direct address of each byte buffer. I am assuming that there is also some method to set the number of bytes written on the byte buffer so that the java code knows how big the byte buffers are. After the byte buffers are set, I could fill in my Java object and return it. Compared to (i) and (ii) I create more byte buffers here and also a Java object but I avoid the complexity of a custom protocol. Rationale behind (i), (ii) and (iii) - given that the only thing I do with these thumbnails is to upload them, I was hoping to save an extra copy with byte buffers (vs byte array) when uploading them via NIO.

(iv) Define a Java class that has an array of byte arrays (instead of byte buffers) for the thumbnails and a byte array for the perceptual hash. I create these Java arrays in my native code and copy over the bytes from my GraphicsMagick blob using SetByteArrayRegion. The disadvantage compared with the previous methods is that there will be yet another copy in Java land when this byte array is copied from the heap to some direct buffer for uploading. I am not sure that I would be saving anything in terms of complexity vs (iii) here either.
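To make the "custom packing protocol" of options (i) and (ii) concrete, here is a minimal Java sketch. The layout (an int count followed by length-prefixed blobs) and the names `PackedBlobs`, `pack` and `unpack` are assumptions for illustration and do not come from the question. `pack` only stands in for what the native code would write; `unpack` shows how the Java side could cut the buffer into zero-copy slices.

```java
import java.nio.ByteBuffer;

final class PackedBlobs {
    /** Packs blobs as [int count]([int length][bytes])*; stands in for the native writer. */
    static ByteBuffer pack(byte[]... blobs) {
        int size = Integer.BYTES;
        for (byte[] b : blobs) {
            size += Integer.BYTES + b.length;
        }
        ByteBuffer out = ByteBuffer.allocateDirect(size);
        out.putInt(blobs.length);
        for (byte[] b : blobs) {
            out.putInt(b.length);
            out.put(b);
        }
        out.flip(); // position = 0, limit = number of bytes written
        return out;
    }

    /** Splits a packed buffer into slices that share its memory, one per blob. */
    static ByteBuffer[] unpack(ByteBuffer packed) {
        ByteBuffer in = packed.duplicate(); // own position/limit, same content
        int count = in.getInt();
        ByteBuffer[] blobs = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            int length = in.getInt();
            ByteBuffer slice = in.slice(); // starts at the blob's first byte
            slice.limit(length);           // ends after the blob's last byte
            blobs[i] = slice;
            in.position(in.position() + length);
        }
        return blobs;
    }
}
```

Both methods use the default big-endian byte order of `ByteBuffer`. Native code that writes the length fields in native order would need the Java side to call `order(ByteOrder.nativeOrder())` before reading them.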

Any suggestions would be awesome.

@main suggested an interesting solution. I am editing my question to follow up on that option. If I wanted to wrap native memory in a DirectBuffer like how @main suggests, how would I know when I can safely free the native memory?

Recommended answer

What would be the most efficient way to pass the bytes from Java to my native code? I have access to it as a byte array. I don't see any particular advantage to passing it as a byte buffer (wrapping this byte array) vs a byte array here.

The big advantage of a direct ByteBuffer is that you can call GetDirectBufferAddress on the native side and immediately get a pointer to the buffer contents, without any copying. If you pass a byte array, you have to use GetByteArrayElements and ReleaseByteArrayElements (which might copy the array) or the critical versions, GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical (which may hold up the GC while the array is in use). So using a direct ByteBuffer can have a positive impact on your code's performance.
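One caveat about wrapping: `ByteBuffer.wrap(array)` produces a heap buffer, not a direct one, and GetDirectBufferAddress returns NULL for non-direct buffers. A minimal Java sketch of getting an existing byte array into a direct buffer follows; the class and method names (`DirectInput`, `toDirect`) are made up for illustration. Note that this costs one copy on the Java side.

```java
import java.nio.ByteBuffer;

final class DirectInput {
    /** Copies a heap byte[] into a newly allocated direct buffer. */
    static ByteBuffer toDirect(byte[] image) {
        ByteBuffer direct = ByteBuffer.allocateDirect(image.length);
        direct.put(image);
        direct.flip(); // position = 0, limit = image.length
        return direct;
    }

    public static void main(String[] args) {
        byte[] image = {1, 2, 3, 4};
        System.out.println(ByteBuffer.wrap(image).isDirect()); // false: backed by the heap array
        System.out.println(toDirect(image).isDirect());        // true: backed by native memory
    }
}
```

If the image bytes can be read straight into a direct buffer (for example from a channel), the extra copy disappears. If they only ever exist as a byte array, passing the array itself may be just as cheap.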

As you said, (i) won't work because you don't know how much data the method is going to return. (ii) is too complex because of that custom packing protocol. I would go for a modified version of (iii): you don't need that object; you can just return an array of ByteBuffers where the first element is the hash and the other elements are the thumbnails. And you can throw away all the memcpys! That's the entire point of a direct ByteBuffer: avoiding copying.

Code:

#include <jni.h>

// Fills output[0] with the hash and output[1..] with the thumbnails.
extern "C" JNIEXPORT void JNICALL
Java_MyClass_createThumbnails(JNIEnv* env, jobject, jobject input, jobjectArray output)
{
    jsize nThumbnails = env->GetArrayLength(output) - 1;
    void* inputPtr = env->GetDirectBufferAddress(input); // NULL if input is not a direct buffer
    jlong inputLength = env->GetDirectBufferCapacity(input);
    if (inputPtr == nullptr || inputLength < 0)
        return;

    // ...

    void* hash = ...; // a pointer to the hash data
    int hashDataLength = ...;
    void** thumbnails = ...; // an array of pointers, each one points to thumbnail data
    int* thumbnailDataLengths = ...; // thumbnailDataLengths[i] is the length of thumbnails[i]

    // Wrap the native memory without copying; it must stay valid while Java uses the buffers.
    jobject hashBuffer = env->NewDirectByteBuffer(hash, hashDataLength);
    env->SetObjectArrayElement(output, 0, hashBuffer);

    for (jsize i = 0; i < nThumbnails; i++) {
        jobject thumbnail = env->NewDirectByteBuffer(thumbnails[i], thumbnailDataLengths[i]);
        env->SetObjectArrayElement(output, i + 1, thumbnail);
        env->DeleteLocalRef(thumbnail); // keep the local reference table small
    }
}
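On the Java side, the matching declaration would presumably be `native void createThumbnails(ByteBuffer input, ByteBuffer[] output)` in `MyClass`, inferred from the JNI function name. The sketch below shows one way to consume the filled array. `ThumbnailResult` and `thumbnailView` are hypothetical names, and it assumes the native call has already populated `output`. Because the question mentions several uploader threads, each caller gets its own read-only view with an independent position and limit.

```java
import java.nio.ByteBuffer;

final class ThumbnailResult {
    final ByteBuffer hash;
    final ByteBuffer[] thumbnails;

    /** Splits the array filled by native code: element 0 is the hash, the rest are thumbnails. */
    ThumbnailResult(ByteBuffer[] output) {
        hash = output[0].asReadOnlyBuffer();
        thumbnails = new ByteBuffer[output.length - 1];
        for (int i = 1; i < output.length; i++) {
            thumbnails[i - 1] = output[i].asReadOnlyBuffer();
        }
    }

    /** Returns a view with its own position and limit over the same bytes. */
    ByteBuffer thumbnailView(int index) {
        return thumbnails[index].duplicate();
    }
}
```

Handing out duplicates matters because a single `ByteBuffer` object is not safe to read from several threads at once: relative reads move its position.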

I only have a byte array available to me for the input. Wouldn't wrapping the byte array in a byte buffer still incur the same tax? I also saw this syntax for arrays: http://developer.android.com/training/articles/perf-jni.html#region_calls. Though a copy is still possible.

GetByteArrayRegion always writes into a buffer that you supply, so it creates a copy every time; I would suggest GetByteArrayElements instead. Copying the array into a direct ByteBuffer on the Java side is also not the best idea, because you still pay for a copy that you might avoid altogether if GetByteArrayElements pins the array.

If I create byte buffers that wrap native data, who is responsible for cleaning it up? I did the memcpy only because I thought Java would have no idea when to free this. This memory could be on the stack, on the heap or from some custom allocator, which seems like it would cause bugs.

If the data is on the stack, then you must copy it: into a Java array, into a direct ByteBuffer that was created in Java code, or to somewhere on the native heap (with a direct ByteBuffer pointing to that location). If it's on the heap, then you can safely use the direct ByteBuffer that you created using NewDirectByteBuffer, as long as you can ensure that nobody frees the memory while the buffer is in use. Once the heap memory is freed, you must no longer use the ByteBuffer object. Java does not try to free the native memory when a direct ByteBuffer that was created using NewDirectByteBuffer is GC'd. You have to take care of that manually, because you also created the buffer manually.
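One way to make that manual responsibility explicit on the Java side is a small owner object that frees the native memory exactly once. This is only a sketch: `NativeBuffers` is a hypothetical class, and the `releaser` callback stands in for a native method (not shown in the answer) that would free the memory behind the buffers.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

final class NativeBuffers implements AutoCloseable {
    private final ByteBuffer[] buffers;
    private final Runnable releaser; // assumption: calls native code that frees the memory
    private final AtomicBoolean closed = new AtomicBoolean(false);

    NativeBuffers(ByteBuffer[] buffers, Runnable releaser) {
        this.buffers = buffers;
        this.releaser = releaser;
    }

    /** Returns a view of buffer {@code index}; fails once the memory has been released. */
    ByteBuffer get(int index) {
        if (closed.get()) {
            throw new IllegalStateException("native memory already released");
        }
        return buffers[index].duplicate();
    }

    /** Frees the native memory exactly once, even if called repeatedly. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            releaser.run();
        }
    }
}
```

The wrapper cannot protect views that were handed out before `close()`: the caller still has to make sure every upload has finished before closing, for example by closing only after all uploader tasks have completed.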
