Java JNI: Passing multibyte characters from Java to C


Question

I'm once again messing around with the Java Native Interface, and I've run into another interesting problem. I'm sending a file path to C via JNI and then doing some I/O. The characters I most commonly have trouble with are 'äåö'. Here is a short demo program with exactly the same problem:

Java:

public class java {

  private static native void printBytes(String text);
  static{
    System.loadLibrary("dll");
  }

  public static void main(String[] args){
    printBytes("C:/Users/ä-å-ö/Documents/Bla.txt");
  }
}

C:

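#include "java.h"
#include <jni.h>
#include <stdio.h>

JNIEXPORT void JNICALL Java_java_printBytes(JNIEnv *env, jclass class, jstring text){
  /* GetStringUTFChars returns the string as modified UTF-8 bytes */
  const char *text_input = (*env)->GetStringUTFChars(env, text, NULL);
  if (text_input == NULL) return; /* an OutOfMemoryError is already pending */
  /* length in bytes of the modified UTF-8 form (not used further in this demo) */
  jsize size = (*env)->GetStringUTFLength(env, text);
  (void)size;
  printf("%s\n", text_input);
  (*env)->ReleaseStringUTFChars(env, text, text_input);
}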

Output: C:/Users/├ñ-├Ñ-├Â/Documents/Bla.txt

This is NOT my desired result. I would like it to output the same string as in Java.

Recommended answer

You are dealing with a platform-specific character encoding issue. The standard C printf should be able to handle multibyte (UTF-8) encoded strings, but the one provided by Windows/MSVC is anything but standard and cannot. On a standards-conforming non-Windows platform I would expect your code to work. The string coming from Java is in (modified) UTF-8, a multibyte encoding, while the MS printf expects one byte per character. This works for ASCII characters because they have the same single-byte values in UTF-8. It does not work for characters outside ASCII: ä, å and ö each take two bytes in UTF-8, and each of those bytes is shown as a separate character, which is the ├ñ-├Ñ-├Â in your output.
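The byte-level mismatch described above can be made visible from the Java side. The following is a minimal sketch and not part of the original answer; the class name Utf8BytesDemo and the helper utf8Hex are made up for illustration. It dumps the UTF-8 bytes of a string as hex, using Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesDemo {
    // Formats the UTF-8 encoding of a string as space-separated hex bytes.
    // (Hypothetical helper, used only for this illustration.)
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("a"));      // ASCII: one byte, 61
        System.out.println(utf8Hex("\u00e4")); // ä: two bytes, C3 A4
        System.out.println(utf8Hex("\u00e5")); // å: two bytes, C3 A5
        System.out.println(utf8Hex("\u00f6")); // ö: two bytes, C3 B6
    }
}
```

A console using an OEM code page such as 850 displays the byte C3 as ├ and A4 as ñ, which is consistent with the ├ñ seen in the question's output.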

Basically you need to either convert your string to wide characters in Java (text.getBytes(Charset.forName("UTF-16LE"))) and pass it to C as a byte array, or convert the multibyte string to wide characters in C after receiving it (MultiByteToWideChar(CP_UTF8, ...)). Then you can use printf("%S") or wprintf(L"%s") to output it. Note that with the MSVC runtime, %S in printf and %s in wprintf both expect a wide string.
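The first option (converting on the Java side) could look roughly like the sketch below. It is an illustration under assumptions, not the answer author's code: the class WidePath and the helper toWideBytes are hypothetical, and appending a two-byte NUL terminator is a design choice made here so that native code could treat the buffer as a wchar_t string on Windows. A matching native method taking a byte[] would still have to be written and is only mentioned in a comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WidePath {
    // Encodes the text as UTF-16LE and appends two zero bytes as a wide NUL
    // terminator. (The terminator is an assumption of this sketch; the
    // original answer only mentions the getBytes conversion.)
    static byte[] toWideBytes(String text) {
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16LE);
        // copyOf pads the extra two bytes with zeros
        return Arrays.copyOf(utf16, utf16.length + 2);
    }

    public static void main(String[] args) {
        byte[] wide = toWideBytes("C:/Users/\u00e4-\u00e5-\u00f6/Documents/Bla.txt");
        // A hypothetical native method such as
        //   private static native void printWide(byte[] utf16le);
        // would receive this array and read it as wide characters.
        System.out.println(wide.length + " bytes including terminator");
    }
}
```

Passing a byte array keeps the encoding explicit on both sides, at the cost of an extra copy compared with converting inside the native code.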

See "Printing UTF-8 strings with printf - wide vs. multibyte string literals" for more information. Also note that the answer there says you have to set Unicode output mode with _setmode if you want Unicode output on the Windows console.

Also note that I don't believe JNI guarantees a NUL terminator on the buffer that GetStringUTFChars returns (GetStringUTFLength only reports its length in bytes), but it has been too long since I checked.
