Java charset and Windows

Question

I have a Java program that runs msinfo32.exe (System Information) in an external process and then reads the file that msinfo32.exe produces. When the program loads the file content into a String, the characters are unreadable. To get a readable String I have to create it with String(byte[] bytes, String charsetName) and set charsetName to UTF-16. However, on one instance of Windows 2003, only UTF-16LE (little-endian) produces a printable string.

How can I know ahead of time which character encoding to use?

Also, any background information on this topic would be appreciated.

Recommended answer

Some Microsoft applications write a byte-order mark (BOM) at the start of Unicode files to indicate the encoding and its endianness. On my Windows XP machine, the exported .NFO file starts with the bytes FF FE, which is the BOM U+FEFF stored in little-endian order, so the file is UTF-16 little-endian:

FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00         __<_?_x_m_l_ _v_
65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 22 00         e_r_s_i_o_n_=_"_
31 00 2E 00 30 00 22 00 3F 00 3E 00 0D 00 0A 00         1_._0_"_?_>_____
3C 00 4D 00 73 00 49 00 6E 00 66 00 6F 00 3E 00         <_M_s_I_n_f_o_>_
0D 00 0A 00 3C 00 4D 00 65 00 74 00 61 00 64 00         ____<_M_e_t_a_d_
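
To pick the encoding ahead of time, one option is to inspect the first bytes of the file for a BOM before decoding. The following is a minimal sketch, not part of the original answer: the class name BomSniffer and the method detect are hypothetical, and the caller supplies a fallback charset for files that carry no BOM. It recognises only the UTF-8 and UTF-16 marks and does not try to tell a UTF-32LE mark (FF FE 00 00) apart from UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: maps a leading byte-order mark to a charset.
final class BomSniffer {
    private BomSniffer() {}

    /**
     * Returns the charset implied by the BOM at the start of {@code head},
     * or {@code fallback} when no UTF-8 or UTF-16 BOM is present.
     */
    static Charset detect(byte[] head, Charset fallback) {
        // UTF-8 BOM: EF BB BF
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            // FF FE: U+FEFF stored low byte first, i.e. little-endian
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
            // FE FF: U+FEFF stored high byte first, i.e. big-endian
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
        }
        return fallback;
    }
}
```

Note that decoding with UTF-16LE or UTF-16BE leaves the BOM in the result as a leading U+FEFF character, so the caller may want to skip the first two bytes after detection.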

Also, I recommend decoding the file with a Reader implementation rather than the String constructor. A Reader avoids the problem of decoding half a character when a multi-byte sequence is cut off at the end of a byte array.
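
A Reader-based version might look like the sketch below; the class name ReaderDecoding and the method readAll are hypothetical. The InputStreamReader keeps decoder state between reads, so a character split across two internal buffers is still decoded correctly. When the charset is UTF-16, the JDK decoder uses a leading BOM to choose the byte order, consumes it, and assumes big-endian if no BOM is present. That default is one possible explanation (an assumption, not something the question confirms) for a BOM-less little-endian file decoding correctly only with UTF-16LE.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper: decodes a whole stream through a Reader.
final class ReaderDecoding {
    private ReaderDecoding() {}

    /** Reads every character from {@code in}, decoding with {@code charset}. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```

For the exported file, the call could be readAll(Files.newInputStream(path), StandardCharsets.UTF_16), where path is whatever location the program asked msinfo32.exe to write to.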
