Encoding of file names in Java


Problem description

I am running a small Java application on an embedded Linux platform. After replacing the Java VM JamVM with OpenJDK, file names with special characters are not stored correctly. Special characters like umlauts are replaced by question marks.

Here is my test code:

import java.io.File;
import java.io.IOException;

public class FilenameEncoding
{

        public static void main (String[] args) {
                String name = "umlaute-äöü";
                System.out.println("\nname = " + name);
                System.out.print("name in Bytes: ");
                for (byte b : name.getBytes()) {
                        System.out.print(Integer.toHexString(b & 255) + " ");
                }
                System.out.println();

                try {
                        File f = new File(name);
                        f.createNewFile();
                } catch (IOException e) {
                        e.printStackTrace();
                }
        }

}

Running it gives the following output:

name = umlaute-???
name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f

and a file called umlaute-??? is created.
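The question marks in the byte dump are consistent with how String.getBytes behaves when a character cannot be represented in the target charset: the character is replaced with the charset's replacement byte, which is 3f ('?') for US-ASCII. The following is a minimal sketch of that effect, not part of the original test program; the class name ReplacementDemo is made up for illustration, and the charsets are passed explicitly so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {

    // Render bytes the same way the test program does: lowercase hex, space separated.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(Integer.toHexString(b & 255)).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII, so the compiler's
        // source encoding cannot alter the string.
        String name = "umlaute-\u00e4\u00f6\u00fc";

        // US-ASCII cannot represent the umlauts, so each one becomes 3f ('?').
        System.out.println("US-ASCII: " + hex(name.getBytes(StandardCharsets.US_ASCII)));
        // prints "US-ASCII: 75 6d 6c 61 75 74 65 2d 3f 3f 3f"

        // UTF-8 encodes each umlaut as two bytes.
        System.out.println("UTF-8:    " + hex(name.getBytes(StandardCharsets.UTF_8)));
        // prints "UTF-8:    75 6d 6c 61 75 74 65 2d c3 a4 c3 b6 c3 bc"
    }
}
```

The US-ASCII line matches the byte dump above, which suggests the VM was running with an ASCII default charset when the test was executed.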

Setting the properties file.encoding and sun.jnu.encoding to UTF-8 gives the correct strings in the terminal, but the created file is still umlaute-???.
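To check what the running VM actually reports for these two properties, a small diagnostic along the following lines can help. This is only a sketch: the class name EncodingProperties is hypothetical, and sun.jnu.encoding is an implementation-specific property that may be absent on some VMs, which is why a fallback text is printed.

```java
import java.nio.charset.Charset;

public class EncodingProperties {

    // Format one system property, with a marker when the VM does not define it.
    static String describe(String key) {
        return key + " = " + System.getProperty(key, "(not set)");
    }

    public static void main(String[] args) {
        System.out.println(describe("file.encoding"));
        // Implementation-specific property; not guaranteed to exist on every VM.
        System.out.println(describe("sun.jnu.encoding"));
        // The charset the VM uses when no charset is given explicitly,
        // for example in name.getBytes().
        System.out.println("default charset = " + Charset.defaultCharset().name());
    }
}
```

Running it once with and once without the -D options shows whether the values given on the command line are what the VM ends up reporting.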

Running the VM with strace, I can see the system call

open("umlaute-???", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 4

This shows that the problem is not a file system issue but a VM issue.

How can the encoding of the file name be set?

Solution

If you are using Eclipse, then you can go to Window->Preferences->General->Workspace and select the "Text file encoding" option you want from the pull down menu. By changing mine around, I was able to recreate your problem (and also change back to the fix).

If you are not, then you can add an environment variable in Windows (System properties->Environment Variables, then select New... under system variables). The name should be (without quotes) JAVA_TOOL_OPTIONS and the value should be set to -Dfile.encoding=UTF8 (or whatever encoding gets yours to work).

I found the answer through this post, btw: Setting the default Java character encoding?

Linux Solutions

-(Permanent) Running env | grep LANG in the terminal returns one or two lines showing which encoding Linux is currently set up with. You can then set LANG to UTF8 (yours might be set to ASCII) in the /etc/sysconfig/i18n file (I tested this on 2.6.40 Fedora). Basically, I switched from UTF8 (where I got odd characters) to ASCII (where I got question marks) and back.

-(When starting the JVM; this may not fix the problem) You can start the JVM with the encoding you want using java -Dfile.encoding=**** FilenameEncoding

Here is the output from the two ways:

[youssef@JoeLaptop bin]$ java -Dfile.encoding=UTF8 FilenameEncoding

name = umlaute-הצ�
name in Bytes: 75 6d 6c 61 75 74 65 2d d7 94 d7 a6 ef bf bd 
UTF-8
UTF8

[youssef@JoeLaptop bin]$ java FilenameEncoding

name = umlaute-???????
name in Bytes: 75 6d 6c 61 75 74 65 2d 3f 3f 3f 3f 3f 3f 3f 
US-ASCII
ASCII
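The last two lines of each run (UTF-8/UTF8 and US-ASCII/ASCII) are not printed by the test program from the question, and the answer does not show the code that produced them. One plausible reading, stated here as an assumption, is that they are the canonical name of the default charset and the historical name reported by a reader that uses it. The sketch below prints both, together with the locale variables the JVM process sees, which is roughly what env | grep LANG shows in the shell. The class name LocaleReport and its helper methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.Map;
import java.util.TreeMap;

public class LocaleReport {

    // Rough equivalent of `env | grep LANG`, widened to the LC_* variables.
    static Map<String, String> localeVariables(Map<String, String> env) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            String key = e.getKey();
            if (key.startsWith("LANG") || key.startsWith("LC_")) {
                result.put(key, e.getValue());
            }
        }
        return result;
    }

    // Encoding name reported by a reader that uses the default charset.
    // A ByteArrayInputStream holds no OS resources, so the reader is not closed.
    static String readerEncoding() {
        return new InputStreamReader(new ByteArrayInputStream(new byte[0])).getEncoding();
    }

    public static void main(String[] args) {
        // Locale-related variables as seen by this JVM process.
        localeVariables(System.getenv())
                .forEach((k, v) -> System.out.println(k + "=" + v));
        // Canonical charset name, followed by the reader's name for it.
        System.out.println(Charset.defaultCharset().name());
        System.out.println(readerEncoding());
    }
}
```

If no LANG or LC_* line is printed, the JVM was started without any locale variables, which on an embedded system can differ from what an interactive shell shows.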

Here is a reference for the Linux part: http://www.cyberciti.biz/faq/set-environment-variable-linux/

And here is one about -Dfile.encoding: Setting the default Java character encoding?
