Java: read/write Unicode / UTF-8 filenames (not contents)


Problem description

I have a few directories/files with Japanese characters in their names. If I try to read a filename (not the contents) containing, for example, a ク, I receive a String containing a � instead. If I try to create a file/directory whose name contains a ク, the file/directory that appears has a ? in its name instead.

For example, I list the files with:

File file = new File(".");  
String[] filesAndDirs = file.list();

The filesAndDirs array now contains the directories with the special characters, but each String contains only ����. It seems there is nothing to decode, because getBytes() shows only "-17 -65 -67" for every char in the filename, even for different chars.

I use MacOS 10.8.2, Java 7_10, and NetBeans.

Any ideas?

Thank you in advance :)

Solution

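Those bytes are 0xef 0xbf 0xbd, which is the UTF-8-encoded form of the \ufffd character (U+FFFD, the Unicode replacement character) you're seeing instead of the Japanese characters. Because getBytes() only re-encodes what is already in the String, the original characters were lost before your code ever saw the name. It appears that whatever OS function Java is using to list the files is in fact returning those incorrect characters.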
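To make the byte values concrete, here is a minimal sketch. The class name ReplacementCharDemo and its helper methods are made up for illustration. The first helper shows that U+FFFD encodes to the three bytes from the question. The second shows one way such characters can arise, namely decoding a name's UTF-8 bytes with a charset that cannot represent them. That is an assumption about the failure mode, not something confirmed for this setup.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementCharDemo {
    // UTF-8 bytes of U+FFFD, the Unicode replacement character.
    static byte[] replacementCharBytes() {
        return "\ufffd".getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical failure mode: the UTF-8 bytes of a name are decoded
    // as US-ASCII, so every non-ASCII byte becomes U+FFFD.
    static String decodeAsAscii(String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Prints [-17, -65, -67], the values seen in the question.
        System.out.println(Arrays.toString(replacementCharBytes()));

        // \u30af is the katakana character ク (three bytes in UTF-8).
        String damaged = decodeAsAscii("\u30af");
        for (char c : damaged.toCharArray()) {
            System.out.printf("%04x ", (int) c); // prints fffd three times
        }
        System.out.println();
    }
}
```

Once a name has been turned into U+FFFD characters, the original bytes cannot be recovered from the String, which is why every character in the question produces the same three bytes.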

Perhaps Files.newDirectoryStream will be more reliable. Try this instead:

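import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Place inside a method that declares or handles IOException.
try (DirectoryStream<Path> dir = Files.newDirectoryStream(Paths.get("."))) {
    for (Path child : dir) {
        String filename = child.getFileName().toString();

        System.out.println("name=" + filename);
        // Print each UTF-16 code unit in hex to see what was actually decoded.
        for (char c : filename.toCharArray()) {
            System.out.printf("%04x ", (int) c);
        }
        System.out.println();
    }
}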
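If the hex dump still shows fffd for every character, it may help to check which encodings the JVM itself is using. The sketch below is only a diagnostic aid under that assumption, and the class name EncodingInfo is invented here. Note that sun.jnu.encoding is an implementation-specific property that may be absent on some JVMs, and whether it is the cause in this case is not confirmed by the answer.

```java
import java.nio.charset.Charset;

public class EncodingInfo {
    // Collects the encoding-related settings the running JVM reports.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("defaultCharset=").append(Charset.defaultCharset().name()).append('\n');
        sb.append("file.encoding=").append(System.getProperty("file.encoding")).append('\n');
        // Implementation-specific; prints "null" if the JVM does not define it.
        sb.append("sun.jnu.encoding=").append(System.getProperty("sun.jnu.encoding")).append('\n');
        // Locale settings inherited from the launching environment, if any.
        sb.append("LANG=").append(System.getenv("LANG")).append('\n');
        sb.append("LC_ALL=").append(System.getenv("LC_ALL")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running this both from a terminal and from the IDE and comparing the two outputs would show whether the launch environment differs between them.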
