File name charset problem in Java


Question

When a file name contains accented characters, trying to open the file fails with a "file not found" error, apparently because of a charset mismatch. I am using UTF-8 on a Linux system (/etc/locales is set to UTF-8 as well). JBoss runs with -Dfile.encoding=UTF-8 and the environment variable JBOSS_ENCODING="UTF-8".

In a JSP I get the name of the file:

String fileName = element.getChildText("FileName");
out.println("File to be opened : " + fileName);

It shows:

File to be opened : aaaaaà.txt

But new File(fileName) does not work: file.exists() simply returns false.

Trying:

File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
    // Print each name as Java decoded it from the directory listing.
    out.println(files[i].getName());
}

I get: aaaaaÃ .txt

Why is the file name on disk being read and opened as if it were ISO-8859-1? Is it a JBoss setting? A Java setting? How can I force java.io.File to use UTF-8 as the charset for file names?

I've used other tools and the name is always read fine, using UTF-8.

(Note that I'm always talking about the name of the file, never its content; it could be an empty file.)
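For illustration, the garbled name in the listing is what the UTF-8 bytes of "à" look like when they are decoded as ISO-8859-1. The following is a minimal sketch of that effect only, not a reproduction of the JBoss setup. The class and method names are hypothetical, and it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the name as UTF-8, then (incorrectly) decode the bytes as ISO-8859-1.
    static String misdecode(String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String shown = misdecode("aaaaa\u00E0.txt"); // "aaaaaà.txt"
        // Print code points so that the invisible U+00A0 (no-break space) shows up.
        for (int i = 0; i < shown.length(); i++) {
            System.out.printf("U+%04X ", (int) shown.charAt(i));
        }
        System.out.println();
    }
}
```

The single character U+00E0 becomes the two characters U+00C3 ("Ã") and U+00A0, which is why the listed name appears to contain "Ã" followed by a space.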

Recommended answer

I am trying to track down the problem. Here is what I have so far:

Exists.java

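import java.io.File;

public class Exists {
  public static void main(String[] args) {
    // The results are ignored on purpose: the calls only exist so that
    // strace can show which bytes the JVM passes to stat().
    new File("aaa").exists();               // plain ASCII
    new File("aaa\u00E4").exists();         // "aaaä"
    new File("aaa\u00C3\u00A4").exists();   // "aaaÃ¤", i.e. "ä" double-encoded
  }
}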

And the output of java -version:

java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

Now for the interesting part:

$ strace -f -o strace.out java Exists && grep 'stat("aaa' strace.out
31942 stat("aaa", 0x41464950)           = -1 ENOENT (No such file or directory)
31942 stat("aaa\303\244", 0x41464950)   = -1 ENOENT (No such file or directory)
31942 stat("aaa\303\203\302\244", 0x41464950) = -1 ENOENT (No such file or directory)

The nice thing is that strace works at the byte level, not at the character level like Java. So everything is OK in this case: "aaa\u00E4" reaches the kernel as the UTF-8 bytes \303\244. I have the environment variable LANG set to en_US.UTF-8; all of the LC_* variables are unset.
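To check the octal escapes in the trace against the Java string literals, the UTF-8 bytes can be computed by hand. This is a minimal sketch: straceStyle is a hypothetical helper that only roughly imitates how strace prints non-ASCII bytes, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class OctalBytes {
    // Printable ASCII bytes are kept as-is; every other byte becomes
    // a three-digit octal escape, similar to strace's output.
    static String straceStyle(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v < 0x7F) {
                sb.append((char) v);
            } else {
                sb.append('\\').append(String.format("%03o", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(straceStyle("aaa"));
        System.out.println(straceStyle("aaa\u00E4"));
        System.out.println(straceStyle("aaa\u00C3\u00A4"));
    }
}
```

The three lines it prints match the three stat() arguments in the trace, which confirms that the JVM encoded the file names as UTF-8 in that run.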

Now narrowing the problem down to a minimal working example:

$ strace -f -o strace.out env - LC_ALL=en_US.UTF-8 /home/roland/bin/java Exists && grep 'stat("aaa' strace.out
31968 stat("aaa", 0x41a75950)           = -1 ENOENT (No such file or directory)
31968 stat("aaa\303\244", 0x41a75950)   = -1 ENOENT (No such file or directory)
31968 stat("aaa\303\203\302\244", 0x41a75950) = -1 ENOENT (No such file or directory)

That still works. So let's try another encoding:

$ strace -f -o strace.out env - LANG=en_US.ISO-8859-1 /home/roland/bin/java Exists && grep 'stat("aaa' strace.out
32070 stat("aaa", 0x407a3950)           = -1 ENOENT (No such file or directory)
32070 stat("aaa?", 0x407a3950)          = -1 ENOENT (No such file or directory)
32070 stat("aaa??", 0x407a3950)         = -1 ENOENT (No such file or directory)

So this doesn't work. One possible reason is that I selected a locale that is not in the list printed by locale -a. But that shouldn't be a reason for Java to convert the letters to question marks.
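One explanation that is consistent with the trace, though not confirmed in this answer, is that the JVM fell back to an ASCII-only encoding for file names when the requested locale could not be found. The sketch below only shows the documented behavior of String.getBytes, which replaces characters the charset cannot represent with its replacement byte ("?" for US-ASCII). The class and method names are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class QuestionMarks {
    // Encode with US-ASCII; unmappable characters are replaced by '?'.
    static String encodeAsAscii(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(encodeAsAscii("aaa"));
        System.out.println(encodeAsAscii("aaa\u00E4"));
        System.out.println(encodeAsAscii("aaa\u00C3\u00A4"));
    }
}
```

Each non-ASCII character turns into exactly one question mark, giving the same pattern of one and two question marks as the stat() arguments in the trace.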

As soon as LANG points to a non-existent locale, setting the sun.jnu.encoding property no longer has any effect. So I'm out of ideas for now.
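To see which encodings a given JVM actually picked up, the relevant values can be printed from inside the process. This is a diagnostic sketch only. EncodingInfo is a hypothetical class name, and sun.jnu.encoding is an internal, unsupported property that may be missing or behave differently across JVM versions.

```java
import java.nio.charset.Charset;

class EncodingInfo {
    // Collect the encoding-related settings visible to this JVM.
    static String describe() {
        return "sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
            + "\nfile.encoding=" + System.getProperty("file.encoding")
            + "\ndefaultCharset=" + Charset.defaultCharset()
            + "\nLANG=" + System.getenv("LANG")
            + "\nLC_ALL=" + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same env - LANG=... prefix as in the strace commands shows what the JVM derived from each locale setting. In the JBoss case it would have to run inside the server process (for example from a JSP), because the server's environment may differ from the login shell's.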
