Java can't display Unicode characters despite properly configured terminal
Question
I'm trying to print the Unicode full block character (U+2588) in a Java application being run in Cygwin. Despite the terminal being set to UTF-8, and despite Bash and Python being able to print the character, Java simply prints a ?.
$ echo $LANG
en_US.UTF-8
$ echo -e "\xe2\x96\x88"
█
$ python3 -c 'print("\u2588")'
█
$ cat Block.java
public class Block {
    public static void main(String[] args) {
        System.out.println('\u2588');
    }
}
$ javac Block.java
$ java -cp . Block
?
This appears to be Cygwin-specific, as the character is displayed when the same class is run from cmd:
>java -cp . Block
█
Is there anything I can do to get Cygwin/mintty to render Java's output correctly?
Update:
It appears Java on Windows/Cygwin doesn't actually use the LANG environment variable, and is therefore still using cp1252:
$ cat Block.java
public class Block {
    public static void main(String[] args) {
        System.out.println("Default Charset=" + java.nio.charset.Charset.defaultCharset());
        System.out.println("\u2588");
    }
}
$ java -cp . Block
Default Charset=windows-1252
?
But oddly, I can't get iconv to work:
$ java -cp . Block | iconv -f WINDOWS-1252 -t UTF8
Default Charset=windows-1252
?
Recommended answer
As far as I can tell there's no way to get java to respect Cygwin's charset, since Java on Windows doesn't use any environment variables to determine the default encoding.
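This likely also explains why piping through iconv in the question changes nothing: when the default charset is windows-1252, the block character has no mapping, so Java substitutes a literal ? byte before the output ever reaches the pipe. A minimal sketch of that substitution follows; the class name EncodeDemo and the helper hex are made up for illustration and are not part of the original question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // Hypothetical helper: encode text with the given charset and
    // return the resulting bytes as space-separated hex.
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String block = "\u2588";
        // windows-1252 cannot represent U+2588, so the encoder emits
        // its replacement byte 0x3f, which is an ASCII question mark.
        System.out.println("windows-1252: " + hex(block, Charset.forName("windows-1252")));
        // UTF-8 encodes the same character as three bytes.
        System.out.println("UTF-8:        " + hex(block, StandardCharsets.UTF_8));
    }
}
```

The first line should show 3f and the second e2 96 88. Because 0x3f is already a plain question mark, no later conversion can recover the original character.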
You can use JAVA_TOOL_OPTIONS to add flags to the java invocation dynamically; however, this causes java to print a "Picked up JAVA_TOOL_OPTIONS" notice on every run, which I'd rather not have.
$ JAVA_TOOL_OPTIONS='-Dfile.encoding=UTF-8' java -cp . Block
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Default Charset=UTF-8
█
Another option is to use aliases:
alias javac='javac -encoding UTF-8'
alias java='java -Dfile.encoding=UTF-8'
This works well enough for interactive usage, though aliases are not picked up by scripts or other programs that launch java.
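If changing how java is invoked is not an option, a different approach (not from the original answer, so treat it as an assumption) is to have the program itself write UTF-8 regardless of the default charset, by wrapping the output stream in a PrintStream with an explicit encoding. The class name Utf8Out and the helper utf8 below are hypothetical.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Out {
    // Hypothetical helper: wrap a stream in an auto-flushing PrintStream
    // that always encodes text as UTF-8, ignoring the JVM default charset.
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        // Bytes written by the wrapper pass through System.out unchanged.
        PrintStream out = utf8(System.out);
        out.println("\u2588");
    }
}
```

This only affects text printed through the wrapper; anything that still writes to System.out directly keeps using the default charset, so the alias or JAVA_TOOL_OPTIONS routes remain simpler when the code cannot be changed.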