Java can't display Unicode characters despite properly configured terminal


Question

I'm trying to print the Unicode full block character (U+2588) from a Java application running in Cygwin. Although the terminal is set to UTF-8, and both Bash and Python print the character correctly, Java prints only a ?.

$ echo $LANG
en_US.UTF-8

$ echo -e "\xe2\x96\x88"
█

$ python3 -c 'print("\u2588")'
█

$ cat Block.java
public class Block {
  public static void main(String[] args) {
    System.out.println('\u2588');
  }
}

$ javac Block.java

$ java -cp . Block
?

This appears to be Cygwin-specific: when the same class is run from cmd, the character is displayed:

>java -cp . Block
█

Is there anything I can do to get Cygwin/mintty to render Java's output correctly?

Update:

It appears that Java on Windows/Cygwin doesn't actually use the LANG environment variable, so it is still using cp1252 (windows-1252) as its default charset:

$ cat Block.java
public class Block {
  public static void main(String[] args) {
    System.out.println("Default Charset=" + java.nio.charset.Charset.defaultCharset());
    System.out.println("\u2588");
  }
}

$ java -cp . Block
Default Charset=windows-1252
?

But oddly, I can't get iconv to work:

$ java -cp . Block | iconv -f WINDOWS-1252 -t UTF8
Default Charset=windows-1252
?

Recommended answer

As far as I can tell, there's no way to get java to respect Cygwin's charset, since Java on Windows doesn't use any environment variables to determine the default encoding.
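The iconv attempt in the question most likely fails for a related reason: windows-1252 has no mapping for U+2588, so Java substitutes a literal ? (byte 0x3F) while encoding, and the original character is gone before iconv ever sees the stream. The sketch below illustrates this with String.getBytes; the class name EncodeDemo and the hex helper are made up for this example, and it assumes the windows-1252 charset is available in the JDK in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // Hypothetical helper: formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String block = "\u2588";
        // getBytes replaces unmappable characters with the charset's
        // replacement byte, which is '?' (0x3f) for windows-1252.
        Charset cp1252 = Charset.forName("windows-1252");
        System.out.println("windows-1252: " + hex(block.getBytes(cp1252)));
        // UTF-8 can represent the character, using three bytes.
        System.out.println("UTF-8:        " + hex(block.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line shows a single byte 3f, while the second shows e2 96 88, the same bytes that the echo -e command in the question sends to the terminal.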

You can use JAVA_TOOL_OPTIONS to add flags to the java invocation dynamically; however, this causes java to print a "Picked up JAVA_TOOL_OPTIONS" notice on every run, which I'd rather not have.

$ JAVA_TOOL_OPTIONS='-Dfile.encoding=UTF-8' java -cp . Block
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Default Charset=UTF-8
█

Another option is to use aliases:

alias javac='javac -encoding UTF-8'
alias java='java -Dfile.encoding=UTF-8'

This works well enough for interactive usage.
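If you control the source, a further possibility (not part of the answer above, and not tested under Cygwin here) is to stop relying on the default charset and wrap System.out in a PrintStream that always encodes as UTF-8. This is a minimal sketch; the class name Utf8Block and the helper utf8 are illustrative names only.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Block {
    // Hypothetical helper: wraps a stream in an auto-flushing PrintStream
    // that encodes text as UTF-8 regardless of the JVM default charset.
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("\u2588");
    }
}
```

This changes only the streams you wrap yourself; anything that prints through the original System.out, including third-party libraries, still uses the default encoding, so the -Dfile.encoding approaches above remain the broader fix.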
