Why is ¿ displayed differently in Windows vs Linux even when using UTF-8?


Question


Why is the following displayed different in Linux vs Windows?

System.out.println(new String("¿".getBytes("UTF-8"), "UTF-8"));

in Windows:

¿

in Linux:

Â¿

Solution

System.out.println() outputs the text in the system default encoding, but the console interprets that output according to its own encoding (or "codepage") setting. On your Windows machine the two encodings seem to match, but on the Linux box the output is apparently in UTF-8 while the console is decoding it as a single-byte encoding like ISO-8859-1. Or maybe, as Jon suggested, the source file is being saved as UTF-8 and javac is reading it as something else, a problem that can be avoided by using Unicode escapes.
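The mismatch described above can be reproduced without a console. The following is a minimal sketch, assuming the Linux terminal really is decoding UTF-8 output as ISO-8859-1; the class name MojibakeDemo and the method misdecode are hypothetical names chosen for illustration. It uses the Unicode escape \u00BF so the result does not depend on how javac reads the source file.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes text as UTF-8, then decodes those bytes as ISO-8859-1,
    // mimicking a console whose encoding does not match the program's output.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "\u00BF"; // Unicode escape for the inverted question mark

        // In UTF-8 the character takes two bytes: 0xC2 0xBF.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 2

        // Decoded byte by byte, those become two characters: U+00C2 and U+00BF.
        System.out.println(misdecode(s).equals("\u00C2\u00BF")); // true
    }
}
```

The two resulting characters are the "Â" followed by "¿" that a mismatched console would show.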

When you need to output anything other than ASCII text, your best bet is to write it to a file using an appropriate encoding, then read the file with a text editor--consoles are too limited and too system-dependent. By the way, this bit of code:

new String("¿".getBytes("UTF-8"), "UTF-8")

...has no effect on the output. All that does is encode the contents of the string to a byte array and decode it again, reproducing the original string--an expensive no-op. If you want to output text in a particular encoding, you need to use an OutputStreamWriter, like so:

FileOutputStream fos = new FileOutputStream("out.txt");
OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");
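The two lines above only construct the writer. A fuller sketch might look like the following, assuming Java 7 or later for try-with-resources and StandardCharsets. The class name Utf8FileDemo and the helpers writeUtf8 and writeAndReadBack are hypothetical, and the temporary file stands in for out.txt.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8FileDemo {
    // Writes text to the file at path as UTF-8, regardless of the platform default.
    static void writeUtf8(String path, String text) {
        try (Writer osw = new OutputStreamWriter(
                new FileOutputStream(path), StandardCharsets.UTF_8)) {
            osw.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temporary file, returns the raw bytes, then deletes the file.
    static byte[] writeAndReadBack(String text) {
        try {
            Path tmp = Files.createTempFile("out", ".txt");
            try {
                writeUtf8(tmp.toString(), text);
                return Files.readAllBytes(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "\u00BF";

        // Encoding and decoding with the same charset reproduces the string.
        String roundTrip = new String(
                s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(s)); // true

        // The file holds the two UTF-8 bytes 0xC2 0xBF.
        System.out.println(writeAndReadBack(s).length); // 2
    }
}
```

Closing the writer through try-with-resources also flushes it; without a flush or close, the buffered text may never reach the file.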
