Java Charset problem on Linux
Problem: I have a string containing special characters, which I convert to bytes and back again. The conversion works properly on Windows, but on Linux the special character is not converted properly. The default charset on Linux is UTF-8, as reported by Charset.defaultCharset().displayName().
However, if I run on Linux with the option -Dfile.encoding=ISO-8859-1, it works properly.
How can I make it work with the UTF-8 default charset, without setting the -D option in a Unix environment?
Edit: I use jdk1.6.13.
Edit: the code snippet works with cs = "ISO-8859-1"; or cs = "UTF-8"; on Windows, but not on Linux:
String x = "½";
System.out.println(x);
byte[] ba = x.getBytes(Charset.forName(cs));
for (byte b : ba) {
System.out.println(b);
}
String y = new String(ba, Charset.forName(cs));
System.out.println(y);
~regards daed
Your characters are probably being corrupted by the compilation process and you're ending up with junk data in your class file.
"if i run on linux with option -Dfile.encoding=ISO-8859-1 it works properly.."
In short, don't use -Dfile.encoding=...
String x = "½";
Since U+00BD (½) is represented by different byte values in different encodings:
windows-1252 BD
UTF-8 C2 BD
ISO-8859-1 BD
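The byte values in the table above can be checked with a small program. This is only a sketch: the class name HalfBytes and the helper hex are made up for illustration, and the character is written as the Unicode escape \u00bd so that the result does not depend on how this file itself is saved.

```java
import java.nio.charset.Charset;

public class HalfBytes {
    // Returns the bytes of s in the named charset as space-separated hex.
    // (Hypothetical helper, not part of the original question or answer.)
    static String hex(String s, String charsetName) {
        byte[] bytes = s.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escape, so the encoding of the source file does not matter.
        String half = "\u00bd";
        for (String cs : new String[] {"windows-1252", "UTF-8", "ISO-8859-1"}) {
            System.out.println(cs + " " + hex(half, cs));
        }
    }
}
```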
...you need to tell your compiler what encoding your source file is encoded as:
javac -encoding ISO-8859-1 Foo.java
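To see why a mismatch corrupts the literal, the sketch below simulates a reader that assumes the wrong charset for the raw bytes of a source file. It uses plain String decoding rather than javac itself, so it only approximates what the compiler does (javac may warn or fail on malformed input, depending on the version). The class name SourceEncodingDemo and the helper decode are invented for this example.

```java
import java.nio.charset.Charset;

public class SourceEncodingDemo {
    // Decodes raw bytes the way a reader assuming the given charset would.
    // (Hypothetical helper for illustration only.)
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] latin1Half = {(byte) 0xBD};             // "½" as saved in ISO-8859-1
        byte[] utf8Half = {(byte) 0xC2, (byte) 0xBD};  // "½" as saved in UTF-8

        // Matching charset: the character survives.
        System.out.println(decode(latin1Half, "ISO-8859-1").equals("\u00bd"));
        // ISO-8859-1 bytes read as UTF-8: a lone BD is malformed, so the
        // decoder substitutes the replacement character U+FFFD.
        System.out.println(decode(latin1Half, "UTF-8").equals("\uFFFD"));
        // UTF-8 bytes read as ISO-8859-1: one character turns into two.
        System.out.println(decode(utf8Half, "ISO-8859-1").length());
    }
}
```

Writing the literal as "\u00bd" in the source is another way to sidestep the issue, since Unicode escapes are plain ASCII and are interpreted by the compiler regardless of the file encoding.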
Now we get to this one:
System.out.println(x);
System.out is a PrintStream, so it encodes the string to the system (default) encoding before emitting the byte data. Roughly like this:
System.out.write(x.getBytes(Charset.defaultCharset()));
That may or may not work as you expect on some platforms - the byte encoding must match the encoding the console is expecting for the characters to show up correctly.
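If the console encoding is known, one option is to wrap the output stream in a PrintStream with an explicit charset instead of relying on the default. The following is a minimal sketch under that assumption; the class name ExplicitConsole and the helper withEncoding are hypothetical, and "UTF-8" is only correct if the terminal really expects UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ExplicitConsole {
    // Wraps a stream in an auto-flushing PrintStream that encodes text with
    // the named charset rather than the platform default.
    // (Hypothetical helper for illustration only.)
    static PrintStream withEncoding(OutputStream out, String charsetName) {
        try {
            return new PrintStream(out, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the terminal displaying this output expects UTF-8.
        PrintStream console = withEncoding(System.out, "UTF-8");
        console.println("\u00bd");
    }
}
```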