How do I convert between ISO-8859-1 and UTF-8 in Java?
Question
Does anyone know how to convert a string from ISO-8859-1 to UTF-8 and back in Java?
I'm getting a string from the web and saving it in the RMS (J2ME), but I want to preserve the special chars and get the string from the RMS but with the ISO-8859-1 encoding. How do I do this?
Recommended answer
In general, you can't do this without risking data loss. UTF-8 can encode any Unicode code point, while ISO-8859-1 covers only the first 256 of them (U+0000 through U+00FF). So transcoding from ISO-8859-1 to UTF-8 is no problem. Going the other way, from UTF-8 to ISO-8859-1, replaces every unsupported character with a substitute. With String.getBytes this is the charset's default replacement, which for ISO-8859-1 is a question mark (?).
To transcode text:
byte[] latin1 = ...
byte[] utf8 = new String(latin1, "ISO-8859-1").getBytes("UTF-8");
or
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");
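The two one-liners above can be wrapped into a small, self-contained sketch that also shows where the loss happens. This assumes Java 7 or later on a standard JVM, where `java.nio.charset.StandardCharsets` is available (it is not part of J2ME/CLDC, where the string-named overloads above are the ones to use). The class and method names are hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class TranscodeDemo {

    // Decode the bytes as ISO-8859-1, then re-encode the resulting String as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1) {
        return new String(latin1, StandardCharsets.ISO_8859_1)
                .getBytes(StandardCharsets.UTF_8);
    }

    // Decode the bytes as UTF-8, then re-encode as ISO-8859-1.
    // Characters outside ISO-8859-1 are replaced with '?' by getBytes.
    static byte[] utf8ToLatin1(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is one byte in ISO-8859-1, two in UTF-8.
        byte[] latin1 = { (byte) 0xE9 };
        byte[] utf8 = latin1ToUtf8(latin1);
        System.out.println(utf8.length);               // 2
        System.out.println(utf8ToLatin1(utf8).length); // 1, the round trip is lossless

        // U+20AC (euro sign) does not exist in ISO-8859-1.
        byte[] euroUtf8 = "\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println((char) utf8ToLatin1(euroUtf8)[0]); // ?
    }
}
```

Using the `Charset` constants instead of charset names avoids the checked `UnsupportedEncodingException` that `getBytes(String)` and `new String(byte[], String)` declare.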
You can exercise more control by using the lower-level Charset APIs. For example, you can raise an exception when an un-encodable character is found, or use a different character for replacement text.
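As a rough sketch of what that could look like, the following uses `CharsetEncoder` from `java.nio.charset` with a `CodingErrorAction`. It assumes a standard JVM (Java 7 or later); these classes are not available in J2ME/CLDC. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StrictLatin1 {

    // Encode to ISO-8859-1 and throw instead of silently substituting.
    static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Encode to ISO-8859-1, substituting a caller-chosen byte for
    // anything that cannot be represented.
    static byte[] encodeWithReplacement(String text, byte replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { replacement });
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Copy the readable part of the buffer into a right-sized array.
    private static byte[] toArray(ByteBuffer buffer) {
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return out;
    }
}
```

With this sketch, `StrictLatin1.encodeStrict("\u20AC")` throws a `CharacterCodingException` because the euro sign has no ISO-8859-1 mapping, while `StrictLatin1.encodeWithReplacement("1\u20AC", (byte) '_')` yields the two bytes for `1_`.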