How do I convert between ISO-8859-1 and UTF-8 in Java?

Question

Does anyone know how to convert a string from ISO-8859-1 to UTF-8 and back in Java?

I'm getting a string from the web and saving it in the RMS (J2ME). I want to preserve the special characters and read the string back from the RMS in the ISO-8859-1 encoding. How do I do this?

Recommended answer

In general, you can't do this without loss in both directions. UTF-8 can encode any Unicode code point, while ISO-8859-1 can represent only the first 256 of them (U+0000 to U+00FF). So transcoding from ISO-8859-1 to UTF-8 is no problem. Going backwards from UTF-8 to ISO-8859-1 replaces every unsupported character with a substitute (Java's String.getBytes writes "?" by default), so the original text cannot be recovered from the result.

To transcode text:

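// ISO-8859-1 bytes to UTF-8 bytes: decode to a String, then re-encode.
byte[] latin1 = ...
byte[] utf8 = new String(latin1, "ISO-8859-1").getBytes("UTF-8");

// UTF-8 bytes to ISO-8859-1 bytes: characters outside ISO-8859-1 are replaced.
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");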
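The two snippets above can be wrapped into a small runnable sketch that shows both the safe direction and the lossy one. It assumes Java SE 7 or later, where StandardCharsets is available (on J2ME you would likely need the charset-name overloads shown above instead). The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TranscodeDemo {
    // Decode ISO-8859-1 bytes to a String, then re-encode as UTF-8.
    public static byte[] latin1ToUtf8(byte[] latin1) {
        return new String(latin1, StandardCharsets.ISO_8859_1)
                .getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes to a String, then re-encode as ISO-8859-1.
    // Characters that ISO-8859-1 cannot represent become '?'.
    public static byte[] utf8ToLatin1(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café" in ISO-8859-1: the last byte 0xE9 is 'é'.
        byte[] latin1 = { 0x63, 0x61, 0x66, (byte) 0xE9 };

        byte[] utf8 = latin1ToUtf8(latin1);
        System.out.println(utf8.length); // 5, because 'é' takes two bytes in UTF-8

        byte[] back = utf8ToLatin1(utf8);
        System.out.println(Arrays.equals(latin1, back)); // true, round trip is lossless

        // The euro sign (U+20AC) is not part of ISO-8859-1.
        byte[] euro = "\u20AC".getBytes(StandardCharsets.UTF_8);
        byte[] lossy = utf8ToLatin1(euro);
        System.out.println((char) lossy[0]); // ?
    }
}
```

The round trip only succeeds here because every character in the sample text exists in ISO-8859-1. Once a character such as the euro sign is replaced, nothing in the output records what it originally was.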

You can exercise more control by using the lower-level Charset APIs. For example, you can raise an exception when an un-encodable character is found, or use a different character for replacement text.
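As a rough sketch of both options, the following uses java.nio.charset.CharsetEncoder with CodingErrorAction. It assumes Java SE 7 or later (the java.nio.charset package is probably not available on J2ME). The class name StrictLatin1 and its two helper methods are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictLatin1 {
    // Encode to ISO-8859-1 and throw if any character cannot be represented.
    public static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Encode to ISO-8859-1, writing the given byte in place of each
    // unrepresentable character instead of the default '?'.
    public static byte[] encodeWithReplacement(String text, byte replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { replacement });
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Copy the readable part of the buffer into a fresh array.
    private static byte[] toArray(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // U+20AC (euro sign) has no ISO-8859-1 encoding.
        try {
            encodeStrict("\u20AC5");
        } catch (CharacterCodingException e) {
            System.out.println("cannot encode: " + e);
        }

        byte[] replaced = encodeWithReplacement("\u20AC5", (byte) '_');
        System.out.println(new String(replaced, StandardCharsets.ISO_8859_1)); // _5
    }
}
```

Reporting is the safer choice when silent data loss would be a bug. Replacement suits display-only output, where a visible placeholder is acceptable.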
