How do I convert between ISO-8859-1 and UTF-8 in Java?

Question

Does anyone know how to convert a string from ISO-8859-1 to UTF-8 and back in Java?

I'm getting a string from the web and saving it in the RMS (J2ME). I want to preserve the special characters, and then read the string back from the RMS with the ISO-8859-1 encoding. How do I do this?

Recommended answer

In general, you can't do this without risking data loss. UTF-8 can encode any Unicode code point, while ISO-8859-1 covers only the first 256 (U+0000 to U+00FF). So transcoding from ISO-8859-1 to UTF-8 is no problem. Going backwards from UTF-8 to ISO-8859-1 is lossy: every character that ISO-8859-1 cannot represent is replaced by a substitute. On Java SE, String.getBytes typically substitutes "?" for such characters (the U+FFFD replacement character "�" appears when decoding malformed bytes, not when encoding).

To transcode text:

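// ISO-8859-1 bytes -> UTF-8 bytes (lossless)
byte[] latin1 = ...
byte[] utf8 = new String(latin1, "ISO-8859-1").getBytes("UTF-8");

// UTF-8 bytes -> ISO-8859-1 bytes (separate snippet; unmappable characters are replaced)
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");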
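
As a rough illustration of the two conversions, here is a minimal runnable sketch. It assumes Java SE 7 or later, because it uses StandardCharsets, which is not available on J2ME; the class name TranscodeDemo and its method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TranscodeDemo {
    // Re-encode ISO-8859-1 bytes as UTF-8. Every ISO-8859-1 byte maps to a
    // Unicode code point, so this direction loses nothing.
    static byte[] latin1ToUtf8(byte[] latin1) {
        return new String(latin1, StandardCharsets.ISO_8859_1)
                .getBytes(StandardCharsets.UTF_8);
    }

    // Re-encode UTF-8 bytes as ISO-8859-1. Characters above U+00FF cannot be
    // represented and are replaced with the encoder's default byte, '?'.
    static byte[] utf8ToLatin1(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café" encoded in ISO-8859-1: é is the single byte 0xE9.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};

        byte[] utf8 = latin1ToUtf8(latin1);
        System.out.println(utf8.length); // 5: é becomes the two bytes 0xC3 0xA9

        // Round trip back to ISO-8859-1 restores the original bytes.
        System.out.println(Arrays.equals(utf8ToLatin1(utf8), latin1)); // true

        // The euro sign (U+20AC) is not part of ISO-8859-1, so it is lost.
        byte[] euroUtf8 = "\u20AC".getBytes(StandardCharsets.UTF_8);
        byte[] lossy = utf8ToLatin1(euroUtf8);
        System.out.println((char) lossy[0]); // ?
    }
}
```

The round trip is only safe when the text started out as ISO-8859-1; text that contains anything outside that range comes back altered.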

You can exercise more control by using the lower-level Charset APIs. For example, you can raise an exception when an un-encodable character is found, or use a different character for replacement text.
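
One possible way to do both with java.nio.charset.CharsetEncoder is sketched below. It assumes Java SE 7 or later (not J2ME); the class StrictLatin1 and its method names are hypothetical helpers invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictLatin1 {
    // Encode to ISO-8859-1 and throw if any character cannot be represented.
    static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Encode to ISO-8859-1, substituting the given byte for every character
    // that cannot be represented (instead of the default '?').
    static byte[] encodeWithReplacement(String text, byte replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] {replacement});
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Copy the remaining bytes of the buffer into a right-sized array.
    private static byte[] toArray(ByteBuffer buffer) {
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] replaced = encodeWithReplacement("5\u20AC", (byte) '_');
        System.out.println(new String(replaced, StandardCharsets.ISO_8859_1)); // 5_

        try {
            encodeStrict("5\u20AC"); // the euro sign is not in ISO-8859-1
        } catch (CharacterCodingException e) {
            System.out.println("cannot encode: " + e);
        }
    }
}
```

Reporting errors is usually the safer choice when silent data loss would be worse than a failed conversion; replacement suits display-only text.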
