Java: convert ISO-8859-1 to UTF-8 with correct Unicode characters


Question

I have some ISO-8859-1 text that I have tried to convert to UTF-8, but I end up with some characters that are not mapped correctly. I have tried a plethora of the standard built-in Java charset conversions, which are pretty much all based on Charset.decode and the built-in CharsetDecoder.

This causes two problems:

  • Some characters look fine in ISO-8859-1 but come out garbled in Java, since I output UTF-8, as most Java apps do.
  • I cannot insert the text into MySQL even though it is set to UTF-8.

For MySQL I get the following exception (see link above):

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x9Esk\xC3\xA9...' for column 'b' at row 1

Is there a Java iconv, or a better character decoder/mapper than the one built into Java?

Recommended answer

Are you certain that you have ISO-8859-1? You might have some Windows-1252, which matches ISO-8859-1 except in the 0x80-0x9F range, where Windows-1252 assigns printable characters and ISO-8859-1 has control codes. That \x9E raises that suspicion with me.
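To make the suspicion concrete, here is a minimal sketch that decodes the single byte 0x9E under both charsets and prints the resulting UTF-8 bytes. The class name `SuspiciousByteDemo` and the helper `utf8Hex` are hypothetical names chosen for illustration; only `java.nio.charset` from the standard library is used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SuspiciousByteDemo {
    // Decode raw bytes with the given source charset, re-encode the
    // resulting String as UTF-8, and return the bytes as \xNN escapes.
    static String utf8Hex(byte[] raw, Charset source) {
        String decoded = new String(raw, source);
        StringBuilder hex = new StringBuilder();
        for (byte b : decoded.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("\\x%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x9E};
        // ISO-8859-1 maps 0x9E to the control code U+009E.
        System.out.println(utf8Hex(raw, StandardCharsets.ISO_8859_1));     // \xC2\x9E
        // Windows-1252 maps 0x9E to U+017E (z with caron).
        System.out.println(utf8Hex(raw, Charset.forName("windows-1252"))); // \xC5\xBE
    }
}
```

The ISO-8859-1 result, \xC2\x9E, is the same byte pair that opens the string in the MySQL error, which is consistent with Windows-1252 input having been decoded as ISO-8859-1.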

Try labeling your source as Windows-1252 (the Java charset name is windows-1252) and it should convert correctly.
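As a sketch of that suggestion, the conversion could look like the following. The class `Win1252ToUtf8` and the method `toUtf8` are hypothetical names. The sample bytes are an assumption: they are inferred from the error message on the premise that the source really is Windows-1252, and are not taken from the asker's data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Win1252ToUtf8 {
    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Interpret the input bytes as Windows-1252 text and re-encode as UTF-8.
    static byte[] toUtf8(byte[] windows1252Bytes) {
        String text = new String(windows1252Bytes, WINDOWS_1252);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed source bytes: 0x9E, 's', 'k', 0xE9.
        byte[] raw = {(byte) 0x9E, 's', 'k', (byte) 0xE9};
        byte[] utf8 = toUtf8(raw);

        // Compare against the expected text via Unicode escapes, so the
        // check does not depend on the console or source-file encoding.
        String roundTrip = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals("\u017Esk\u00E9")); // true
        System.out.println(utf8.length);                        // 6
    }
}
```

Going through a `String` keeps the two steps separate: the source charset only matters when decoding, and the target charset only when encoding. If the input is a stream or file rather than a byte array, an `InputStreamReader` constructed with the same charset plays the decoding role.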
