String encoding - Shift_JIS / UTF-8


Problem description

I get a string from a third-party library that is not correctly encoded. Unfortunately, I'm not allowed to change the library or use a different one.

The actual problem is that, in the library's result string, characters such as "è ò à ù ì ä ö ü, ..." come out as Shift_JIS (kanji) inside an otherwise UTF-8 string. This happens only when the character is directly attached to a word, not when it stands alone.

For example:

"Ö Just a simple test"

"ÖJust a simple test"

I tried the following without success:

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");

UPDATE 1:

That's the content of "resultString".

Note: the byte array shown has not been modified in any way (for example by getBytes("Shift_JIS")); it is simply resultString as bytes.

Do you have any ideas? Any help would be greatly appreciated. Thank you.

Solution

Well, very strange:

As

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");

didn't work for me, I tried the following:

String value = new String(resultString.getBytes("SHIFT-JIS"), "UTF-8");

This works like a charm. Maybe it was because of the underscore and the lower-case characters in "Shift_JIS".
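The re-encode-then-decode step can be tried in isolation with a small round trip. This is only a sketch under an assumption the question does not confirm: that the library's fault amounts to UTF-8 bytes being decoded as Shift_JIS. The class and method names (MojibakeRepair, garble, repair, sameCharset) are made up for illustration. The last helper checks whether two charset names resolve to the same charset on the running JVM, which may help test the guess about the underscore and letter case, since Charset.forName is documented as case-insensitive.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Charset.forName avoids the checked UnsupportedEncodingException
    // that String.getBytes(String) and new String(byte[], String) declare.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Assumed fault (hypothetical): UTF-8 bytes wrongly decoded as Shift_JIS.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), SJIS);
    }

    // Reverses it: recover the raw bytes via Shift_JIS, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(SJIS), StandardCharsets.UTF_8);
    }

    // True when both names resolve to equal Charset instances on this JVM.
    static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        String original = "\u00D6Just a simple test"; // "ÖJust a simple test"
        String garbled = garble(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original));
        System.out.println(sameCharset("Shift_JIS", "SHIFT-JIS"));
    }
}
```

The repair is lossless only when every byte sequence produced by the faulty decoding is valid Shift_JIS. A string where the accented character is followed by a space, for instance, may not survive the round trip, because the decoder substitutes a replacement character for malformed input.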
