What character encoding is this?

Problem description

I'm interfacing with an Oracle DB, which has some messed up encoding (ASCII7 according to the db properties, but actually encodes Korean characters).

When I get some of the Korean strings from the resultSet and look at the bytes, it turns out that they correspond exactly to this file (which I found by googling some of the byte sequences): http://211.115.85.9/files/raw3.txt

Kinda spooky, as it seems to be the ONLY thing on the internet that has anything about this particular encoding...

The file, when viewed with EditPlus3, shows me 3 columns.

The first column is an alphabetical listing of Korean characters. The second is the strange encoding I'm finding from looking at the Java strings passed from the Oracle DB. The third one is UTF8.

I'm trying to figure out what the middle column is encoded in. Can anyone point me in the right direction?

(I really don't want to have to actually read from this file every time I need to call a DB...)

Recommended answer

It is EUC-KR (or a similar) encoded data, interpreted as another 1-byte encoding (ISO-8859-1 or similar) and encoded using UTF-8.
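To make that chain concrete, here is a minimal sketch that reproduces the suspected corruption for a single character. The class and method names (MojibakeDemo, corrupt) are made up for illustration, and it assumes the JDK at hand ships the EUC-KR charset and that the wrong intermediate charset really is ISO-8859-1:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class MojibakeDemo {
    // Reproduces the suspected corruption path for one string.
    static byte[] corrupt(String original) {
        // Step 1: the text is stored as EUC-KR bytes.
        byte[] eucKr = original.getBytes(Charset.forName("EUC-KR"));
        // Step 2: something reads those bytes as ISO-8859-1 (assumption),
        // turning each byte into its own character.
        String misread = new String(eucKr, StandardCharsets.ISO_8859_1);
        // Step 3: the misread text is written out as UTF-8.
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        // U+AC00 is the first Hangul syllable, EUC-KR bytes b0 a1.
        for (byte b : corrupt("\uAC00")) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim()); // prints: c2 b0 c2 a1
    }
}
```

If the bytes in the middle column of the file match what this produces for the character in the first column, the diagnosis holds.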

In other words: it's ill-encoded data, but might be salvageable:

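// The bytes as they appear in the file's middle column (UTF-8 of U+00B0 U+00A1)
byte[] bytes = new byte[] { (byte) 0xc2, (byte) 0xb0, (byte) 0xc2, (byte) 0xa1 };
// Undo the UTF-8 layer
String str = new String(bytes, "UTF-8");
// Undo the wrong ISO-8859-1 interpretation to recover the raw bytes b0 a1
bytes = str.getBytes("ISO-8859-1");
// Decode the raw bytes with the charset they were really written in
str = new String(bytes, "EUC-KR");
System.out.println(str);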

This prints 가 on my system.
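Since the question mentions that the odd values show up in the Java strings themselves, the UTF-8 step is probably already undone by the time you hold a String, and only the last two steps are needed. A rough sketch of a helper for that case follows; MojibakeRepair and repair are hypothetical names, and it assumes the driver decoded the column as ISO-8859-1 (or 7-bit ASCII passed through unchanged) and that the real data is plain EUC-KR:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JDBC or any Oracle API.
final class MojibakeRepair {
    // Assumes the JDK in use provides the EUC-KR charset.
    private static final Charset EUC_KR = Charset.forName("EUC-KR");

    // Re-decodes a string whose EUC-KR bytes were misread as ISO-8859-1.
    static String repair(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        // Each char in U+0000..U+00FF maps back to exactly one raw byte.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, EUC_KR);
    }

    public static void main(String[] args) {
        // U+00B0 U+00A1 is what the bytes b0 a1 look like when read as ISO-8859-1.
        System.out.println(repair("\u00b0\u00a1"));
    }
}
```

You would call it on each affected value, for example repair(resultSet.getString("NAME")), where the column name is made up. Only apply it to strings known to be affected: getBytes with ISO-8859-1 replaces any character above U+00FF with '?', so text that is already correct Korean would be destroyed.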

I've found this PDF file which explains the problem (and how it happened) in more detail.
