Fixing Incorrect String Encoding From MySQL
Question
I'm reading strings from a MySQL database that isn't set up for Unicode.
Ruby gets the string as ä¸ƒå¤§æ´‹, but I know the correct version should be 七大洋. The "wrong" string is tagged as UTF-8 because Ruby doesn't know it is wrong. I've tried forcing every encoding onto the mangled string, but nothing works. I have a feeling I might be able to fix it by fiddling with the bits, but I don't know where to start.
I don't think any information has been lost, because the incorrect string actually has more bytes than the correct one. I also don't think Ruby is the culprit, because the strings look mangled when I view the table outside Ruby as well. So I'm hoping to undo the damage that MySQL has already done.
Recommended answer
You can use the following construction to revert the encoding:
"wrong_string".encode(Encoding::SOME_ENCODING).force_encoding('utf-8')
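To see why this construction can work, it may help to reproduce the damage and then reverse it. The sketch below assumes the UTF-8 bytes were misread as Windows-1252 somewhere along the way; that is an assumption for illustration, and the variable names are made up here.

```ruby
# Assumption: the original UTF-8 bytes were misinterpreted as Windows-1252.
correct = "七大洋"

# Reproduce the damage: relabel the UTF-8 bytes as Windows-1252, then
# transcode each of those "characters" into UTF-8.
mangled = correct.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)

# Undo it: transcode back to Windows-1252 to recover the original bytes,
# then relabel those bytes as UTF-8 without changing them.
restored = mangled.encode(Encoding::Windows_1252).force_encoding(Encoding::UTF_8)

puts mangled.bytesize > correct.bytesize # the mangled form carries more bytes
puts restored == correct
```

Note that `encode` changes the bytes while `force_encoding` only changes the label, which is why forcing encodings alone on the mangled string does not help.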
I tried all possible encodings to detect the right one:
# Try every constant under Encoding; conversions that raise are skipped.
Encoding.constants.each_with_object({}) do |encoding_name, result|
  value = "ä¸ƒå¤§æ´‹".encode(Encoding.const_get(encoding_name)).force_encoding('utf-8') rescue nil
  result[encoding_name] = value if value == "七大洋"
end.keys
#=> [:Windows_1252, :WINDOWS_1252, :CP1252, :Windows_1254, :WINDOWS_1254, :CP1254]
Thus, to convert your string to 七大洋, you can use any of the encodings listed above.
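If many rows need repairing, the same idea could be wrapped in a small guard so that strings which were never mangled are left alone. This is only a sketch: `fix_mojibake` and its `via:` parameter are hypothetical names, not part of Ruby or of the original answer, and Windows-1252 is assumed as the intermediate encoding.

```ruby
# Hypothetical helper: undo UTF-8 text that was misread as a single-byte
# encoding (Windows-1252 by default) and then stored as UTF-8.
def fix_mojibake(str, via: Encoding::Windows_1252)
  candidate = str.encode(via).force_encoding(Encoding::UTF_8)
  # Accept the result only if the recovered bytes form valid UTF-8.
  candidate.valid_encoding? ? candidate : str
rescue EncodingError
  # The string contains characters that do not exist in `via`, so it was
  # probably not mangled this way; return it unchanged.
  str
end

puts fix_mojibake("ä¸ƒå¤§æ´‹")
puts fix_mojibake("七大洋")
```

The validity check is a heuristic: a short, correctly stored string made only of accented Latin characters could in principle pass it, so it is worth testing on a copy of the data first.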