Opening UTF-8 files in Perl and double encoding
Question
I have a MySQL database in which every table uses COLLATE='utf8_general_ci'.
I connect to it with DBI using my $db = DBI->connect($cstring, $user, $password), without setting either of the following:
$db->{mysql_enable_utf8} = 1
$db->do(qq{SET NAMES 'utf8';} );
I then select a table and write it to a CSV file with Text::CSV. The output filehandle, Myfile, is set up like this:
binmode(Myfile, ":utf8")
I repeat this process for several tables, each written to its own file opened as shown above. Some of the files come out double-encoded, and the problem goes away only if I remove the binmode call for those specific files. The other files are fine and correctly encoded as UTF-8, but if I remove the binmode call for them, their UTF-8 encoding breaks. What could be causing this?
It is worth mentioning that I tried adding use utf8 to my script, and I also tried setting:
$db-> {mysql_enable_utf8} = 1
$db->do(qq{SET NAMES 'utf8';} );
But neither of these solved the problem.
Recommended answer
If I understand correctly, you see
Ã©Ã«Ã¨
where you expect
éëè
when using phpMyAdmin. This indicates that the data in your database is wrong (double-encoded). You'll need to go back and repopulate your database with the correct data.
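Double encoding typically arises when bytes that are already UTF-8 are mistaken for characters and encoded a second time. The following is a minimal sketch of that effect using the core Encode module; the variable names are illustrative and nothing here touches a database.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Three characters (e-acute, e-diaeresis, e-grave), written with escapes
# so the encoding of this source file does not matter.
my $text = "\x{E9}\x{EB}\x{E8}";

# Correct: encode once. Each character becomes two bytes.
my $once = encode('UTF-8', $text);

# Wrong: the six bytes are treated as six characters and encoded again.
my $twice = encode('UTF-8', $once);

# Print each string as hex bytes so the difference is visible on any terminal.
printf "once:  %s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $once;
printf "twice: %s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $twice;
```

The second line of output is twice as long as the first: every original character now occupies four bytes, which a UTF-8 viewer renders as two characters each.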
If you can't fix your database, it's most likely safe to just add the following:
utf8::decode($str); # Fix double-encoding
This attempts to decode the already-decoded data from the database. If the data was double-encoded, this fixes it. If the data wasn't double-encoded, the call fails silently and leaves the correct value in $str (assuming your strings aren't very, very weird).
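To make the two cases concrete, here is a small sketch that applies utf8::decode to one string as it might look after a single decode of double-encoded data, and to one string that was stored correctly. Both sample strings are made up for illustration.

```perl
use strict;
use warnings;

# What you might get back for double-encoded data after one decode:
# six characters instead of three.
my $double  = "\x{C3}\x{A9}\x{C3}\x{AB}\x{C3}\x{A8}";
# What you would get back for correctly stored data: three characters.
my $correct = "\x{E9}\x{EB}\x{E8}";

for my $str ($double, $correct) {
    my $before  = length($str);
    # utf8::decode modifies $str in place and returns false when the
    # string is not valid UTF-8, in which case $str is left unchanged.
    my $changed = utf8::decode($str);
    printf "%d -> %d characters, %s\n",
        $before, length($str), $changed ? 'repaired' : 'left alone';
}
```

A "weird" string in this sense is a correctly stored value whose characters happen to form a valid UTF-8 byte sequence; such a value would be altered by the call even though it was not broken.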
I recommend that you write a small tool that reads the data from the database, uses this trick to fix it, and then writes it back to the database correctly.
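One possible shape for such a tool is sketched below. It is not a definitive implementation: the helper names fix_double_encoding and repair_table, and the table and column arguments, are hypothetical, and $dbh is assumed to be a DBI handle connected with mysql_enable_utf8 enabled so that values are decoded on read and encoded on write.

```perl
use strict;
use warnings;

# Repair a single value. undef (SQL NULL) passes through untouched.
sub fix_double_encoding {
    my ($str) = @_;
    return $str unless defined $str;
    utf8::decode($str);    # leaves $str unchanged if it is not valid UTF-8
    return $str;
}

# Read every row, repair the text column, and write back only changed rows.
# $table, $id_col and $text_col are hypothetical names supplied by the caller
# and are interpolated as-is, so they must come from trusted input.
sub repair_table {
    my ($dbh, $table, $id_col, $text_col) = @_;

    my $rows   = $dbh->selectall_arrayref("SELECT $id_col, $text_col FROM $table");
    my $update = $dbh->prepare("UPDATE $table SET $text_col = ? WHERE $id_col = ?");

    my $fixed_count = 0;
    for my $row (@$rows) {
        my ($id, $old) = @$row;
        next unless defined $old;
        my $new = fix_double_encoding($old);
        next if $new eq $old;           # nothing to repair
        $update->execute($new, $id);
        $fixed_count++;
    }
    return $fixed_count;
}
```

A call might look like repair_table($dbh, 'my_table', 'id', 'name'), with those names replaced by your own. Returning the number of updated rows gives a quick check that the tool touched roughly as many rows as expected.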