在perl和双重编码上打开utf8文件 [英] opening utf8 files on perl and double encoding

查看:37
本文介绍了在perl和双重编码上打开utf8文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有 mysql 数据库,每个表都有 COLLATE='utf8_general_ci'.

I have mysql db which have COLLATE='utf8_general_ci' for every table.

我使用 dbi my $db = DBI->connect($cstring, $user, $password) 连接到表,没有

i connect to the tables with dbi my $db = DBI->connect($cstring, $user, $password) and without

$db->{mysql_enable_utf8} = 1
$db->do(qq{SET NAMES 'utf8';} );

然后选择表格并使用 Text::CSV to myFile 将其复制到 csv 文件,其中 myFile 被打开,如下所示:

Then select the table and copy it to the csv file using Text::CSV to myFile where myFile is opened like the the below :

binmode(Myfile, ":utf8")

我在不同的表上重复这个过程的问题,这些文件像上面一样打开,但在某些文件上我得到了双重编码,只有当我删除那些特定文件的 binmode 时,问题才解决,而其他文件很好并编码 utf8,如果我为它们删除 binmode,我会在 utf8 编码上遇到问题,这可能是什么问题?

The problem that i repeat this process on different tables with different files which opened like the above but on some files i get double encoding and only if i remove the binmode for those speicfic files the problem is solved while the other files are fine and encoded utf8 and if i remove the binmode for them i get a problem on the utf8 encdoing what could be the problem ?

值得一提的是,我尝试使用:在我的脚本中使用 utf8 并尝试使用

worth to mention i tried to use : use utf8 on my script and also tried to use

 $db-> {mysql_enable_utf8} = 1
    $db->do(qq{SET NAMES 'utf8';} );

但是问题没有解决.

推荐答案

如果我理解正确,你会看到

If I understand correctly, you see

éëè

你期望的地方

éëè

使用 phpMyAdmin 时.这表明您数据库中的数据是错误的(双重编码).您需要返回并使用正确的数据重新填充数据库.

when using phpMyAdmin. This indicates the data in your database is wrong (double-encoded). You'll need to go back and repopulate your database with the correct data.

如果您无法修复数据库,那么添加以下内容很可能是安全的:

If you can't fix your database, it's most likely safe to just add the following:

utf8::decode($str);  # Fix double-encoding

它将尝试从数据库中解码已解码的数据.如果数据是双重编码的,这将修复它.如果数据不是双重编码的,它会静默失败,在 $str 中留下正确的值(假设你的字符串不是很奇怪).

It will attempt to decode the already-decoded data from the database. If the data was double-encoded, this will fix it. If the data wasn't double-encoded, it will fail silently fail, leaving the correct value in $str (assuming your strings aren't very very weird).

我建议你编写一个小工具,从数据库中读取数据,使用这个技巧修复数据,然后将其正确放回数据库中.

I recommend that you write a small tool that reads the data from the database, uses this trick to fix the data, then puts it back in the database correctly.

这篇关于在perl和双重编码上打开utf8文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆