Convert MySQL data from Latin1 to UTF8


Question

This is a common question that has been asked many times. However, I still cannot get the right answer from Google.

In my web app there is a form for collecting data; the app and all of its data use UTF-8. However, the collation of the schema and the table was mistakenly set to latin1. Moreover, "SET NAMES UTF8" is issued on the connection.

Now some of the Chinese data always shows as question marks (?), no matter which conversion method I use. Querying the problem columns as binary also shows that the data is several bytes of 3f, meaning several '?' characters.

Can my data still be converted to UTF-8 and displayed correctly, or is it already lost?

[UPDATE]

This is not the same question as "How to convert an entire MySQL database characterset and collation to UTF-8?", because I have not only converted the entire database and table to UTF-8 but also dumped the data with mysqldump and re-imported it into the database. However, none of that works.

[UPDATE 2]

The problem is not just about converting the table charset; it also requires understanding the UTF-8 and Latin1 encoding systems.

The basics are:

Latin1 uses only 1 byte (8 bits) to store each character.

UTF-8 uses a variable-length encoding, which means a character MAY need more than 1 byte.

Since UTF-8 needs at least 1 bit of each byte for identification, only 7 bits of a single-byte character are available for data, compared with 8 in Latin1. So, if a character needs only 7 bits (plain ASCII), it is stored identically in Latin1 and in UTF-8. However, if a character needs more than 7 bits, its UTF-8 and Latin1 representations no longer match, and it can be broken.
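The byte counts described above can be checked with a small Python sketch. The helper name `byte_lengths` is made up for illustration, and Python's `latin-1` codec (ISO-8859-1) is assumed to be close enough to MySQL's latin1 for this purpose.

```python
# Compare how many bytes one character needs in Latin1 and in UTF-8.
# byte_lengths is a hypothetical helper, not part of any MySQL tooling.
def byte_lengths(ch: str) -> tuple:
    """Return (latin1_len, utf8_len); latin1_len is None if Latin1 cannot hold ch."""
    try:
        latin1_len = len(ch.encode("latin-1"))
    except UnicodeEncodeError:
        latin1_len = None
    return latin1_len, len(ch.encode("utf-8"))


for ch in ("A", "é", "中"):
    # Print the character, its byte lengths, and its UTF-8 bytes in hex.
    print(repr(ch), byte_lengths(ch), ch.encode("utf-8").hex())
```

Plain ASCII takes one byte in both encodings, an accented Latin letter takes one byte in Latin1 but two in UTF-8, and a Chinese character takes three bytes in UTF-8 and has no Latin1 form at all.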

So characters such as Chinese and Japanese, which need multiple bytes in UTF-8 (typically 3), are damaged during the storing process, because they fall outside the range that Latin1 can represent.

That's why it still shows '?' no matter how I change the charset of the database and the table: in Latin1, every character outside its range is stored as '?', which is 3F in hex.
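As a rough illustration of that replacement, Python's `errors="replace"` encoder behaves in a similar way: anything the target encoding cannot hold becomes '?'. This only mimics the effect and is not MySQL's actual conversion code; the function name `store_as_latin1` is an assumption for the example.

```python
# Mimic a lossy conversion into a Latin1 column (illustrative only).
def store_as_latin1(text: str) -> bytes:
    # errors="replace" substitutes '?' (0x3F) for every character
    # that Latin1 cannot represent.
    return text.encode("latin-1", errors="replace")


stored = store_as_latin1("中文 ok")
print(stored.hex())              # 3f3f206f6b
print(stored.decode("latin-1"))  # ?? ok
```

Once the bytes are 3F, nothing in them records which character was there originally, so no later charset change can bring the text back.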

Recommended answer

Just change the character set of the entire database:

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;

And of course you can do it for a single table as well.

See the MySQL documentation for further details.

Otherwise, if your data is already stored as "?" marks, the reality is that it is damaged.
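To tell damaged values from ones that might still be usable, the hex dump of a column (for example from MySQL's `HEX()` function) can be inspected. The sketch below is a hypothetical helper named `classify`, not an official tool, and its labels are only a rough guide.

```python
# Classify a hex dump of one stored value (hypothetical helper).
def classify(hex_value: str) -> str:
    raw = bytes.fromhex(hex_value)
    if raw and all(b == 0x3F for b in raw):
        # Nothing but '?' bytes: the original characters were replaced.
        return "lost"
    if all(b < 0x80 for b in raw):
        # Only 7-bit bytes: plain ASCII, identical in Latin1 and UTF-8.
        return "ascii"
    try:
        raw.decode("utf-8")
        # High bytes that form valid UTF-8: the text itself is intact.
        return "utf8"
    except UnicodeDecodeError:
        return "other"


print(classify("3F3F"))          # lost
print(classify("E4B8ADE69687"))  # utf8 (the bytes of "中文")
```

A value made up only of 3F bytes falls in the damaged case described above; a value that still holds valid UTF-8 bytes has a chance of being displayed correctly once the character sets are fixed.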
