Is there a way to convert all existing table data to UTF8 collation?

Question

I am assisting in a database upgrade from MySQL 4 to MySQL 5.5. My client's application server has also been upgraded from JDK 5 to JDK 7. The application runs, but it throws lots of exceptions when executing database operations.

I discovered that the upgraded database uses a mixture of latin1 general, latin1 Swedish, and utf8 general collations at the table and/or column level, so most JOIN queries fail.

There are hundreds of tables and thousands of columns, so converting all of them manually would be very difficult.

Is there a more convenient way to convert all tables and all columns to the same collation?

Thanks.

Example SQLException message showing that JOIN queries failed:

"Illegal mix of collations (latin1_general_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='"

Recommended answer

A mixture of character sets shouldn't cause queries to fail, as MySQL should convert between character sets as required.

However, as documented under ALTER TABLE Syntax:

To change the table default character set and all character columns (CHAR, VARCHAR, TEXT) to a new character set, use a statement like this:

ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;
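With hundreds of tables, one way to avoid typing this statement by hand is to let MySQL generate it from information_schema. The query below is only a sketch: the schema name `mydb` is a placeholder, and utf8_general_ci is assumed as the target collation because it is the one that appears in the error message. It does not change anything itself; it only prints one ALTER TABLE statement per base table.

```sql
-- Assumption: the schema is called `mydb` (hypothetical name) and the
-- target is utf8 / utf8_general_ci. Adjust both before use.
SELECT CONCAT(
         'ALTER TABLE `', table_schema, '`.`', table_name,
         '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
       ) AS stmt
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_type = 'BASE TABLE'   -- skip views
ORDER BY table_name;
```

The output can be saved to a file, reviewed, and then run as a script, ideally against a restored backup first.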

For a column that has a data type of VARCHAR or one of the TEXT types, CONVERT TO CHARACTER SET will change the data type as necessary to ensure that the new column is long enough to store as many characters as the original column. For example, a TEXT column has two length bytes, which store the byte-length of values in the column, up to a maximum of 65,535. For a latin1 TEXT column, each character requires a single byte, so the column can store up to 65,535 characters. If the column is converted to utf8, each character might require up to three bytes, for a maximum possible length of 3 × 65,535 = 196,605 bytes. That length will not fit in a TEXT column's length bytes, so MySQL will convert the data type to MEDIUMTEXT, which is the smallest string type for which the length bytes can record a value of 196,605. Similarly, a VARCHAR column might be converted to MEDIUMTEXT.

To avoid data type changes of the type just described, do not use CONVERT TO CHARACTER SET. Instead, use MODIFY to change individual columns. For example:

ALTER TABLE t MODIFY latin1_text_col TEXT CHARACTER SET utf8;
ALTER TABLE t MODIFY latin1_varchar_col VARCHAR(M) CHARACTER SET utf8;
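To find out which columns would need an individual MODIFY, the column metadata can be listed from information_schema. This is a sketch under the same assumptions as before: `mydb` is a placeholder schema name and utf8_general_ci is the assumed target. The column_type value shows the existing definition (for example varchar(50)), which has to be repeated in the MODIFY clause together with any NULL / DEFAULT attributes.

```sql
-- Assumption: schema `mydb` (hypothetical) and target collation
-- utf8_general_ci. Lists every character column that still differs.
SELECT table_name,
       column_name,
       column_type,
       character_set_name,
       collation_name
FROM information_schema.columns
WHERE table_schema = 'mydb'
  AND collation_name IS NOT NULL          -- character columns only
  AND collation_name <> 'utf8_general_ci'
ORDER BY table_name, ordinal_position;
```

Running the same query after the conversion should return no rows if every column ended up in the target collation.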

If you specify CONVERT TO CHARACTER SET binary, the CHAR, VARCHAR, and TEXT columns are converted to their corresponding binary string types (BINARY, VARBINARY, BLOB). This means that the columns no longer will have a character set and a subsequent CONVERT TO operation will not apply to them.

If charset_name is DEFAULT, the database character set is used.
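As a rough illustration of the DEFAULT form, the database default can be inspected and, if desired, changed before converting. The schema name `mydb` is a placeholder and utf8 / utf8_general_ci is an assumed target; neither comes from the question.

```sql
-- Assumption: schema `mydb` is hypothetical.
USE mydb;

-- Inspect the current database default character set and collation.
SELECT default_character_set_name, default_collation_name
FROM information_schema.schemata
WHERE schema_name = 'mydb';

-- Make utf8 the database default so that new tables inherit it.
ALTER DATABASE mydb CHARACTER SET utf8 COLLATE utf8_general_ci;

-- DEFAULT now refers to the database character set set above.
ALTER TABLE tbl_name CONVERT TO CHARACTER SET DEFAULT;
```

Changing the database default does not touch existing tables on its own; they still need a CONVERT TO or MODIFY as described.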

Warning

The CONVERT TO operation converts column values between the character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:

ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;

The reason this works is that there is no conversion when you convert to or from BLOB columns.
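Before choosing between CONVERT TO and the BLOB round trip, it can help to preview how the stored bytes would read if they were interpreted as utf8. The query below is a sketch that reuses the t1 and c1 names from the example; it assumes c1 is currently declared as latin1 and only reads data.

```sql
-- Assumption: t1.c1 is a latin1 column that may actually hold utf8 bytes.
-- CAST(... AS BINARY) drops the character set label without converting,
-- and CONVERT(... USING utf8) then reads the same bytes as utf8.
SELECT c1 AS as_stored,
       CONVERT(CAST(c1 AS BINARY) USING utf8) AS reinterpreted_as_utf8
FROM t1
LIMIT 20;
```

If the second column looks correct while the first shows garbled multi-character sequences, the column probably holds mislabeled utf8 and the BLOB approach applies. If the first column already looks correct, a normal conversion is likely the right choice.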
