Is it safe to update tables from utf8 to utf8mb4 in MySQL?

Question

I am aware that similar questions have been asked before, but we need a more definitive answer.

Is it safe in all cases to convert MySQL tables encoded in utf8 to utf8mb4? More specifically, is it safe even for VARCHAR columns that hold strings generated, for example, like this in Java:

new BigInteger(130, random).toString(32)

As we understand it, utf8mb4 is a superset of utf8, so we assume everything should be fine, but we would welcome input from more experienced MySQL users.

Recommended answer

How the data was originally inserted into MySQL is irrelevant. Suppose you used the entire character set of utf8, that is, the BMP characters.

utf8mb4 is a superset of utf8mb3 (alias utf8), as documented in the MySQL Reference Manual, section 10.9.7, "Converting Between 3-Byte and 4-Byte Unicode Character Sets":

One advantage of converting from utf8mb3 to utf8mb4 is that this enables applications to use supplementary characters. One tradeoff is that this may increase data storage space requirements.

In terms of table content, conversion from utf8mb3 to utf8mb4 presents no problems:

  • For a BMP character, utf8mb4 and utf8mb3 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8mb4 requires four bytes to store it, whereas utf8mb3 cannot store the character at all. When converting utf8mb3 columns to utf8mb4, you need not worry about converting supplementary characters because there will be none.
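
To make the BMP point concrete for the strings in the question, here is a minimal Java sketch (not part of the original answer). It checks whether a string contains only BMP code points and how many UTF-8 bytes its widest character needs. The class and method names are hypothetical, SecureRandom stands in for the question's unspecified `random`, and the input is assumed to be a well-formed string without unpaired surrogates.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class Utf8mb3Check {
    // True if no code point lies outside the BMP (U+0000..U+FFFF),
    // i.e. the string contains nothing that utf8mb3 would be unable to store.
    public static boolean isBmpOnly(String s) {
        return s.codePoints().noneMatch(Character::isSupplementaryCodePoint);
    }

    // Largest number of UTF-8 bytes needed by any single code point in the string.
    public static int maxBytesPerCodePoint(String s) {
        return s.codePoints()
                .map(cp -> new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8).length)
                .max()
                .orElse(0);
    }

    public static void main(String[] args) {
        // Same expression as in the question; SecureRandom is an assumption.
        String token = new BigInteger(130, new SecureRandom()).toString(32);
        System.out.println(token + " BMP only: " + isBmpOnly(token));
        System.out.println("max bytes per character: " + maxBytesPerCodePoint(token));

        String emoji = "\uD83D\uDE00"; // U+1F600, a supplementary character
        System.out.println("emoji BMP only: " + isBmpOnly(emoji));
    }
}
```

BigInteger.toString(32) emits only the digits 0-9 and the letters a-v, so such tokens are single-byte ASCII and are stored identically under utf8mb3 and utf8mb4.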

In terms of table structure, these are the primary potential incompatibilities:

  • For the variable-length character data types (VARCHAR and the TEXT types), the maximum permitted length in characters is less for utf8mb4 columns than for utf8mb3 columns.

  • For all character data types (CHAR, VARCHAR, and the TEXT types), the maximum number of characters that can be indexed is less for utf8mb4 columns than for utf8mb3 columns.

Consequently, to convert tables from utf8mb3 to utf8mb4, it may be necessary to change some column or index definitions.
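
The reduced limits follow from simple arithmetic: the byte budget is fixed, and MySQL reserves up to 3 bytes per character for utf8mb3 but up to 4 for utf8mb4. The Java sketch below is only an illustration with hypothetical names. The 767-byte and 3072-byte figures are assumed InnoDB index key prefix limits, which depend on the MySQL version and row format, so check the limits that apply to your server.

```java
public class CharLimitSketch {
    // Number of whole characters that fit in byteLimit when every character
    // may need up to maxBytesPerChar bytes.
    public static int maxChars(int byteLimit, int maxBytesPerChar) {
        return byteLimit / maxBytesPerChar;
    }

    public static void main(String[] args) {
        // Assumed InnoDB index key prefix limits in bytes (version and row-format dependent).
        int[] prefixLimits = {767, 3072};
        for (int limit : prefixLimits) {
            System.out.printf("%d bytes: utf8mb3 = %d chars, utf8mb4 = %d chars%n",
                    limit, maxChars(limit, 3), maxChars(limit, 4));
        }
    }
}
```

Under a 767-byte limit, for instance, a fully indexed VARCHAR(255) column fits in utf8mb3 but not in utf8mb4, where at most 191 characters can be indexed. That is why a shorter column or a prefix index may be needed.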

Personally, I had some issues with indexes on relatively long text columns, where the maximum index size was reached. It was a search index, not a unique index, so the workaround was to index fewer characters. See also this answer.

Of course, I assume you will keep the corresponding collation (for example, utf8_general_ci becomes utf8mb4_general_ci). If you change to a different collation, other issues apply.
