Why is table CHARSET set to utf8mb4 and COLLATION to utf8mb4_unicode_520_ci?
Problem description
I've recently noticed that, whenever I start a new WordPress project, my tables' collation automatically changes from utf8_unicode_ci (which I select when I create a new DB from phpMyAdmin) to utf8mb4_unicode_520_ci.
Also, I've noticed in phpMyAdmin under "General Settings" that the server connection collation defaults to utf8mb4_unicode_520_ci.
I'm running MySQL Server 5.7.17 and phpMyAdmin 4.6.6 on Ubuntu 17.04.
My questions are as follows:
1. Why is this happening?
2. If possible, how do I prevent this? Because of utf8mb4, I've experienced problems when migrating WP sites to an older MySQL server which does not support it.
3. Is point 2 advisable? Are there any benefits in using charset utf8mb4 over utf8, and collation utf8mb4_unicode_520_ci over utf8_unicode_ci?
Recommended answer
In the past, there was only utf8; now utf8mb4 is the default character set (as of MySQL 8.0).
In the past, _general_ci was the default collation; then _unicode_ci (Unicode 4.0) was better, then _unicode_520_ci (Unicode 5.2.0). In MySQL 8.0, the default is _0900_ai_ci (Unicode 9.0).
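To see which of these collations a particular server offers, and which one it treats as the default for utf8mb4, a check along the following lines may help. This is only a sketch: the result depends on the server version, and on 5.7 the 0900 family should simply be missing from the output.

```sql
-- List the utf8mb4 collations discussed above that this server knows about.
-- IS_DEFAULT is 'Yes' for the collation the server uses by default for utf8mb4.
SELECT COLLATION_NAME, IS_DEFAULT
FROM information_schema.COLLATIONS
WHERE CHARACTER_SET_NAME = 'utf8mb4'
  AND COLLATION_NAME IN ('utf8mb4_general_ci',
                         'utf8mb4_unicode_ci',
                         'utf8mb4_unicode_520_ci',
                         'utf8mb4_0900_ai_ci')
ORDER BY COLLATION_NAME;
```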
Meanwhile, the road is full of potholes left by MySQL's past mistakes, and the WP designers are driving a big tank that does not notice the potholes.
MySQL 5.6 was a big pothole that swallowed up many a WP user because of the 767-byte limit on index entries, together with WP indexes on overly long VARCHAR(255) columns and the possibility of using utf8mb4 (at up to 4 bytes per character, such a column can need 1020 bytes). You are well past it by having 5.7.17. (Your future move to 8.0 will be less bumpy.)
That is, newly created databases/tables/columns on 5.7.7+ should not experience the 767 problem, but things migrated from older versions (5.5.3+) may have issues, especially if something causes you to change to utf8mb4.
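The arithmetic behind the 767 problem can be sketched in a few lines of SQL. The sketch assumes a limit of 767 bytes per indexed column (as on the older InnoDB setups described above) and a worst case of 3 bytes per character for utf8 versus 4 for utf8mb4. The table and index names are hypothetical.

```sql
-- How many characters fit into a 767-byte index entry?
SELECT 767 DIV 3 AS max_utf8_chars,     -- 255: an index on VARCHAR(255) just fits
       767 DIV 4 AS max_utf8mb4_chars;  -- 191: an index on VARCHAR(255) no longer fits

-- One common workaround: index only the first 191 characters of the column.
CREATE TABLE demo_options (
  option_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  option_name VARCHAR(255)    NOT NULL,
  PRIMARY KEY (option_id),
  KEY option_name_prefix (option_name(191))
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8mb4
  COLLATE = utf8mb4_unicode_520_ci;
```

A prefix index only covers the first 191 characters of each value, so it is a compromise rather than a full fix.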
What to do? I'll probably run out of space trying to spell out all the options. So please provide the history of the data, the upgrade path (if any), the current settings, the ROW_FORMAT of the tables, the CHARACTER SET and COLLATION of the columns, and the output of SHOW VARIABLES LIKE 'char%';
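Most of the information requested above can be collected with a handful of queries. The following is a sketch that assumes the WordPress database is the currently selected one (so that DATABASE() returns its name); it only reads metadata and changes nothing.

```sql
-- Server version and the character set / collation settings.
SELECT VERSION();
SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collation%';

-- ROW_FORMAT and default collation of each table in the current database.
SELECT TABLE_NAME, ENGINE, ROW_FORMAT, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- CHARACTER SET and COLLATION of each text column.
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND CHARACTER_SET_NAME IS NOT NULL
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```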
Where should you be? For 5.7.7+, use utf8mb4 and utf8mb4_unicode_520_ci wherever practical. That charset gives you Emoji and all of Chinese (utf8 is limited to 3-byte characters, so it does not). That collation is the best available in 5.7, although you might be hard pressed to notice where it matters.
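If you decide to move existing data to that combination, the statements would look roughly like this. The database name wp_demo and the table name wp_posts are placeholders, not taken from the question; take a backup first, and expect the conversion to fail on tables that still hit the 767 limit.

```sql
-- Change the default for tables created in this database from now on.
-- Existing tables are not touched by this statement.
ALTER DATABASE wp_demo
  CHARACTER SET = utf8mb4
  COLLATE = utf8mb4_unicode_520_ci;

-- Convert one existing table, including the data in its text columns.
ALTER TABLE wp_posts
  CONVERT TO CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_520_ci;
```

The ALTER TABLE statement has to be repeated for each table that should be converted.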
Note: the first part of a collation name is the only character set that it works with. That is, utf8_unicode_ci does not work with utf8mb4.
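The pairing rule can be checked against the server's own metadata. In this sketch the table name demo_pairing is hypothetical; the first query lists every collation that may be combined with utf8mb4, and all of them should carry the utf8mb4_ prefix.

```sql
-- Which collations can be used together with utf8mb4?
SELECT COLLATION_NAME
FROM information_schema.COLLATION_CHARACTER_SET_APPLICABILITY
WHERE CHARACTER_SET_NAME = 'utf8mb4'
ORDER BY COLLATION_NAME;

-- A matching pair is accepted.
CREATE TABLE demo_pairing (
  title VARCHAR(100)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

-- A mismatched pair, such as CHARACTER SET utf8mb4 with
-- COLLATE utf8_unicode_ci, should be rejected by the server.
```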