UTF-8: General? Bin? Unicode?

Views: 55 Published: 2020/5/14 19:23:40 Tags: mysql utf-8 collation

Question

I'm trying to figure out which collation I should be using for various types of data. 100% of the content I will be storing is user-submitted.

My understanding is that I should be using UTF-8 General CI (case-insensitive) instead of UTF-8 Binary. However, I can't find a clear distinction between UTF-8 General CI and UTF-8 Unicode CI.

Should I be storing user-submitted content in UTF-8 General CI or UTF-8 Unicode CI columns?
What type of data would UTF-8 Binary be applicable to?

Recommended answer

In general, utf8_general_ci is faster than utf8_unicode_ci, but less correct.

Here is the difference:

For any Unicode character set, operations performed using the _general_ci collation are faster than those for the _unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. The reason for this is that utf8_unicode_ci supports mappings such as expansions; that is, when one character compares as equal to combinations of other characters. For example, in German and some other languages "ß" is equal to "ss". utf8_unicode_ci also supports contractions and ignorable characters. utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.

Quoted from: http://dev.mysql.com/doc/refman/5.0/zh-CN/charset-unicode-sets.html
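To see the expansion behavior described in the quote, the comparisons can be tried directly in a MySQL client. This is only a minimal sketch: it assumes the client sends UTF-8 and that the server still provides the utf8 (utf8mb3) collations named above. The column aliases are made up for illustration.

```sql
-- Assumption: UTF-8 client connection; utf8 collations available on the server.
SET NAMES utf8;

SELECT
  'ß' = 'ss' COLLATE utf8_unicode_ci AS unicode_ci_ss,  -- 1: expansion, ß compares equal to ss
  'ß' = 'ss' COLLATE utf8_general_ci AS general_ci_ss,  -- 0: general_ci has no expansions
  'ß' = 's'  COLLATE utf8_general_ci AS general_ci_s;   -- 1: one-to-one comparison treats ß like s
```

The last column is the "less correct" part in practice: with utf8_general_ci, "ß" matches a single "s" rather than "ss".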

For a more detailed explanation, please read the following post from the MySQL forums: http://forums.mysql.com/read.php?103,187048,188748

As for utf8_bin: both utf8_general_ci and utf8_unicode_ci perform case-insensitive comparisons. In contrast, utf8_bin is case-sensitive (among other differences), because it compares the binary values of the characters.
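The case-sensitivity difference can be checked the same way. The sketch below assumes a UTF-8 client connection and a server that still provides the utf8 collations. The table and column names (collation_demo, display_name, exact_token) are hypothetical and only show that a collation can be chosen per column.

```sql
-- Assumption: UTF-8 client connection; utf8 collations available on the server.
SET NAMES utf8;

SELECT
  'abc' = 'ABC' COLLATE utf8_general_ci AS general_ci,  -- 1: case-insensitive
  'abc' = 'ABC' COLLATE utf8_unicode_ci AS unicode_ci,  -- 1: case-insensitive
  'abc' = 'ABC' COLLATE utf8_bin        AS bin;         -- 0: compares binary values

-- Hypothetical table: one case-insensitive column, one binary column.
CREATE TEMPORARY TABLE collation_demo (
  display_name VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_unicode_ci,
  exact_token  VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_bin
);

INSERT INTO collation_demo (display_name, exact_token) VALUES ('Alice', 'Alice');

SELECT COUNT(*) FROM collation_demo WHERE display_name = 'alice';  -- 1: matches regardless of case
SELECT COUNT(*) FROM collation_demo WHERE exact_token  = 'alice';  -- 0: 'A' and 'a' differ in binary
```

The column's own collation decides the comparison here, so the same literal matches in one column and not in the other.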
