MySQL WHERE `character` = 'a' is matching a, A, Ã, etc. Why?

Problem description

I have the following query in MySQL:

SELECT id FROM unicode WHERE `character` = 'a'

The table unicode contains each Unicode character along with an ID (its integer encoding value). Since the collation of the table is set to utf8_unicode_ci, I would have expected the above query to return only 97 (the letter 'a'). Instead, it returns 119 rows containing the IDs of many 'a'-like letters:

a A Ã ...

It seems to be ignoring both case and the multi-byte nature of the characters.

Any ideas?

Recommended answer

MySQL implements the xxx_unicode_ci collations according to the Unicode Collation Algorithm (UCA) described at http://www.unicode.org/reports/tr10/. The collation uses the version-4.0.0 UCA weight keys: http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt.

The full collation chart makes clear that, in this collation, most variations of a base letter are equivalent irrespective of their lettercase or accent/decoration.
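
The equivalence can be checked directly on string literals. The following is a minimal sketch, assuming a server where the utf8 character set and the utf8_unicode_ci and utf8_bin collations are available (newer MySQL versions list them under utf8mb3 names) and a connection that sends UTF-8; the column aliases are made up for illustration:

```sql
-- The _utf8 introducer and the COLLATE clause fix the character set and
-- collation used for each comparison, independent of the table.
SELECT
    _utf8'a' = _utf8'A' COLLATE utf8_unicode_ci AS a_vs_upper_a,      -- differs only in case
    _utf8'a' = _utf8'Ã' COLLATE utf8_unicode_ci AS a_vs_a_tilde,      -- differs in case and accent
    _utf8'a' = _utf8'A' COLLATE utf8_bin        AS a_vs_upper_a_bin;  -- binary comparison
```

Under utf8_unicode_ci the first two columns should come back as 1, consistent with the rows matched in the question, while the utf8_bin comparison should come back as 0.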

If you want to only match exact letters, you should use a binary collation such as utf8_bin.
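
One way to apply this without changing the table is to override the collation for a single comparison. This is a sketch rather than the only option, and it assumes the `character` column uses the utf8 character set, as the table collation in the question suggests:

```sql
-- Apply a binary collation to the column for this comparison only,
-- so that only the exact character 'a' matches.
SELECT id
FROM unicode
WHERE `character` COLLATE utf8_bin = 'a';
```

Overriding the collation inside the WHERE clause may keep MySQL from using an index on the column. If exact matching is always wanted, the column itself could instead be redefined with a binary collation via ALTER TABLE; the column definition is not shown in the question, so that statement is left out here.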
