Equals (=) vs. LIKE

Question

When using SQL, are there any benefits of using = in a WHERE clause instead of LIKE?

Without any special operators, LIKE and = are the same, right?

Recommended Answer

Different Operators

LIKE and = are different operators. Most answers here focus on wildcard support, which is not the only difference between these operators!

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

LIKE is a string operator that compares character by character.

To complicate matters, both operators use a collation, which can have important effects on the result of the comparison.

Let us first identify an example where these operators produce obviously different results. Allow me to quote from the MySQL manual:

Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+

Please note that this page of the MySQL manual is called String Comparison Functions, and = is not discussed there, which implies that = is not strictly a string comparison function.
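
The MySQL session above needs a server that has the latin1_german2_ci collation. As a rough, self-contained illustration that = and LIKE can disagree even when the pattern contains no wildcards, the sketch below uses Python's built-in sqlite3 module instead. This is a substitution, not the setup used in this answer: SQLite's rules differ from MySQL's (its LIKE folds ASCII case by default and does not consult the collation, while = uses the BINARY collation unless told otherwise), and the helper name scalar is made up for this example.

```python
import sqlite3


def scalar(sql: str):
    """Run one SELECT on a throwaway in-memory database and return its single value."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchone()[0]
    finally:
        conn.close()


# No wildcards anywhere, yet the two operators disagree (SQLite semantics, not MySQL's).
eq_case = scalar("SELECT 'abc' = 'ABC'")       # = under the default BINARY collation
like_case = scalar("SELECT 'abc' LIKE 'ABC'")  # LIKE folds ASCII case by default

# RTRIM is a built-in SQLite collation that makes = ignore trailing spaces.
eq_trailing = scalar("SELECT 'abc' = 'abc  ' COLLATE RTRIM")
like_trailing = scalar("SELECT 'abc' LIKE 'abc  '")

print(eq_case, like_case, eq_trailing, like_trailing)
```

With a stock SQLite build the two operators give opposite answers in both pairs. The direction of the disagreement is specific to SQLite; the point is only that = and LIKE follow separate rules.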

The SQL Standard § 8.2 describes how = compares strings:

The comparison of two character strings is determined as follows:

a) If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD attribute, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a <space>.

b) The result of the comparison of X and Y is given by the collating sequence CS.

c) Depending on the collating sequence, two strings may compare as equal even if they are of different lengths or contain different sequences of characters. When the operations MAX, MIN, DISTINCT, references to a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer to character strings, the specific value selected by these operations from a set of such equal values is implementation-dependent.

(Emphasis added.)
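
Rule a) can be sketched in a few lines of Python. This is only a model under stated assumptions: the collating sequence CS is taken to be plain code-point order, and the function name compare_strings and its pad_space flag are invented for illustration. They do not correspond to any real database API.

```python
def compare_strings(x: str, y: str, pad_space: bool = True) -> int:
    """Return -1, 0 or 1, loosely following rule a) of the quoted text.

    Assumption: the collating sequence CS is plain code-point order.
    """
    if pad_space:
        # PAD SPACE: extend the shorter string with spaces before comparing.
        width = max(len(x), len(y))
        x, y = x.ljust(width), y.ljust(width)
    # NO PAD: the pad character collates below everything, so a proper prefix
    # sorts first. Python's own string ordering already behaves that way.
    return (x > y) - (x < y)


print(compare_strings("abc", "abc  "))                   # padded: equal
print(compare_strings("abc", "abc  ", pad_space=False))  # NO PAD: shorter sorts first
```

Under the padded comparison the two strings are equal although their lengths differ, which is the situation item c) describes.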

What does this mean? It means that when comparing strings, the = operator is just a thin wrapper around the current collation. A collation is a library that has various rules for comparing strings. Here is an example of a binary collation from MySQL:

static int my_strnncoll_binary(const CHARSET_INFO *cs __attribute__((unused)),
                               const uchar *s, size_t slen,
                               const uchar *t, size_t tlen,
                               my_bool t_is_prefix)
{
  size_t len= MY_MIN(slen,tlen);
  int cmp= memcmp(s,t,len);
  return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}

This particular collation happens to compare byte-by-byte (which is why it's called "binary": it doesn't give any special meaning to strings). Other collations may provide more advanced comparisons.
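
For readers who do not read C, here is a rough Python transliteration of the function above. It is a sketch, not MySQL code: the name strnncoll_binary is made up, and only the sign of the result is meant to match memcmp, not its exact magnitude.

```python
def strnncoll_binary(s: bytes, t: bytes, t_is_prefix: bool = False) -> int:
    """Compare two byte strings the way the C function above does.

    Negative means s sorts first, zero means equal, positive means t sorts first.
    """
    n = min(len(s), len(t))
    head_s, head_t = s[:n], t[:n]
    # bytes objects compare lexicographically by unsigned byte value, like memcmp.
    cmp = (head_s > head_t) - (head_s < head_t)
    if cmp:
        return cmp
    # Common prefix is identical: decide by length, unless t only has to be a prefix of s.
    return (n if t_is_prefix else len(s)) - len(t)


print(strnncoll_binary("ä".encode("latin-1"), b"ae"))  # positive: the bytes differ
print(strnncoll_binary(b"abcd", b"abc", t_is_prefix=True))
```

Nothing here knows about characters, case or accents, so under a binary collation 'ä' and 'ae' can never compare as equal.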

For example, MySQL has a UTF-8 collation that supports case-insensitive comparisons. The code is too long to paste here, but it is worth reading the body of my_strnncollsp_utf8mb4() in the MySQL source. This collation can process multiple bytes at a time and it can apply various transforms (such as case-insensitive comparison). The = operator is completely abstracted from the vagaries of the collation.

The SQL Standard § 8.5 describes how LIKE compares strings:

The <predicate>

M LIKE P

is true if there exists a partitioning of M into substrings such that:

i) A substring of M is a sequence of 0 or more contiguous <character representation>s of M and each <character representation> of M is part of exactly one substring.

ii) If the i-th substring specifier of P is an arbitrary character specifier, the i-th substring of M is any single <character representation>.

iii) If the i-th substring specifier of P is an arbitrary string specifier, then the i-th substring of M is any sequence of 0 or more <character representation>s.

iv) If the i-th substring specifier of P is neither an arbitrary character specifier nor an arbitrary string specifier, then the i-th substring of M is equal to that substring specifier according to the collating sequence of the <like predicate>, without the appending of <space> characters to M, and has the same length as that substring specifier.

v) The number of substrings of M is equal to the number of substring specifiers of P.

(Emphasis added.)

This is pretty wordy, so let's break it down. Items ii and iii refer to the wildcards _ and %, respectively. If P does not contain any wildcards, then only item iv applies. This is the case of interest posed by the OP.

In this case, it compares each "substring" (individual characters) in M against each substring in P using the current collation.
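
To make the contrast concrete, here is a toy Python model of the first example in this post. Everything in it is an assumption made for illustration: the EXPANSIONS table only imitates the way latin1_german2_ci treats 'ä' like 'ae', trailing-space padding is modelled by stripping trailing spaces, and equals and like_no_wildcards are invented names, not how MySQL implements either operator.

```python
# Toy stand-in for a German phone-book collation: umlauts expand to two letters.
EXPANSIONS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def weight(s: str) -> str:
    """Map a string to a comparison key: fold case, then expand umlauts."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in s.lower())


def equals(x: str, y: str) -> bool:
    """Model of =: compare whole strings under the collation, ignoring trailing spaces."""
    return weight(x).rstrip(" ") == weight(y).rstrip(" ")


def like_no_wildcards(m: str, p: str) -> bool:
    """Model of item iv as read above: same length, no padding, character by character."""
    if len(m) != len(p):
        return False
    return all(weight(a) == weight(b) for a, b in zip(m, p))


print(equals("ä", "ae"), like_no_wildcards("ä", "ae"))
print(equals("abc", "ABC "), like_no_wildcards("abc", "ABC "))
```

In this model = treats 'ä' and 'ae' as equal because it compares whole keys, while the per-character comparison rejects them as soon as the lengths differ, mirroring the 1 and 0 in the MySQL output earlier in the post.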

The bottom line is that when comparing strings, = compares the entire string while LIKE compares one character at a time. Both comparisons use the current collation. This difference leads to different results in some cases, as evidenced in the first example in this post.

Which one should you use? Nobody can tell you that; you need to use the one that's correct for your use case. Don't prematurely optimize by switching comparison operators.
