等于(=)vs.喜欢 [英] Equals(=) vs. LIKE

查看:219
本文介绍了等于(=)vs.喜欢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

使用SQL时,在 WHERE 子句中使用 = 而不是<$ c $有什么好处? c> Like ?

When using SQL, are there any benefits of using = in a WHERE clause instead of LIKE?

没有任何特殊运算符, Like = 是相同的吧?

Without any special operators, LIKE and = are the same, right?

推荐答案

不同的运算符



Like = 是不同的运算符。这里的大多数答案都集中在通配符支持上,这不是这些运算符之间的唯一区别!

Different Operators

LIKE and = are different operators. Most answers here focus on the wildcard support, which is not the only difference between these operators!

= 是比较运算符,对数字和字符串进行运算。比较字符串时,比较运算符将比较整个字符串

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

Like 是一个字符串

为了使事情变得复杂,两个运算符都使用整理,它可能对比较结果产生重要影响。

To complicate matters, both operators use a collation which can have important effects on the result of the comparison.

让我们首先确定一个示例,其中这些运算符产生明显不同的结果。请允许我从MySQL手册中引用:

Let's first identify an example where these operators produce obviously different results. Allow me to quote from the MySQL manual:


按照SQL标准,LIKE在每个字符的基础上执行匹配,因此可以生成结果与=比较运算符不同:

Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:



mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+

请注意,MySQL手册的此页面称为字符串比较函数,而<$ c $没有讨论c> = ,这意味着 = 严格来说不是字符串比较函数。

Please note that this page of the MySQL manual is called String Comparison Functions, and = is not discussed, which implies that = is not strictly a string comparison function.

SQL标准§8.2 描述 = 如何比较字符串:

The SQL Standard § 8.2 describes how = compares strings:


两个字符串的比较确定如下:

The comparison of two character strings is determined as follows:

a)如果X字符的长度不等于Y字符的长度
,那么为了比较,较短的字符串实际上被
替换为
本身通过在一个或多个填充
字符的右侧串联而扩展为更长的
字符串的长度,其中填充根据CS选择字符。如果
CS具有NO PAD属性,则填充字符是与实现相关的
字符,与X和Y字符集中的
字符比
少CS下的任何字符串。否则,填充字符为

a) If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD attribute, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a .

b)X和Y比较的结果由
归类得出

b) The result of the comparison of X and Y is given by the collating sequence CS.

c)根据整理顺序,两个字符串
可能比较相等,即使它们的长度不同或
包含不同的字符序列。当操作
MAX,MIN,DISTINCT引用分组列,而
UNION,EXCEPT和INTERSECT运算符引用字符
字符串时,这些操作从$中选择的特定值b $ ba这样的相等值集取决于实现。

c) Depending on the collating sequence, two strings may compare as equal even if they are of different lengths or contain different sequences of characters. When the operations MAX, MIN, DISTINCT, references to a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer to character strings, the specific value selected by these operations from a set of such equal values is implementation-dependent.

(添加了强调。)

是什么意思?这意味着在比较字符串时, = 运算符只是当前排序规则的一个精简包装。排序规则是一个具有各种用于比较字符串的规则的库。这是来自MySQL的二进制排序规则的示例

What does this mean? It means that when comparing strings, the = operator is just a thin wrapper around the current collation. A collation is a library that has various rules for comparing strings. Here's an example of a binary collation from MySQL:

static int my_strnncoll_binary(const CHARSET_INFO *cs __attribute__((unused)),
                               const uchar *s, size_t slen,
                               const uchar *t, size_t tlen,
                               my_bool t_is_prefix)
{
  size_t len= MY_MIN(slen,tlen);
  int cmp= memcmp(s,t,len);
  return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}

这种特殊的排序方式是逐字节比较(这就是为什么称为二进制-它对字符串没有任何特殊含义)。其他归类可以提供更高级的比较。

This particular collation happens to compare byte-by-byte (which is why it's called "binary" — it doesn't give any special meaning to strings). Other collations may provide more advanced comparisons.

例如,这是一个 UTF-8排序规则,支持不区分大小写的比较。代码太长,无法粘贴到此处,但是请转到该链接并阅读 my_strnncollsp_utf8mb4()的正文。该排序规则可以一次处理多个字节,并且可以应用各种转换(例如不区分大小写的比较)。 = 运算符完全从整理的变幻莫测中抽象出来。

For example, here is a UTF-8 collation that supports case-insensitive comparisons. The code is too long to paste here, but go to that link and read the body of my_strnncollsp_utf8mb4(). This collation can process multiple bytes at a time and it can apply various transforms (such as case insensitive comparison). The = operator is completely abstracted from the vagaries of the collation.

SQL标准8.5节描述了 Like 如何比较字符串:

The SQL Standard § 8.5 describes how LIKE compares strings:


谓词

The <predicate>

M喜欢P

如果存在将M划分为子字符串
的情况,则为true,

is true if there exists a partitioning of M into substrings such that:

i)M的子字符串为M的0个或多个连续
<字符表示>和每个<字符
表示>的序列。 ii)如果P的第i个子串说明符是任意的
字符说明符,则第i个子字符串是该子字符串的一部分。

i) A substring of M is a sequence of 0 or more contiguous <character representation>s of M and each <character representation> of M is part of exactly one substring.

ii) M是任意单个
<字符表示>。

ii) If the i-th substring specifier of P is an arbitrary character specifier, the i-th substring of M is any single <character representation>.

iii)如果P的第i个子字符串说明符是任意字符串
说明符,则M的第i个子串是
0或更多<字符表示形式> s的任何序列。

iii) If the i-th substring specifier of P is an arbitrary string specifier, then the i-th substring of M is any sequence of 0 or more <character representation>s.

iv)如果i P的第-个子字符串说明符既不是
任意字符说明符,也不是任意的字符串说明符,根据b的整理顺序,
则M的第i个子字符串等于该子字符串
说明符
< like谓词> ;,但不附加< space>
个字符到M,并且与该子字符串
说明符的长度相同。

v)M的子字符串数为等于P的
个子字符串说明符的数量。

v) The number of substrings of M is equal to the number of substring specifiers of P.

(添加了强调。)

这很罗word,所以让我们分解一下。项ii和iii分别指通配符 _ 。如果 P 不包含任何通配符,则仅适用项目iv。

This is pretty wordy, so let's break it down. Items ii and iii refer to the wildcards _ and %, respectively. If P does not contain any wildcards, then only item iv applies. This is the case of interest posed by the OP.

在这种情况下,它将比较 M 中的每个子字符串(单个字符)。 code>使用当前的排序规则针对 P 中的每个子字符串。

In this case, it compares each "substring" (individual characters) in M against each substring in P using the current collation.

最重要的是,比较字符串时, = 比较整个字符串,而 Like 一次比较一个字符。两种比较都使用当前的排序规则。如本文中的第一个示例所示,这种差异在某些情况下会导致不同的结果。

The bottom line is that when comparing strings, = compares the entire string while LIKE compares one character at a time. Both comparisons use the current collation. This difference leads to different results in some cases, as evidenced in the first example in this post.

您应该使用哪个?没有人能告诉您-您需要使用适合您的用例的内容。不要通过切换比较运算符来过早优化。

Which one should you use? Nobody can tell you that — you need to use the one that's correct for your use case. Don't prematurely optimize by switching comparison operators.

这篇关于等于(=)vs.喜欢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆