REGEXP性能(与“LIKE”和“=”比较) [英] REGEXP performance (compare with "LIKE" and "=")
问题描述
我使用MySQL。我提出了有关如何在数据库中查询单个字词匹配的问题
-
有一个答案建议我使用
REGEXP' :<:]] word [[:>:]]'
这是一个很好的答案,如何从性能角度看这个
REGEXP'[[:<:]] word [[:>::]]'
如果我有一个大表,这是否会损害我的应用程序的性能?
code> = 操作,例如 WHERE column_name ='value'
是 REGEXP
操作比 =
for large table?
-
还有另一个答案建议我使用LIKE,
然后,我google了,发现了一个 article ,说使用
LIKE
strong>REGEXP
。我很困惑,我应该使用一个单一的词匹配查询在一个大表...
我说, =
是最快的操作,然后 LIKE
和 REGEXP $ c
$ b
正则表达式可以从不在MySQL中使用索引。
=
将使用索引if:
- 在列上声明索引;
- 列具有足够的基数(如果超过+/- 20%的行匹配,MySQL将不使用索引,因为在这种情况下执行全表扫描更快);
- 同一个表上没有其他索引更合适(MySQL每个子查询只能使用一个索引);
考虑到这些和其他一些更深奥的注意事项, =
比较很多比regexp快。
关于
LIKE可以使用索引,如果通配符不是第一个字符。
SELECT * FROM t LIKE'abc'<< - (不区分大小写`=)索引
SELECT * FROM t WHERE a LIKE'abc%'<< - 可以使用索引
SELECT * FROM t LIKE'a%'<索引,取决于基数
SELECT * FROM t WHERE LIKE'%a%'<< - 不能使用索引
SELECT * FROM t LIKE'_agf'< - 不能使用索引
性能 like
当使用索引非常接近 =
(假设返回相同的行数)。
I am using MySQL. I have asked a question about how to query in database for a single word match here.
There is an answer which suggest me to use
REGEXP '[[:<:]]word[[:>:]]'
It is a good answer, however, I am not sure how is this
REGEXP '[[:<:]]word[[:>:]]'
thing from performance perspective? If I have a large table, is this way harm the performance of my application?
For example, compare with =
operation, e.g. WHERE column_name='value'
, is the REGEXP
operation far more slow than =
for large table?
There is another answer which suggested me to use LIKE, but I think it is not good from performance point of view.
Then, I googled and found an article which says use
LIKE
is even faster thanREGEXP
. I get confused, which way I should use for a single word match query in a large table...
Can I say, =
is the fastest operation, then LIKE
, and REGEXP
is the poorest one from performance perspective?
Regarding regexp
The regexp can never use an index in MySQL.
The =
will use an index if:
- an index is declared on the column;
- the values in the column have sufficient cardinality (if more than +/- 20% of the rows match, MySQL will not use an index, because in that case doing a full table scan is faster);
- No other indexes on the same table are better suited (MySQL can only use one index per table per subselect);
Considering these and some other more esoteric caveats an =
comparison is much faster than a regexp.
Regarding like
LIKE can use an index if the wildcard is not the first char.
SELECT * FROM t WHERE a LIKE 'abc' <<-- (case insensitive `=`) can use an index
SELECT * FROM t WHERE a LIKE 'abc%' <<-- can use an index
SELECT * FROM t WHERE a LIKE 'a%' <<-- can use an index, depending on cardinality
SELECT * FROM t WHERE a LIKE '%a%' <<-- cannot use an index
SELECT * FROM t WHERE a LIKE '_agf' <<-- cannot use an index
The performance of like
when using an index is very close to =
(assuming the same number of rows returned).
这篇关于REGEXP性能(与“LIKE”和“=”比较)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!