REGEXP性能(与“LIKE”和“=”比较) [英] REGEXP performance (compare with "LIKE" and "=")

查看:1959
本文介绍了REGEXP性能(与“LIKE”和“=”比较)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用MySQL。我提出了有关如何在数据库中查询单个字词匹配的问题




  • 有一个答案建议我使用 REGEXP' :<:]] word [[:>:]]'



    这是一个很好的答案,如何从性能角度看这个 REGEXP'[[:<:]] word [[:>::]]'如果我有一个大表,这是否会损害我的应用程序的性能?




code> = 操作,例如 WHERE column_name ='value' REGEXP 操作比 = for large table?





我说, = 是最快的操作,然后 LIKE REGEXP $ b
$ b

正则表达式可以从不在MySQL中使用索引。

= 将使用索引if:




  • 在列上声明索引;

  • 列具有足够的基数(如果超过+/- 20%的行匹配,MySQL将不使用索引,因为在这种情况下执行全表扫描更快);

  • 同一个表上没有其他索引更合适(MySQL每个子查询只能使用一个索引);



考虑到这些和其他一些更深奥的注意事项, = 比较很多比regexp快。



关于



LIKE可以使用索引,如果通配符不是第一个字符。

  SELECT * FROM t LIKE'abc'<<  - (不区分大小写`=)索引
SELECT * FROM t WHERE a LIKE'abc%'<< - 可以使用索引
SELECT * FROM t LIKE'a%'<索引,取决于基数
SELECT * FROM t WHERE LIKE'%a%'<< - 不能使用索引
SELECT * FROM t LIKE'_agf'< - 不能使用索引

性能 like 当使用索引非常接近 = (假设返回相同的行数)。


I am using MySQL. I have asked a question about how to query in database for a single word match here.

  • There is an answer which suggest me to use REGEXP '[[:<:]]word[[:>:]]'

    It is a good answer, however, I am not sure how is this REGEXP '[[:<:]]word[[:>:]]' thing from performance perspective? If I have a large table, is this way harm the performance of my application?

For example, compare with = operation, e.g. WHERE column_name='value', is the REGEXP operation far more slow than = for large table?

  • There is another answer which suggested me to use LIKE, but I think it is not good from performance point of view.

    Then, I googled and found an article which says use LIKE is even faster than REGEXP . I get confused, which way I should use for a single word match query in a large table...

Can I say, = is the fastest operation, then LIKE , and REGEXP is the poorest one from performance perspective?

解决方案

Regarding regexp

The regexp can never use an index in MySQL.
The = will use an index if:

  • an index is declared on the column;
  • the values in the column have sufficient cardinality (if more than +/- 20% of the rows match, MySQL will not use an index, because in that case doing a full table scan is faster);
  • No other indexes on the same table are better suited (MySQL can only use one index per table per subselect);

Considering these and some other more esoteric caveats an = comparison is much faster than a regexp.

Regarding like

LIKE can use an index if the wildcard is not the first char.

SELECT * FROM t WHERE a LIKE 'abc'   <<-- (case insensitive `=`) can use an index
SELECT * FROM t WHERE a LIKE 'abc%'  <<-- can use an index
SELECT * FROM t WHERE a LIKE 'a%'    <<-- can use an index, depending on cardinality
SELECT * FROM t WHERE a LIKE '%a%'   <<-- cannot use an index
SELECT * FROM t WHERE a LIKE '_agf'  <<-- cannot use an index

The performance of like when using an index is very close to = (assuming the same number of rows returned).

这篇关于REGEXP性能(与“LIKE”和“=”比较)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆