How to use regexp on the results of a sub query?

Question

I have two tables.

The users table, which has id and phone_no:

id   phone_no
1    9912678
2    9912323
3    9912366

The admission table, which has id and phone_no:

id   phone_no
6    991267823
7    991236621
8    435443455
9    243344333

I want to find all the phone numbers in the admission table that have the same pattern as those in the users table, and update them in the users table.

So I am trying this:

select phone_no  from admission where phone_no REGEXP (SELECT phone_no
FROM  `users` AS user
WHERE user.phone_no REGEXP  '^(99)+[0-9]{8}')

But I get this error: Subquery returns more than 1 row

Any help would be appreciated.

Answer

Try one of the following queries:

SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no LIKE concat(u.phone_no, '__')
WHERE u.phone_no REGEXP  '^(99)+[0-9]+$'

SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no REGEXP concat('^', u.phone_no, '[0-9]{2}$')
WHERE u.phone_no REGEXP  '^(99)+[0-9]+$'
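To try both joins against the sample data from the question, here is a minimal sketch. It is an assumption-laden stand-in, not the MySQL setup itself: it uses Python's built-in sqlite3 module, writes `||` where MySQL uses CONCAT, and registers a hypothetical `regexp` helper because SQLite ships no REGEXP implementation by default.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates `value REGEXP pattern` by calling regexp(pattern, value).
    # Python's `re` stands in for MySQL's regex engine here (an assumption).
    return int(value is not None and re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.executescript("""
    CREATE TABLE users (id INTEGER, phone_no TEXT);
    CREATE TABLE admission (id INTEGER, phone_no TEXT);
    INSERT INTO users VALUES (1, '9912678'), (2, '9912323'), (3, '9912366');
    INSERT INTO admission VALUES (6, '991267823'), (7, '991236621'),
                                 (8, '435443455'), (9, '243344333');
""")

# Variant 1: LIKE with exactly two trailing characters.
like_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE (u.phone_no || '__')
    WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
""").fetchall()

# Variant 2: REGEXP with exactly two trailing digits.
regexp_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no REGEXP ('^' || u.phone_no || '[0-9]{2}$')
    WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
""").fetchall()

like_matches = sorted(row[0] for row in like_rows)
regexp_matches = sorted(row[0] for row in regexp_rows)
print(like_matches)
print(regexp_matches)
```

With this data both variants should return the admission numbers for ids 6 and 7. The join replaces the scalar subquery, which is what raised the "more than 1 row" error in the original attempt.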

If the number of trailing digits is not fixed, you can also use:

LIKE concat(u.phone_no, '%')

REGEXP concat('^', u.phone_no, '[0-9]*$')

But in this case you might need to use SELECT DISTINCT a.phone_no if one users.phone_no can be a prefix of another users.phone_no (e.g. 99123 and 991234), because the same admission row would then match more than one user.
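The duplicate rows are easy to reproduce. The sketch below again uses Python's sqlite3 as an assumed stand-in for MySQL; the two-row tables and the admission number 99123456 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    -- '99123' is a prefix of '991234', so both match the same admission row.
    INSERT INTO users VALUES ('99123'), ('991234');
    INSERT INTO admission VALUES ('99123456');
""")

join_sql = """
    SELECT {distinct} a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE (u.phone_no || '%')
"""

plain = [row[0] for row in conn.execute(join_sql.format(distinct=""))]
deduped = [row[0] for row in conn.execute(join_sql.format(distinct="DISTINCT"))]

print(plain)    # one entry per matching user
print(deduped)  # each admission number once
```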

Update

After running some tests with 10K rows in the users table and 100K rows in the admission table, I arrived at the following query:

SELECT a.phone_no
FROM admission a
JOIN users u 
    ON  a.phone_no >= u.phone_no
    AND a.phone_no < CONCAT(u.phone_no, 'z')
    AND a.phone_no LIKE CONCAT(u.phone_no, '%')
    AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE   u.phone_no LIKE  '99%'
    AND u.phone_no REGEXP  '^(99)+[0-9]*$'
UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0

This way you can use REGEXP and still get good performance. This query executes almost instantly in my test case.

Logically you only need the REGEXP conditions. But on bigger tables the query might time out. Adding a LIKE condition filters the result set before the REGEXP check. Even with LIKE, though, the query doesn't perform very well: for some reason MySQL doesn't use a range check for the join. So I added an explicit range check:

    ON  a.phone_no >= u.phone_no
    AND a.phone_no < CONCAT(u.phone_no, 'z')

With this check you can remove the LIKE condition from the JOIN part.
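To see why the range check can stand in for the LIKE prefix test on digit-only strings, here is a small sketch. The assumptions: Python's sqlite3 replaces MySQL, comparisons are plain binary string comparisons, and the extra admission numbers are invented near-misses. It says nothing about index usage or speed, only that both conditions select the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    INSERT INTO users VALUES ('9912678'), ('9912366');
    INSERT INTO admission VALUES ('991267823'), ('991236621'),
                                 ('9912679'), ('9912677'), ('435443455');
""")

# Prefix test written as LIKE.
like_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE (u.phone_no || '%')
""").fetchall()

# Same prefix test written as a half-open range: every digit sorts below 'z',
# so any digit string starting with u.phone_no falls inside the range.
range_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u
        ON  a.phone_no >= u.phone_no
        AND a.phone_no < (u.phone_no || 'z')
""").fetchall()

like_matches = sorted(row[0] for row in like_rows)
range_matches = sorted(row[0] for row in range_rows)
print(like_matches)
print(range_matches)
```

The range would also admit a value such as the prefix followed by letters, which is one reason to keep the REGEXP condition as the final check.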

The UNION part is a replacement for DISTINCT. MySQL seems to translate DISTINCT into a GROUP BY statement, which doesn't perform well. By using UNION with an empty result set I force MySQL to remove duplicates after the SELECT. You can remove that line if you use a fixed number of trailing digits.
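The deduplicating effect of the empty UNION can be checked in isolation. This sketch assumes Python's sqlite3 in place of MySQL and reuses made-up prefix data; it only illustrates that UNION (as opposed to UNION ALL) drops duplicate rows, and does not reproduce the performance difference described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    INSERT INTO users VALUES ('99123'), ('991234');
    INSERT INTO admission VALUES ('99123456');
""")

base_sql = """
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE (u.phone_no || '%')
"""

# Without the UNION the admission row appears once per matching user.
without_union = [row[0] for row in conn.execute(base_sql)]

# The second SELECT never returns a row (WHERE 0), but UNION still
# removes duplicates from the combined result.
with_union = [row[0] for row in conn.execute(
    base_sql + " UNION SELECT 0 FROM (SELECT 0) AS dummy WHERE 0"
)]

print(without_union)
print(with_union)
```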

You can adjust the REGEXP patterns to your needs:

...
    AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]{2}$')
...
    AND u.phone_no REGEXP  '^(99)+[0-9]{8}$'
...

If you only need REGEXP to check the length of the phone_no, you can also use a LIKE condition with the '_' placeholder.

    AND a.phone_no LIKE CONCAT(u.phone_no, '__')
...
    AND u.phone_no LIKE '99________'

or combine a LIKE condition with a CHAR_LENGTH check.
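A quick way to check how the '_' placeholder behaves is sketched below, with Python's sqlite3 as an assumed stand-in for MySQL and SQLite's length() playing the role of MySQL's CHAR_LENGTH(). The sample values are invented. Note that `$` has no special meaning in LIKE, so a trailing `$` is matched literally, and that `_` accepts any character, not only digits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def like(value, pattern):
    # Evaluate `value LIKE pattern` inside the database and return a bool.
    return conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0] == 1


fixed_width = "99" + "_" * 8      # '99' followed by exactly 8 characters
with_dollar = fixed_width + "$"   # '$' is a literal character in LIKE

ten_digits_ok = like("9912345678", fixed_width)
too_short = like("991234567", fixed_width)
dollar_on_digits = like("9912345678", with_dollar)
dollar_literal = like("9912345678$", with_dollar)
letters_pass = like("99abcdefgh", fixed_width)

# Alternative: prefix test plus an explicit length check.
length_check = conn.execute(
    "SELECT ? LIKE '99%' AND length(?) = 10", ("9912345678", "9912345678")
).fetchone()[0] == 1

print(ten_digits_ok, too_short, dollar_on_digits, dollar_literal,
      letters_pass, length_check)
```

If the digits-only guarantee matters, the REGEXP condition is still the one that enforces it.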
