How to use REGEXP on the results of a subquery?
Question
I have two tables.
The users table has id and phone_no:
id   phone_no
1    9912678
2    9912323
3    9912366
The admission table has id and phone_no:
id   phone_no
6    991267823
7    991236621
8    435443455
9    243344333
I want to find every phone number in the admission table that matches the pattern of a phone number in the users table, and then update it in the users table.
So I am trying:
select phone_no from admission where phone_no REGEXP (SELECT phone_no
FROM `users` AS user
WHERE user.phone_no REGEXP '^(99)+[0-9]{8}')
But I get this error: Subquery returns more than 1 row
Any help is appreciated.
Recommended answer
Try one of these queries:
SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no LIKE concat(u.phone_no, '__')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
or
SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no REGEXP concat('^', u.phone_no, '[0-9]{2}$')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
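To try the two join variants without a MySQL server, here is a minimal sketch that runs them against the sample data from the question. It rests on several assumptions: it uses Python's built-in sqlite3 module rather than MySQL, so CONCAT is written as `||`, and REGEXP is not built in but registered as a user function backed by Python's `re` module. The `regexp` helper is a name introduced here for illustration only.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): pattern first, value second.
    return value is not None and re.search(pattern, value) is not None


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)  # assumption: stand-in for MySQL's REGEXP
conn.executescript("""
    CREATE TABLE users (id INTEGER, phone_no TEXT);
    CREATE TABLE admission (id INTEGER, phone_no TEXT);
    INSERT INTO users VALUES (1, '9912678'), (2, '9912323'), (3, '9912366');
    INSERT INTO admission VALUES (6, '991267823'), (7, '991236621'),
                                 (8, '435443455'), (9, '243344333');
""")

# Variant 1: LIKE with two single-character placeholders.
like_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE u.phone_no || '__'
    WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
    ORDER BY a.phone_no
""").fetchall()

# Variant 2: a per-row regular expression built from users.phone_no.
regexp_rows = conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no REGEXP '^' || u.phone_no || '[0-9]{2}$'
    WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
    ORDER BY a.phone_no
""").fetchall()

print(like_rows)
print(regexp_rows)
```

Both variants should return the same two admission numbers. The join sidesteps the original error because each users row is compared individually instead of being passed to REGEXP as a multi-row subquery.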
If the number of "trailing digits" is not fixed, you can also use:
LIKE concat(u.phone_no, '%')
or
REGEXP concat('^', u.phone_no, '[0-9]*$')
But in this case you might need to use SELECT DISTINCT a.phone_no if it is possible that one users.phone_no is a prefix of another users.phone_no (e.g. 99123 and 991234).
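The duplicate problem can be reproduced with a small sketch. The assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, `||` replaces CONCAT, and the two admission numbers are made-up test values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    -- '99123' is a prefix of '991234', as in the example above.
    INSERT INTO users VALUES ('99123'), ('991234');
    -- Hypothetical admission rows for illustration.
    INSERT INTO admission VALUES ('99123456'), ('99123999');
""")

join_sql = """
    SELECT {distinct} a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE u.phone_no || '%'
"""

# Without DISTINCT, '99123456' is returned once per matching users row.
plain = sorted(r[0] for r in conn.execute(join_sql.format(distinct="")))
# With DISTINCT, each admission number appears once.
distinct = sorted(r[0] for r in conn.execute(join_sql.format(distinct="DISTINCT")))

print(plain)
print(distinct)
```

With a fixed number of trailing digits this cannot happen, because two users numbers of different lengths can never both match the same admission number.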
Update
After running some tests with 10K rows in the users table and 100K rows in the admission table, I arrived at the following query:
SELECT a.phone_no
FROM admission a
JOIN users u
ON a.phone_no >= u.phone_no
AND a.phone_no < CONCAT(u.phone_no, 'z')
AND a.phone_no LIKE CONCAT(u.phone_no, '%')
AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE u.phone_no LIKE '99%'
AND u.phone_no REGEXP '^(99)+[0-9]*$'
UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0
This way you can use REGEXP and still get great performance. In my test case this query executes almost instantly.
Logically, you only need the REGEXP conditions. But on bigger tables the query might time out. A LIKE condition filters the result set before the REGEXP check. Even with LIKE, though, the query doesn't perform very well: for some reason MySQL doesn't use a range check for the join. So I added an explicit range check:
ON a.phone_no >= u.phone_no
AND a.phone_no < CONCAT(u.phone_no, 'z')
With this check you can remove the LIKE condition from the JOIN part, because for digit-only values the range already restricts a.phone_no to values that start with u.phone_no.
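As a sanity check on why the range condition can replace the LIKE prefix test, the following sketch compares both joins on a few digit-only values. It is only an illustration under stated assumptions: it runs on SQLite with its default binary text comparison, uses `||` instead of CONCAT, and the sample numbers are invented. In MySQL the comparison depends on the column collation, and this sketch says nothing about which index the optimizer picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    -- Hypothetical digit-only test values.
    INSERT INTO users VALUES ('99123'), ('9912');
    INSERT INTO admission VALUES ('9912345'), ('9912'), ('99119'), ('9913'), ('991');
""")

# Explicit range check: u <= a < u followed by 'z'.
range_pairs = conn.execute("""
    SELECT u.phone_no, a.phone_no
    FROM admission a
    JOIN users u
      ON a.phone_no >= u.phone_no
     AND a.phone_no < u.phone_no || 'z'
    ORDER BY u.phone_no, a.phone_no
""").fetchall()

# Prefix test with LIKE.
like_pairs = conn.execute("""
    SELECT u.phone_no, a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE u.phone_no || '%'
    ORDER BY u.phone_no, a.phone_no
""").fetchall()

print(range_pairs)
print(like_pairs)
```

Both joins should produce the same pairs here, since every digit sorts before 'z'. The range form matters because it is expressed as plain comparisons, which an optimizer can evaluate against an index on admission.phone_no.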
The UNION part is a replacement for DISTINCT. MySQL seems to translate DISTINCT into a GROUP BY statement, which doesn't perform well. By using UNION with an empty result set, I force MySQL to remove duplicates after the SELECT. You can remove that line if you use a fixed number of trailing digits.
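The deduplicating effect of a UNION with an empty result set can be seen in a small sketch. The assumptions: SQLite via Python's sqlite3 module instead of MySQL, `||` instead of CONCAT, and invented sample numbers. It only shows that the results match those of DISTINCT, not the performance difference described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    CREATE TABLE admission (phone_no TEXT);
    -- Hypothetical values: '99123' is a prefix of '991234'.
    INSERT INTO users VALUES ('99123'), ('991234');
    INSERT INTO admission VALUES ('99123456'), ('99123999');
""")

# UNION (without ALL) removes duplicate rows from the combined result.
# The second SELECT never returns a row because its WHERE clause is always false.
union_rows = sorted(r[0] for r in conn.execute("""
    SELECT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE u.phone_no || '%'
    UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0
"""))

distinct_rows = sorted(r[0] for r in conn.execute("""
    SELECT DISTINCT a.phone_no
    FROM admission a
    JOIN users u ON a.phone_no LIKE u.phone_no || '%'
"""))

print(union_rows)
print(distinct_rows)
```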
You can adjust the REGEXP patterns to your needs:
...
AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]{2}$')
...
AND u.phone_no REGEXP '^(99)+[0-9]{8}$'
...
If you only need REGEXP to check the length of the phone_no, you can also use a LIKE condition with the '_' placeholder:
AND a.phone_no LIKE CONCAT(u.phone_no, '__')
...
AND u.phone_no LIKE '99________'
or combine a LIKE condition with a CHAR_LENGTH check.
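A rough sketch of the two length-checking alternatives follows, assuming a 10-character number that starts with 99. It runs on SQLite, where the function is LENGTH; in MySQL the character-count function is CHAR_LENGTH. The sample values are invented. Note that neither variant verifies that the remaining characters are digits, which only the REGEXP pattern does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (phone_no TEXT);
    -- Hypothetical values: only the first is 10 characters long and starts with 99.
    INSERT INTO users VALUES ('9912345678'), ('991234'), ('8812345678');
""")

# '_' matches exactly one character: 99 followed by eight more characters.
placeholder_rows = conn.execute(
    "SELECT phone_no FROM users WHERE phone_no LIKE '99________'"
).fetchall()

# Prefix LIKE combined with an explicit length check.
length_rows = conn.execute(
    "SELECT phone_no FROM users WHERE phone_no LIKE '99%' AND LENGTH(phone_no) = 10"
).fetchall()

print(placeholder_rows)
print(length_rows)
```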