Regex to match string containing two names in any order

Question

I need a logical AND in regex.

Something like

jack AND james

should match the following strings:

  • 'hi jack here is james'
  • 'hi james here is jack'

Recommended answer

You can do checks using positive lookaheads. Here is a summary from the indispensable regular-expressions.info site:

"Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions...lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not."

It then goes on to explain that positive lookaheads are used to assert that what follows matches a certain expression without taking up characters in that matching expression.
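To make "does not consume characters" concrete, here is a minimal Python sketch using the standard `re` module. The sample text comes from the question; the variable names are only illustrative. It compares the span of a lookahead match with the span of the same sub-pattern matched normally.

```python
import re

text = "hi jack here is james"

# A lookahead only asserts: it succeeds at position 0 and the match stays zero-length.
lookahead_span = re.match(r"(?=.*\bjack\b)", text).span()

# The same sub-pattern without the lookahead consumes everything up to and including "jack".
consuming_span = re.match(r".*\bjack\b", text).span()

print(lookahead_span)   # zero-length span at the start
print(consuming_span)   # span covering "hi jack"
```

Because the lookahead leaves the match position untouched, a second lookahead placed right after it starts scanning from the same position.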

So here is an expression using two consecutive positive lookaheads to assert that the phrase contains jack and james in either order:

^(?=.*\bjack\b)(?=.*\bjames\b).*$
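As a quick check, the pattern can be tried against the two strings from the question. This is a small sketch in Python (the answer itself does not name a language); the third sample string is an added counterexample, not part of the original question.

```python
import re

PATTERN = re.compile(r"^(?=.*\bjack\b)(?=.*\bjames\b).*$")

samples = [
    "hi jack here is james",        # from the question
    "hi james here is jack",        # from the question, names reversed
    "hi jack here is nobody else",  # added counterexample: no "james"
]

# Map each sample to whether the pattern matches it.
results = {s: PATTERN.search(s) is not None for s in samples}

for s, matched in results.items():
    print(f"{s!r}: {matched}")
```

Note that in Python `.` does not match a newline by default, so text spanning several lines would need the `re.DOTALL` flag for the lookaheads to see past the first line.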

Test it.

The expressions in parentheses starting with ?= are the positive lookaheads. I'll break down the pattern:

  1. ^ asserts the start of the string to be matched.
  2. (?=.*\bjack\b) is the first positive lookahead, saying that what follows must match .*\bjack\b.
  3. .* means any character zero or more times.
  4. \b matches at a word boundary: the position between a word character and a non-word character (such as white space or punctuation), or at the start or end of the string when it begins or ends with a word character.
  5. jack is literally those four characters in a row (the same goes for james in the next positive lookahead).
  6. $ asserts the end of the string to be matched.
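The breakdown above can also be written directly into the pattern. The sketch below assumes Python, whose `re.VERBOSE` flag ignores whitespace in the pattern and allows `#` comments; the names `VERBOSE_PATTERN` and `COMPACT_PATTERN` are illustrative.

```python
import re

VERBOSE_PATTERN = re.compile(
    r"""
    ^                  # start of the string
    (?=.*\bjack\b)     # lookahead: "jack" appears as a whole word somewhere ahead
    (?=.*\bjames\b)    # lookahead: "james" appears as a whole word somewhere ahead
    .*                 # consume the rest of the string
    $                  # end of the string
    """,
    re.VERBOSE,
)

COMPACT_PATTERN = re.compile(r"^(?=.*\bjack\b)(?=.*\bjames\b).*$")

text = "hi james here is jack"
verbose_result = VERBOSE_PATTERN.search(text) is not None
compact_result = COMPACT_PATTERN.search(text) is not None
print(verbose_result, compact_result)
```

Both forms should behave identically; the verbose one is simply easier to maintain once more conditions are added.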

So the first lookahead says "what follows (and is not itself a lookahead or lookbehind) must be an expression that starts with zero or more of any characters, followed by a word boundary, then jack and another word boundary," and the second lookahead says "what follows must be an expression that starts with zero or more of any characters, followed by a word boundary, then james and another word boundary." After the two lookaheads comes .* which simply matches any characters zero or more times, and $ which matches the end of the string.

The string "start with anything then jack or james then end with anything" satisfies the first lookahead because it has some characters followed by the word jack, and it satisfies the second lookahead because it has some characters (which happen to include jack, although that is not necessary to satisfy the second lookahead) followed by the word james. Neither lookahead asserts the end of the string, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".

I think you get the idea, but just to be absolutely clear, here it is with jack and james reversed, i.e. "start with anything then james or jack then end with anything". It satisfies the first lookahead because it has some characters (which happen to include james, although that is not necessary to satisfy the first lookahead) followed by the word jack, and it satisfies the second lookahead because it has some characters followed by the word james. As before, neither lookahead asserts the end of the string, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".
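The two example phrases above can be checked mechanically. The Python sketch below also adds two counterexamples that are not in the original answer, to show what the \b word boundaries rule out.

```python
import re

PATTERN = re.compile(r"^(?=.*\bjack\b)(?=.*\bjames\b).*$")

phrases = [
    "start with anything then jack or james then end with anything",
    "start with anything then james or jack then end with anything",
    "jackson met james",   # added: "jack" is only part of a longer word
    "only james here",     # added: "jack" is missing entirely
]

# True where both names appear as whole words, in either order.
outcomes = [PATTERN.search(p) is not None for p in phrases]
print(outcomes)
```

The pattern is case-sensitive as written; passing `re.IGNORECASE` to `re.compile` would be one way to accept "Jack" and "James" as well.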

This approach has the advantage that you can easily specify multiple conditions.

^(?=.*\bjack\b)(?=.*\bjames\b)(?=.*\bjason\b)(?=.*\bjules\b).*$
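Since each condition is just one more lookahead, the pattern can be generated from a list of words. The helper below is a hypothetical sketch in Python (the name `all_words_pattern` is not from the original answer); it uses `re.escape` so that words containing regex metacharacters are treated literally.

```python
import re

def all_words_pattern(words):
    """Build a pattern that matches strings containing every word, in any order."""
    # One positive lookahead per required word.
    lookaheads = "".join(rf"(?=.*\b{re.escape(w)}\b)" for w in words)
    return re.compile(rf"^{lookaheads}.*$")

pattern = all_words_pattern(["jack", "james", "jason", "jules"])
print(pattern.pattern)

all_present = pattern.search("jules, jason, james and jack") is not None
one_missing = pattern.search("jack, james and jason") is not None
print(all_present, one_missing)
```

For these four names the generated pattern string is the same as the hand-written expression above.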
