Matching two words with some characters in between in a regular expression


Problem description

I want a string to match when it has no abc followed by some characters (possibly none), and it ends with .com.

I tried the following:

(?!abc).*\.com

(?!abc).*?\.com

(?<!abc).*\.com

(?<!abc).*?\.com

But none of these worked. Please help.

Thanks a lot!

EDIT

Sorry if I did not make myself clear. Here are some examples: I want def.edu, abc.com, abce.com, eabc.com and abcAnYTHing.com not to match, while a.com, b.com, ab.com, ae.com, etc. should match.

Recommended answer

It's unclear from your wording whether you want to match a string that ends with .com AND does NOT contain abc before that, or to match any string that does not have the form "abc followed by characters followed by .com".

That is, in the first case "def.edu" does NOT match (it has no "abc", but it doesn't end with ".com"), whereas in the second case "def.edu" matches (because it is not "abcSOMETHING.com").

In the first case, you need to use a negative look-behind:

(?<!abc.+)\.com$
# Use .* instead of .+ if you want "abc.com" to fail as well

IMPORTANT: your original expression using look-behind, #3 ( (?<!abc).*\.com ), didn't work because a look-behind ONLY looks at the text immediately preceding the position where it appears. In #3 that position is the start of the match, so the look-behind never inspects the text in front of .com. Therefore, the "something after abc" should be included in the look-behind together with abc, as my RegEx above does.
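To see that failure concretely, here is a minimal sketch in Python (chosen only because the question names no language). The helper name attempt3_match is made up for this illustration. It runs attempt #3 against a few of the strings from the question.

```python
import re

# Attempt #3 from the question. The look-behind is checked only at the
# position where the match starts, not at every position before ".com".
attempt3 = re.compile(r"(?<!abc).*\.com")

def attempt3_match(s):
    """Return the text matched by attempt #3, or None (hypothetical helper)."""
    m = attempt3.search(s)
    return m.group(0) if m else None

for s in ["abc.com", "eabc.com", "abcAnYTHing.com", "a.com", "def.edu"]:
    print(s, "->", attempt3_match(s))
```

Every .com string here is matched in full, including the ones that contain abc, because nothing precedes position 0 and the look-behind is trivially satisfied there.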

PROBLEM: my RegEx above likely won't work with your specific RegEx engine unless it supports look-behinds containing variable-length expressions (like the one above). Few engines do: .NET does, as do JavaScript engines that implement the ES2018 look-behind syntax, but many others accept only fixed-length look-behinds. A good summary of which flavors support which kinds of look-behind is at http://www.regular-expressions.info/lookaround.html.
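One quick way to find out whether a given engine accepts the pattern is simply to try compiling it. The sketch below does that for Python's built-in re module, which is one of the engines that requires fixed-width look-behinds. The helper supports_pattern is a name made up for this example.

```python
import re

def supports_pattern(pattern):
    """Return True if Python's re module can compile the pattern.

    Hypothetical helper for illustration; re.compile raises re.error
    for patterns the engine does not support.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-length look-behind, as in the answer above.
print(supports_pattern(r"(?<!abc.+)\.com$"))
# Fixed-length look-behind, for comparison.
print(supports_pattern(r"(?<!abc)\.com$"))
```

With the standard re module the first pattern is rejected and the second is accepted, which is the situation the double-match workaround is meant for.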

If that is indeed the case, you will have to do a double match: first, check for .com, capturing everything before it; then do a negative match on abc. I will use Perl syntax since you didn't specify a language:

if (/^(.*)\.com$/) {
    # Or, you can just use a substring test instead of a regex:
    # if (index($1, "abc") < 0) {
    if ($1 !~ /abc/) {
        # PROFIT!
    }
}
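For readers not using Perl, the same double match might look like this in Python. This is a sketch under the first interpretation (ends with .com and has no abc before it). The function name ends_with_com_without_abc is an assumption of this example, not something from the question.

```python
import re

def ends_with_com_without_abc(s):
    """Double match: require a trailing ".com", then reject "abc" before it.

    Hypothetical helper name; mirrors the Perl snippet above.
    """
    # Step 1: the whole string must be "<anything>.com"; capture the prefix.
    m = re.fullmatch(r"(.*)\.com", s)
    if m is None:
        return False
    # Step 2: negative check on the captured prefix (plain substring test).
    return "abc" not in m.group(1)

rejected = ["def.edu", "abc.com", "abce.com", "eabc.com", "abcAnYTHing.com"]
accepted = ["a.com", "b.com", "ab.com", "ae.com"]
for s in rejected + accepted:
    print(s, "->", ends_with_com_without_abc(s))
```

The two lists are the examples from the question's edit. The first group comes out False and the second True, which is the behaviour the asker described.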


In the second case, the EASIEST thing to do is to use a "does not match" operator, e.g. !~ in Perl (or negate the result of a match if your language has no "does not match" operator). Example using pseudo-code:

if (NOT string.match(/abc.+\.com$/)) ...

Please note that no leading ".+"/".*" is needed before abc in this negated match: the pattern is not anchored at the start, so abc can be found anywhere before the trailing .com.
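As a rough Python rendering of that pseudo-code, under the second interpretation (anything that is not "abc, then at least one character, then .com at the end" counts as a match), consider the sketch below. The names ABC_THEN_COM and lacks_abc_then_com are invented for this illustration.

```python
import re

# Same pattern as the pseudo-code: "abc", at least one character, then
# ".com" at the end of the string.
ABC_THEN_COM = re.compile(r"abc.+\.com$")

def lacks_abc_then_com(s):
    """Negated match: True when the pattern is NOT found (hypothetical helper)."""
    return ABC_THEN_COM.search(s) is None

for s in ["def.edu", "abc.com", "abce.com", "abcAnYTHing.com", "a.com"]:
    print(s, "->", lacks_abc_then_com(s))
```

Under this reading "def.edu" passes, and so does "abc.com", because ".+" needs at least one character between abc and .com. Switching to ".*" would reject "abc.com" as well.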
