POSIX Regular Expressions: Excluding a word in an expression?


Question

I am trying to create a regular expression using POSIX (Extended) Regular Expressions that I can use in my C program code.

Specifically, I have come up with the expression below, but I want to exclude the word "http" from the matched expressions. After some searching, it does not look like POSIX offers an obvious way to exclude specific strings. In the example below I am using a "negative lookahead", i.e. the (?!http:). However, I fear this may only be available in regular expression dialects other than POSIX. Is negative lookahead allowed? Is the logical NOT operator (i.e. !) allowed in POSIX?

Working regular expression example:

href|HREF|src[[:space:]]=[[:space:]]\"(?!http:)[^\"]+\"[/]


If I cannot use negative lookahead as in other dialects, what can I do to the above regular expression to filter out the specific word "http:"? Ideally, is there a way to do this without inverse logic that ends up creating a ridiculously long regular expression? (The one I have above is already quite long, and I would rather it not look more confusing if possible.)

[NOTE: I have consulted other related threads on Stack Overflow, but the most relevant ones only ask this question "generically", so the answers given were not necessarily POSIX-flavored. In another thread or two I have seen the above (?!insertWordToExcludeHere) negative lookahead, but I fear it is only for PHP.]

[NOTE 2: I will take any POSIX regular expression phrasings as well; any help would be appreciated. Does anyone have a suggestion for what a regular expression that filters out "http:" would look like, and how it could fit into my current regular expression in place of the (?!http:)?]

Recommended answer

According to http://www.regular-expressions.info/refflavors.html, lookaheads and lookbehinds are not supported in the POSIX flavour.
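Without lookahead, the exclusion can still be spelled out in plain ERE by listing every way a value can fail to start with "http:". The following is only a sketch of that "inverse logic" approach, not a drop-in replacement: the function name has_non_http_link is hypothetical, and the pattern departs from the one in the question by grouping the attribute names, allowing any amount of whitespace around "=", and dropping the trailing [/].

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` contains an href/HREF/src
 * attribute whose quoted value does NOT start with "http:", 0 if it does
 * not, and -1 if the pattern fails to compile. */
int has_non_http_link(const char *text)
{
    /* Each alternative covers one way of diverging from the prefix "http:":
     * at the 1st, 2nd, 3rd, 4th or 5th character, or by ending early. */
    static const char *pattern =
        "(href|HREF|src)[[:space:]]*=[[:space:]]*\""
        "([^h\"][^\"]*"
        "|h([^t\"][^\"]*)?"
        "|ht([^t\"][^\"]*)?"
        "|htt([^p\"][^\"]*)?"
        "|http([^:\"][^\"]*)?)\"";
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

This shows why the approach scales badly: every extra character in the excluded word adds another alternative, which is exactly the long expression the question hoped to avoid.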

If your problem is too complex to be represented cleanly as a regex, consider thinking in terms of lexing (tokenization) and parsing.
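A lightweight step in that direction is to let the regex do only the tokenizing and make the "not http:" decision in C. The sketch below assumes this split is acceptable for the use case; the function name first_non_http_link is hypothetical, and the pattern is a simplified variant of the one in the question (grouped attribute names, REG_ICASE instead of spelling out HREF, no trailing [/]).

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: scans `text` for href/src attributes and copies the
 * first quoted value that does not start with "http:" into `out`
 * (truncated to out_size - 1 bytes).
 * Returns 1 if such a value was found, 0 if not, -1 on compile failure. */
int first_non_http_link(const char *text, char *out, size_t out_size)
{
    /* Group 2 captures the quoted value; the regex itself excludes nothing. */
    static const char *pattern =
        "(href|src)[[:space:]]*=[[:space:]]*\"([^\"]+)\"";
    regex_t re;
    regmatch_t m[3];
    const char *p = text;
    int found = 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return -1;

    while (regexec(&re, p, 3, m, 0) == 0) {
        const char *val = p + m[2].rm_so;
        size_t len = (size_t)(m[2].rm_eo - m[2].rm_so);

        /* The test that (?!http:) was meant to express, done in C. */
        if (!(len >= 5 && strncmp(val, "http:", 5) == 0)) {
            if (out_size > 0) {
                size_t n = len < out_size - 1 ? len : out_size - 1;
                memcpy(out, val, n);
                out[n] = '\0';
            }
            found = 1;
            break;
        }
        p += m[0].rm_eo; /* resume the search after this match */
    }

    regfree(&re);
    return found;
}
```

Called on a string such as `<a href="http://x.org/"> <img src="img/a.png">`, the function skips the first attribute and reports the second. The prefix comparison is case-sensitive, matching the intent of the original (?!http:); a different comparison would be needed if "HTTP:" should also be excluded.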
