Regular expression negative lookahead
Question
In my home directory I have a folder drupal-6.14 that contains the Drupal platform.
From this directory I use the following command:
find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz
This command creates a gzipped tar archive of the folder drupal-6.14, excluding all subfolders of drupal-6.14/sites/ except sites/all and sites/default, which it includes.
My question is about the regular expression:
grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*'
The expression excludes all the folders I want excluded, but I don't quite understand why.
A common task with regular expressions is to match all strings except those that contain subpattern x; in other words, to negate a subpattern.
I (think I) understand that the general strategy for these problems is to use negative lookaheads, but I've never understood to a satisfactory level how positive and negative look(ahead/behind)s work.
Over the years I've read many websites on them: the PHP and Python regex manuals, other pages such as http://www.regular-expressions.info/lookaround.html, and so forth, but I've never really had a solid understanding of them.
Could someone explain how this works, and perhaps provide some similar examples that do similar things?
- Update 1:
Regarding Andomar's response: can a double negative lookahead be expressed more succinctly as a single positive lookahead?
That is, is:
'drupal-6.14/(?!sites(?!/all|/default)).*'
equivalent to:
'drupal-6.14/(?=sites(?:/all|/default)).*'
???
- Update 2:
As per @andomar and @alan moore: you can't interchange a double negative lookahead with a positive lookahead.
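To see why the two patterns are not interchangeable, here is a minimal sketch in Python, whose re module treats these lookaheads the same way as grep -P for patterns this simple. The sample paths are hypothetical, and the dot in drupal-6.14 is escaped here, which the original command does not do.

```python
import re

# The two patterns from Update 1 (dot escaped as an editorial tweak).
double_negative = re.compile(r'drupal-6\.14/(?!sites(?!/all|/default)).*')
positive = re.compile(r'drupal-6\.14/(?=sites(?:/all|/default)).*')

# Hypothetical sample paths, not taken from a real Drupal tree.
paths = [
    'drupal-6.14/index.php',
    'drupal-6.14/sites/all/modules/foo.module',
    'drupal-6.14/sites/default/settings.php',
    'drupal-6.14/sites/example.com/settings.php',
]

def kept(pattern, candidates):
    """Return the candidates that the pattern matches, like grep would."""
    return [p for p in candidates if pattern.search(p)]

kept_by_double_negative = kept(double_negative, paths)
kept_by_positive = kept(positive, paths)

print(kept_by_double_negative)  # everything except sites/example.com
print(kept_by_positive)         # only sites/all and sites/default
```

The positive lookahead requires every path to start with sites/all or sites/default, so it drops drupal-6.14/index.php. The double negative only rejects paths that start with sites and are not followed by /all or /default, so everything outside sites/ is still kept.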
Recommended answer
A negative lookahead asserts that, at this position, the following regex cannot match.
Let's take a simplified example:
a(?!b(?!c))
a Match: (?!b) succeeds
ac Match: (?!b) succeeds
ab No match: (?!b(?!c)) fails
abe No match: (?!b(?!c)) fails
abc Match: (?!b(?!c)) succeeds
The last example is a double negation: it allows a b as long as that b is followed by c. Inside the outer negation, the nested negative lookahead behaves like a positive one: if a b is there, the c must be present too.
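The table above can be checked mechanically. This is a small sketch using Python's re module rather than grep -P; the lookahead semantics are the same for a pattern this simple, and the variable names are purely illustrative.

```python
import re

pattern = re.compile(r'a(?!b(?!c))')

# The five subject strings from the table above.
subjects = ['a', 'ac', 'ab', 'abe', 'abc']

# True where the pattern finds a match, False otherwise.
results = {s: bool(pattern.search(s)) for s in subjects}

for subject, matched in results.items():
    print(subject, 'match' if matched else 'no match')
```

Only ab and abe are rejected: in both, the b is followed by something other than c (or by nothing), so the inner b(?!c) matches and the outer negative lookahead fails.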
In each example, only the a is matched. The lookahead is only a condition and does not add to the matched text.
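The zero-width nature of the lookahead can be made visible by inspecting the match object. The sketch below again uses Python's re module; the second pattern, a(?:bc)?, is not from the answer and is included only as a contrast that consumes the following characters.

```python
import re

# Lookahead version: bc is inspected but not consumed.
m = re.search(r'a(?!b(?!c))', 'abc')
matched_text = m.group(0)
span = m.span()

# Contrast (illustrative pattern, not from the answer): bc is consumed.
consumed = re.search(r'a(?:bc)?', 'abc').group(0)

print(matched_text, span)  # a (0, 1)
print(consumed)            # abc
```

This is also why the pattern in the question ends with .* after the lookahead: the lookahead only decides whether the position is acceptable, and the .* is what actually consumes the rest of the path.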