Regular expression negative lookahead

Question

In my home directory I have a folder drupal-6.14 that contains the Drupal platform.

From my home directory I run the following command:

find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz

This command creates a gzip-compressed tar archive of the folder drupal-6.14, excluding every subfolder of drupal-6.14/sites/ except sites/all and sites/default, which are included.

My question is about the regular expression:

grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*'

The expression excludes all the folders I want excluded, but I don't quite understand why.

A common task with regular expressions is to:

Match all strings except those that contain subpattern x. In other words, negate a subpattern.

I think I understand that the general strategy for these problems is to use negative lookaheads, but I have never understood to a satisfactory level how positive and negative lookaheads and lookbehinds work.

Over the years I have read many websites about them: the PHP and Python regex manuals, and other pages such as http://www.regular-expressions.info/lookaround.html, but I have never really had a solid understanding of them.

Could someone explain how this works, and perhaps provide some similar examples?

Update 1:

Regarding Andomar's response: can a double negative lookahead be expressed more succinctly as a single positive lookahead?

That is, is:

'drupal-6.14/(?!sites(?!/all|/default)).*'

equivalent to:

'drupal-6.14/(?=sites(?:/all|/default)).*'

?

Update 2:

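As per @andomar and @alan moore, you can't substitute a positive lookahead for a double negative lookahead. The double negative version also matches paths that do not start with sites at all, whereas the positive version requires sites/all or sites/default to follow.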
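To make the difference concrete, here is a minimal Python sketch that runs both patterns from Update 1 over a few paths. The two patterns are copied from the question; the sample paths and the helper name `kept` are made up for illustration, and Python's `re` module is assumed to treat these particular lookaheads the same way `grep -P` does.

```python
import re

# Patterns copied from the question; the sample paths below are hypothetical.
DOUBLE_NEGATIVE = re.compile(r"drupal-6.14/(?!sites(?!/all|/default)).*")
POSITIVE = re.compile(r"drupal-6.14/(?=sites(?:/all|/default)).*")

PATHS = [
    "drupal-6.14/index.php",
    "drupal-6.14/sites/all/modules/README.txt",
    "drupal-6.14/sites/default/settings.php",
    "drupal-6.14/sites/example.com/settings.php",
]


def kept(pattern, paths):
    """Return the paths that a grep-style search with `pattern` lets through."""
    return [p for p in paths if pattern.search(p)]


double_negative_kept = kept(DOUBLE_NEGATIVE, PATHS)
positive_kept = kept(POSITIVE, PATHS)

print("double negative keeps:", double_negative_kept)
print("positive keeps:       ", positive_kept)
```

Both patterns drop sites/example.com, but only the double negative one keeps drupal-6.14/index.php, because a path that does not begin with sites passes the outer negative lookahead straight away.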

Recommended answer

A negative lookahead says: at this position, the following regex must not match.

Let's take a simplified example:

a(?!b(?!c))

a      Match: (?!b) succeeds
ac     Match: (?!b) succeeds
ab     No match: (?!b(?!c)) fails
abe    No match: (?!b(?!c)) fails
abc    Match: (?!b(?!c)) succeeds

The last example is a double negation: it allows a b followed by c. Inside the outer negation, the nested negative lookahead acts like a positive one: if a b follows the a, a c must come right after it.
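The table above can be checked with a few lines of Python. This is only a sketch, assuming Python's `re` module; the variable names are illustrative.

```python
import re

PATTERN = re.compile(r"a(?!b(?!c))")

# Inputs from the table above, mapped to whether the pattern finds a match.
results = {s: PATTERN.search(s) is not None for s in ["a", "ac", "ab", "abe", "abc"]}

for text, matched in results.items():
    print(f"{text:<4} {'Match' if matched else 'No match'}")
```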

In each example, only the a is matched. A lookahead is only a condition and does not add to the matched text.
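A short Python sketch of that zero-width behaviour, again assuming the `re` module: the lookahead pattern inspects the b and c but reports only the a, while a plain literal pattern (added here purely for contrast) consumes all three characters.

```python
import re

# The lookahead checks what follows the a without consuming it.
lookahead_match = re.search(r"a(?!b(?!c))", "abc")
matched_text = lookahead_match.group(0)
matched_span = lookahead_match.span()

# A plain literal pattern, for contrast, consumes every character it checks.
consumed_text = re.search(r"abc", "abc").group(0)

print(matched_text, matched_span)
print(consumed_text)
```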
