Variable-length lookbehind-assertion alternatives for regular expressions


Question

Is there a regular expression implementation for Python/PHP/JavaScript that supports variable-length lookbehind assertions?

/(?<!foo.*)bar/

How can I write a regular expression that has the same meaning, but uses no lookbehind-assertion?

Is there a chance that this type of assertion will be implemented some day?

Things are much better than I thought.

Update:

(1) There are already regular expression implementations that support variable-length lookbehind assertions.

The Python module regex (not the standard re, but the separately installed regex module) supports such assertions (and has many other cool features).

>>> import regex  # third-party module, not the standard library's re
>>> m = regex.search('(?<!foo.*)bar', 'f00bar')
>>> print(m.group())
bar
>>> m = regex.search('(?<!foo.*)bar', 'foobar')
>>> print(m)
None

It was a really big surprise to me that there is something in regular expressions that Perl can't do and Python can. Perhaps there is an "enhanced regular expression" implementation for Perl as well?

(Thanks and +1 to MRAB).

(2) Modern regular expressions have a cool feature, \K.

This escape means that when you make a substitution (and from my point of view the most interesting use case for assertions is substitution), all characters that were matched before \K are left unchanged.

s/unchanged-part\Kchanged-part/new-part/x

That is almost like a lookbehind assertion, though of course not as flexible.

More about \K:

As far as I understand, you can't use \K twice in the same regular expression. And you can't say up to which point you want to "kill" the characters that you've found. That is always till the beginning of the line.

(Thanks and +1 to ikegami).

My additional questions:

  • Is it possible to specify the point at which the effect of \K ends?
  • What about enhanced regular expression implementations for Perl/Ruby/JavaScript/PHP? Something like regex for Python.

Solution

Most of the time, you can avoid variable-length lookbehinds by using \K.

s/(?<=foo.*)bar/moo/s;

would be

s/foo.*\Kbar/moo/s;

Anything up to the last \K encountered is not considered part of the match (e.g. for the purposes of replacement, $&, etc.).
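For engines without \K (Python's standard re module is one), a rough equivalent is to capture the prefix and write it back in the replacement. The following is only a sketch under that assumption; the function name replace_bar_after_foo is made up for illustration.

```python
import re

def replace_bar_after_foo(text: str) -> str:
    # Capture the prefix that \K would have kept, then re-insert it as \1.
    # count=1 mirrors Perl's s/// without /g; re.S mirrors the /s flag.
    return re.sub(r'(foo.*)bar', r'\1moo', text, count=1, flags=re.S)

print(replace_bar_after_foo('foo and bar'))  # foo and moo
print(replace_bar_after_foo('bar only'))     # bar only (no foo before bar)
print(replace_bar_after_foo('foo bar bar'))  # foo bar moo
```

Note the last call: because .* is greedy, the last bar after foo is replaced, whereas a true lookbehind would pick the first one. Using .*? instead of .* should pick the first.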

Negative lookbehinds are a little trickier.

s/(?<!foo.*)bar/moo/s;

would be

s/^(?:(?!foo).)*\Kbar/moo/s;

because (?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR.
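The same tempered-dot pattern can be tried in Python's standard re module, which has no \K, by capturing the foo-free prefix and putting it back. This is a sketch rather than a drop-in translation of the Perl; replace_bar_without_foo_before is a hypothetical helper name.

```python
import re

def replace_bar_without_foo_before(text: str) -> str:
    # (?:(?!foo).)* consumes characters only while "foo" does not start here,
    # so the captured prefix can never contain "foo".
    # The prefix is written back as \1, standing in for Perl's \K.
    return re.sub(r'^((?:(?!foo).)*)bar', r'\1moo', text, count=1, flags=re.S)

print(replace_bar_without_foo_before('f00bar'))       # f00moo
print(replace_bar_without_foo_before('foobar'))       # foobar (unchanged)
print(replace_bar_without_foo_before('bar foo bar'))  # moo foo bar
```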


If you're just matching, you might not even need the \K.

/foo.*bar/s

/^(?:(?!foo).)*bar/s
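Since these two match-only patterns use no lookbehind and no \K, they should work unchanged in Python's standard re module. A minimal sketch, with helper names invented for illustration:

```python
import re

# re.S plays the role of Perl's /s flag (dot also matches newlines).
FOO_BEFORE_BAR = re.compile(r'foo.*bar', re.S)
NO_FOO_BEFORE_BAR = re.compile(r'^(?:(?!foo).)*bar', re.S)

def has_bar_after_foo(text: str) -> bool:
    # True if some "bar" has a "foo" somewhere before it.
    return FOO_BEFORE_BAR.search(text) is not None

def has_bar_without_foo_before(text: str) -> bool:
    # True if some "bar" has no "foo" anywhere before it.
    return NO_FOO_BEFORE_BAR.search(text) is not None

print(has_bar_after_foo('foobar'), has_bar_without_foo_before('foobar'))  # True False
print(has_bar_after_foo('f00bar'), has_bar_without_foo_before('f00bar'))  # False True
```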
