扩展正则表达式是否支持反向引用? [英] Do extended regexes support back-references?

查看:86
本文介绍了扩展正则表达式是否支持反向引用?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Wikipedia 扩展了正则表达式放弃了对反向引用的支持",因此基本"正则表达式模式必须用于启用这些功能.但是,似乎许多实现 do 支持扩展正则表达式的反向引用.例如,Ubuntu Precise上的gcc 4.6支持它们. FreeBSD实现似乎仅在基本模式下支持它们.

Wikipedia says that extended regexes "dropped support for backreferences", so the "basic" regex mode has to be used to enable those. However, it seems that a number of implementations do support backreferences for extended regexes. For example, with gcc 4.6 on Ubuntu Precise, they are supported. FreeBSD implementations seem to support them only in basic mode.

Boost表示(似乎与Wikipedia一致)扩展的正则表达式不支持反向引用,但是Boost :: Regex将它们添加为扩展.

Boost says (and seems to agree with Wikipedia) that backreferences are not supported for extended regexes, but Boost::Regex adds them as an extension.

难道这只是标准中定义不明确的部分,每个实现对它的解释都不同吗?

Is this just a poorly defined part of the standard which is interpreted differently by every implementation?

推荐答案

正如其他人已经指出的那样,很明显POSIX ERE不支持反向引用.

As others have already pointed out, it's quite clear that POSIX EREs do not support back-references.

OpenGroup 基本规范第7期中给出的不向ERE添加反向引用的理由如下:

The rationale given in the OpenGroup Base Specifications Issue 7 for not adding back-references to EREs is given as:

建议除了间隔表达式外,还应将反向引用('\ n')添加到ERE中.标准开发人员拒绝了这种做法,因为这可能会降低共识.

It was suggested that, in addition to interval expressions, back-references ( '\n' ) should also be added to EREs. This was rejected by the standard developers as likely to decrease consensus.

引用自:定义:基本定义:扩展的正则表达式

此限制的主要原因是允许将POSIX ERE转换为确定性有限自动机(DFA),实际上Unix中ERE的原始实现是作为DFA完成的.使用DFA可以保证实现的性能.与(无数个)反向引用进行模式匹配是一个NP难题,甚至可能是一个NP完全难题.如果对ERE提出反向引用,则永远无法在POSIX标准委员会中达成共识,因为这将迫使所有使用原始Unix实现的公司将其代码更改为非确定性实现,并放弃其性能保证,还有一些这些公司中有委员会成员.

The primary reason for this limitation is to allow POSIX EREs to be converted to a deterministic finite automata (DFA), and indeed the original implementation of EREs in Unix was done as a DFA. The use of a DFA allows guarantees to be made about the performance of the implementation. Pattern matching with (an unbounded number of) back references is an NP-hard problem, and perhaps even an NP-complete problem. Consensus in the POSIX standards committee could never have been reached if back-references were proposed for EREs because that would force all the companies using the original Unix implementation to change their code to a non-deterministic implementation and to drop their performance guarantees, and some of those companies had members on the committee.

也已经注意到,RE中的反向引用对于用户或实施者都不直观,实际上,它们比现在更频繁地引起极大的混乱.例如,请参见 RE解释中给出的示例:黑暗角落

It has also been noted that back-references in REs are not intuitive for either users or implementors, and indeed they've caused extreme confusion more often than now. See for example the examples given in RE-Interpretation: The Dark Corners

注意:RE中的反向引用与sed等工具中替换文本中对子模式的引用不同.

NOTE: back-references in REs are not the same as references to sub-patterns in substitution text in tools such as sed.

这篇关于扩展正则表达式是否支持反向引用?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆