PHP regex non-capture non-match group


Question

I'm making a date-matching regex, and it's all going pretty well. I've got this so far:

"/(?:[0-3])?[0-9]-(?:[0-1])?[0-9]-(?:20)[0-1][0-9]/"

It will (hopefully) match single or double digit days and months, and double or quadruple digit years in the 21st century. A few trials and errors have gotten me this far.

But, I've got two simple questions regarding these results:

  1. (?: ) what is a simple explanation for this? Apparently it's a non-matching group. But then...

  2. What is the trailing ? for? e.g. (? )?

Solution

[Edited (again) to improve formatting and fix the intro.]

This is a comment and an answer.

The answer part... I do agree with alex's earlier answer.

  1. (?: ), in contrast to ( ), groups without capturing the text, generally so that you have fewer backreferences mixed in with the ones you do want, or to improve performance.

  2. The ? following the (?: ) -- or following anything except * + ? or {}, where it makes the quantifier lazy instead -- means that the preceding item may or may not be present in a legitimate match. E.g., /z34?/ will match z3 as well as z34, but it won't match z on its own, and in z35 it matches only the z3 part.
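The two points above can be checked with a short PHP sketch. It assumes `preg_match` is the function in use; the sample strings and variable names are made up for illustration, and the `^` and `$` anchors are added here so that the whole string has to match.

```php
<?php
// Capturing group: the optional "20" is stored as group 1.
preg_match('/(20)?[0-9][0-9]/', '2011', $capturing);

// Non-capturing group: the same text matches, but no extra group is stored.
preg_match('/(?:20)?[0-9][0-9]/', '2011', $nonCapturing);

// The trailing ? makes the preceding item (here the 4) optional.
// Anchors are an addition for this demo: they force a whole-string match.
$optional = [];
foreach (['z3', 'z34', 'z35', 'z'] as $sample) {
    $optional[$sample] = preg_match('/^z34?$/', $sample) === 1;
}

var_dump($capturing, $nonCapturing, $optional);
```

With the anchors in place, z3 and z34 pass while z35 and z do not. Without them, the pattern would still find z3 inside z35.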

The comment part... I made what might be considered improvements to the regex you were working on:

(?:^|\s)(0?[1-9]|[1-2][0-9]|30|31)-(0?[1-9]|10|11|12)-((?:20)?[0-9][0-9])(?:\s|$)

-- First, it avoids things like 0-0-2011

-- Second, it avoids things like 233443-4-201154564

-- Third, it includes things like 1-1-2022

-- Fourth, it includes things like 1-1-11

-- Fifth, it avoids things like 34-4-11

-- Sixth, it lets you capture the day, month, and year so you can refer to them more easily in code, for example code that does a further check to see whether a Feb 29 date is valid: if the second captured group is 2, then either the first captured group is 29 and the year is a leap year, or the first captured group is less than 29.
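As a rough illustration of the sixth point, the sketch below pulls the three captured groups out with `preg_match` and hands them to PHP's built-in `checkdate()`, which covers the leap-year check. The helper name `parseDate` is hypothetical, and treating a two-digit year as 20xx is an assumption based on the 21st-century scope of the question.

```php
<?php
// Pattern from the answer, wrapped in / delimiters for preg_match.
$pattern = '/(?:^|\s)(0?[1-9]|[1-2][0-9]|30|31)-(0?[1-9]|10|11|12)-((?:20)?[0-9][0-9])(?:\s|$)/';

// Hypothetical helper: returns [day, month, year] or null if no valid date is found.
function parseDate(string $pattern, string $text): ?array
{
    if (preg_match($pattern, $text, $m) !== 1) {
        return null;
    }
    $day = (int) $m[1];   // first captured group
    $month = (int) $m[2]; // second captured group
    $year = (int) $m[3];  // third captured group
    if ($year < 100) {
        $year += 2000;    // assumption: two-digit years mean 20xx
    }
    // checkdate() rejects impossible dates, including Feb 29 in a non-leap year.
    return checkdate($month, $day, $year) ? [$day, $month, $year] : null;
}

var_dump(parseDate($pattern, '29-2-2012')); // leap year
var_dump(parseDate($pattern, '29-2-11'));   // 2011 is not a leap year
```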

Finally, note that you'll still match dates that don't exist, e.g., 31-6-11. If you want to avoid these, then try:

(?:^|\s)(?:(?:(0?[1-9]|[1-2][0-9]|30|31)-(0?[13578]|10|12))|(?:(0?[1-9]|[1-2][0-9]|30)-(0?[469]|11))|(?:(0?[1-9]|[1-2][0-9])-(0?2)))-((?:20)?[0-9][0-9])(?:\s|$)
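One thing to watch with this longer pattern is that each month-length alternative has its own pair of capturing groups, so the day and month land in groups 1-2, 3-4, or 5-6 depending on which alternative matched, and the year is group 7. The sketch below shows that, followed by a variant using PCRE's branch-reset group `(?| )` so the numbering stays at 1, 2, 3. The variant is an assumption of this example, not part of the original answer.

```php
<?php
// Stricter pattern from the answer, unchanged apart from the / delimiters.
$strict = '/(?:^|\s)(?:(?:(0?[1-9]|[1-2][0-9]|30|31)-(0?[13578]|10|12))|(?:(0?[1-9]|[1-2][0-9]|30)-(0?[469]|11))|(?:(0?[1-9]|[1-2][0-9])-(0?2)))-((?:20)?[0-9][0-9])(?:\s|$)/';

preg_match($strict, '30-4-11', $m);
// April is handled by the second alternative: day and month are groups 3 and 4,
// groups 1 and 2 are empty strings, and the year is group 7.
$april = [$m[3], $m[4], $m[7]];

// Variant (not from the answer): (?| ... ) restarts group numbering in each branch,
// so day, month, and year are always groups 1, 2, and 3.
$reset = '/(?:^|\s)(?|(0?[1-9]|[1-2][0-9]|30|31)-(0?[13578]|10|12)|(0?[1-9]|[1-2][0-9]|30)-(0?[469]|11)|(0?[1-9]|[1-2][0-9])-(0?2))-((?:20)?[0-9][0-9])(?:\s|$)/';
preg_match($reset, '30-4-11', $r);
$aprilReset = [$r[1], $r[2], $r[3]];

// June has only 30 days, so this returns 0.
$rejected = preg_match($strict, '31-6-11');

var_dump($april, $aprilReset, $rejected);
```

Note that both patterns still accept 29-2 in any year, so a leap-year check in code remains necessary.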

Also, I assumed the dates would be preceded and followed by whitespace (or the beginning/end of the line), but you may want to adjust that (e.g., to allow punctuation).
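One possible adjustment, sketched here as an assumption rather than as the answer's own method, is to swap the whitespace boundaries for lookarounds that only forbid an adjacent digit. That way dates can sit next to commas or parentheses, while long digit runs are still rejected. The sample sentence is invented.

```php
<?php
// (?<![0-9]) : not preceded by a digit; (?![0-9]) : not followed by a digit.
$loose = '/(?<![0-9])(0?[1-9]|[1-2][0-9]|30|31)-(0?[1-9]|10|11|12)-((?:20)?[0-9][0-9])(?![0-9])/';

// preg_match_all collects every date in the text; $found[0] holds the full matches.
$count = preg_match_all($loose, 'Due 1-1-11, shipped (30-4-2012).', $found);

// Digit runs on either side still prevent a match.
$junk = preg_match($loose, '233443-4-201154564');

var_dump($count, $found[0], $junk);
```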

A commenter elsewhere referenced this resource which you might find useful: http://rubular.com/
