Repeated capturing group PCRE
Question
I can't understand why this regex (regex101)
/[\|]?([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures all the input, while this one (regex101)
/[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures only |Func.
The input string is |Func(param1,param2,param32,param54,param293,par13am,param)|
Also, how can I match a repeated capturing group in the normal way? E.g. I have the regex
/\(\(\s*([a-z\_]+){1}(?:\s+\,\s+(\d+)*)*\s*\)\)/gui
and the input string is (((string,1,2)).
Regex101 says "a repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations...". I tried to follow this tip, but it didn't help.
Recommended answer
Your /[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g regex does not match the rest of the input because you did not define a pattern to match the words inside the parentheses. You might fix it as \|+([a-z0-9A-Z]+)(?:\(?(\w+(?:\s*,\s*\w+)*)\)?)?\|? but all the values inside the parentheses would then be matched into one single group that you would have to split later.
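A minimal sketch of this capture-then-split approach, using Python's built-in re module as a stand-in for PCRE. That substitution is an assumption, not part of the original answer; the constructs used in these two patterns behave the same way in both engines. The variable names are illustrative only.

```python
import re

text = "|Func(param1,param2,param32,param54,param293,par13am,param)|"

# Original pattern: nothing in it can match the words between the
# parentheses, so the only match it finds is "|Func".
original = re.compile(r"[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?")
print(original.findall(text))  # group 1 of every match

# Fixed pattern: group 1 is the name, group 2 is the whole parameter list.
fixed = re.compile(r"\|+([a-z0-9A-Z]+)(?:\(?(\w+(?:\s*,\s*\w+)*)\)?)?\|?")
match = fixed.search(text)
name = match.group(1)

# Group 2 is one string ("param1,param2,..."), so it is split afterwards.
params = re.split(r"\s*,\s*", match.group(2))
print(name, params)
```

The split step is what the answer means by having to split the group later: the regex itself only delivers the whole list as a single capture.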
It is not possible to get an arbitrary number of captures with a PCRE regex: when a capturing group is repeated, only the last captured value is stored in the group buffer.
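The behaviour can be seen in a small sketch. It uses Python's re module, which overwrites repeated groups in the same way (an assumption made so the example is easy to run), and a simplified, hypothetical pattern and input rather than the asker's exact regex.

```python
import re

# (\w+) sits inside a repeated non-capturing group, so its buffer is
# overwritten on every iteration of the repetition.
repeated = re.compile(r"\(\((?:(\w+),?)+\)\)")
match = repeated.fullmatch("((string,1,2))")
last_only = match.group(1)  # only the final iteration survives
print(last_only)

# The regex101 tip: wrap the whole repetition in its own group. This
# yields one string covering all iterations, which still has to be split.
wrapped = re.compile(r"\(\(((?:\w+,?)+)\)\)")
all_items = wrapped.fullmatch("((string,1,2))").group(1).split(",")
print(all_items)
```

Wrapping the repetition does not create one group per item; it only makes the full repeated text available as a single capture.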
What you may do is get multiple matches with preg_match_all, capturing the initial delimiter.
So, to match the second string, you may use
(?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\()\K\w+
See the regex demo.
Details:
- (?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\() - either the end of the previous match (\G(?!\A)) and a comma enclosed with 0+ whitespaces (\s*,\s*), or 1+ | symbols (\|+), followed with 1+ alphanumeric chars (captured into Group 1, ([a-z0-9A-Z]+)) and a ( symbol (\()
- \K - omit the text matched so far
- \w+ - 1+ word chars.
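Python's built-in re module supports neither \G nor \K, so the pattern above cannot be run there as written. The sketch below only emulates the chaining idea: each new word must start exactly where the previous match ended. The helper name chained_words and the three sub-patterns are hypothetical, introduced purely for illustration. Unlike the PCRE pattern, where Group 1 is set only in the first match of each chain, this helper attaches the name to every word.

```python
import re

HEAD = re.compile(r"\|+([a-z0-9A-Z]+)\(")  # second alternative: "|Func("
SEP = re.compile(r"\s*,\s*")               # first alternative, without \G(?!\A)
WORD = re.compile(r"\w+")                  # the part kept after \K


def chained_words(text):
    """Return (name, word) pairs, anchoring each word at the previous end."""
    results = []
    pos = 0
    while True:
        head = HEAD.search(text, pos)
        if head is None:
            break
        name = head.group(1)
        pos = head.end()
        # Pattern.match(text, pos) only succeeds at pos, like \G in PCRE.
        word = WORD.match(text, pos)
        while word:
            results.append((name, word.group()))
            pos = word.end()
            sep = SEP.match(text, pos)
            if sep is None:
                break  # the chain is broken, e.g. by ")"
            word = WORD.match(text, sep.end())
    return results


pairs = chained_words("|Func(param1,param2,param32,param54,param293,par13am,param)|")
print(pairs)
```

Anchoring every step at the previous end position is what stops the words from being picked up anywhere else in the input, which is the role \G(?!\A) plays in the original pattern.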