How does the regular expression ‘(?<=#)[^#]+(?=#)’ work?
Question
I have the following regex in a C# program, and have difficulties understanding it:
(?<=#)[^#]+(?=#)
I'll break it down to what I think I understood:
(?<=#) a group, matching a hash. what's `?<=`?
[^#]+ one or more non-hashes (used to achieve non-greediness)
(?=#) another group, matching a hash. what's the `?=`?
So the problem I have is the ?<= and ?< part. From reading MSDN, ?<name> is used for naming groups, but in this case the angle bracket is never closed.
I couldn't find ?= in the docs, and searching for it is really difficult, because search engines mostly ignore those special characters.
Recommended answer
They are called lookarounds; they allow you to assert if a pattern matches or not, without actually making the match. There are 4 basic lookarounds:
- Positive lookarounds: see if we CAN match the pattern...
  - (?=pattern) ... to the right of the current position (look ahead)
  - (?<=pattern) ... to the left of the current position (look behind)
- Negative lookarounds: see if we can NOT match the pattern...
  - (?!pattern) ... to the right
  - (?<!pattern) ... to the left
As an easy reminder, for a lookaround:
- = is positive, ! is negative
- < is look behind, otherwise it's look ahead
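The four forms can be tried out in a few lines. This is only a minimal sketch: it uses Python's `re` module rather than the C# `Regex` class from the question (an assumption made for brevity; the lookaround syntax shown here is the same in both), and the sample strings are made up for illustration.

```python
import re

text = "a#b"  # hypothetical sample input

# Positive lookahead: match "a" only if "#" follows; the "#" is not consumed.
ahead = re.findall(r"a(?=#)", text)
# Positive lookbehind: match "b" only if "#" comes right before it.
behind = re.findall(r"(?<=#)b", text)
# Negative lookahead: match a letter that is NOT followed by "#".
not_ahead = re.findall(r"[a-z](?!#)", text)
# Negative lookbehind: match a letter that is NOT preceded by "#".
not_behind = re.findall(r"(?<!#)[a-z]", text)

print(ahead, behind, not_ahead, not_behind)  # ['a'] ['b'] ['b'] ['a']

# The pattern from the question: text between two hashes, hashes excluded.
inner = re.search(r"(?<=#)[^#]+(?=#)", "x #hello world# y").group(0)
print(inner)  # hello world
```

Note that the matched text never contains a hash: the lookarounds only check that one is there.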
One might argue that lookarounds in the pattern above aren't necessary, and #([^#]+)# will do the job just fine (extracting the string captured by \1 to get the non-# part).

Not quite. The difference is that since a lookaround doesn't match the #, it can be "used" again by the next attempt to find a match. Simplistically speaking, lookarounds allow "matches" to overlap.

Consider the following input string:
and #one# and #two# and #three#four#
Now, #([a-z]+)# will give the following matches (as seen on rubular.com):
and #one# and #two# and #three#four#
    \___/     \___/     \_____/
Compare this with (?<=#)[a-z]+(?=#), which matches:
and #one# and #two# and #three#four#
     \_/       \_/       \___/ \__/
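The difference between the two patterns can be checked with a short script. This is a sketch in Python's `re` module, chosen here as an assumption instead of the C# from the question; the variable names are illustrative only.

```python
import re

text = "and #one# and #two# and #three#four#"

# Consuming pattern: both hashes are part of each match, so the "#"
# between "three" and "four" is used up and "four" is never found.
consuming = [m.group(1) for m in re.finditer(r"#([a-z]+)#", text)]

# Lookaround pattern: the hashes are only asserted, never consumed,
# so the shared "#" can delimit both "three" and "four".
lookaround = [m.group(0) for m in re.finditer(r"(?<=#)[a-z]+(?=#)", text)]

print(consuming)   # ['one', 'two', 'three']
print(lookaround)  # ['one', 'two', 'three', 'four']
```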
Unfortunately this can't be demonstrated on rubular.com, since it doesn't support lookbehind. However, it does support lookahead, so we can do something similar with #([a-z]+)(?=#), which matches (as seen on rubular.com):
and #one# and #two# and #three#four#
    \__/      \__/      \____/\___/
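The lookahead-only variant can be sketched the same way. Python's `re` module is again an assumption standing in for the regex engine; the point is that the leading # is consumed as part of each match while the trailing one is only asserted.

```python
import re

text = "and #one# and #two# and #three#four#"

# The leading "#" is consumed; the trailing "#" is only checked by the
# lookahead, so it remains available as the leading "#" of the next match.
pairs = [(m.group(0), m.group(1))
         for m in re.finditer(r"#([a-z]+)(?=#)", text)]

for whole, captured in pairs:
    print(whole, captured)
# #one one
# #two two
# #three three
# #four four
```

Group 1 still gives the text without the hash, which is why this works as a substitute when lookbehind is unavailable.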
References
- regular-expressions.info/Flavor Comparison