UB: C#'s Regex.Match returns whole string instead of part when matching
Problem description
Note: this is not the same as the related question about a regex returning the whole string instead of the match.
Hi all. I'm trying to do:
Match y = Regex.Match(someHebrewContainingLine, @"^.{0,9} - \[(.*)?\s\d{1,3}");
Aside from the other VS Hebrew quirks (how do you like ] being swapped for [ when editing the string?), it occasionally returns crazy results:
Match.Captures.Count = 1;
Match.Captures[0] = whole string! (not expected)
Match.Groups.Count = 2; (not expected)
Match.Groups[0] = whole string again! (not expected)
Match.Groups[1] = (.*)? value (expected).
Regex.Matches() behaves the same way.
What could be a general reason for such behaviour? Note: it does not act this way on simple test strings like Regex.Match("-היי45--", "-(.{1,5})-") (the sample is displayed incorrectly; please look at the page's source code), so there must be something in the regex that makes it greedy. The matched string contains [ .... ], but simply adding them to the test string doesn't cause the same effect.
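For reference, the simple test call quoted above can be run as a small sketch that prints the whole match next to the captured group. The class name `SimpleMatchDemo` and the helper `Run` are illustrative only and are not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimpleMatchDemo
{
    // Hypothetical helper: returns the whole matched text (Groups[0])
    // and the first parenthesized capture (Groups[1]).
    public static (string Whole, string Group1) Run(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return (m.Groups[0].Value, m.Groups[1].Value);
    }

    public static void Main()
    {
        // Input and pattern are taken from the question's simple test.
        var (whole, group1) = Run("-היי45--", "-(.{1,5})-");

        // Groups[0] holds everything the pattern consumed, both dashes included.
        Console.WriteLine($"Groups[0] = {whole}");
        // Groups[1] holds only the text inside the parentheses.
        Console.WriteLine($"Groups[1] = {group1}");
    }
}
```

Here Groups[0] is one character shorter than the input, because the final dash lies outside the pattern. That is why this test string does not look like "the whole string" coming back.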
Recommended answer
My test regex was different from all the others in the project's scope (that's what happens when a Perl guy comes to C#), as it had no lookaheads/lookbehinds. So this discovery took some time.
Now, why we should call this Regex behaviour undocumented rather than undefined:
Let's run a match against 1.234567890.
- PCRE-like syntax:
(.)\.2345678
- lookahead syntax:
(.)(?=\.\d)
When you're doing a normal match, the result is copied from the whole matched part of the line, no matter where you've put the parentheses (the parenthesized part is available separately as Groups[1]). When lookaheads are present, the text they match is not consumed, so only what lies outside them is copied.
So, the match will return:
- PCRE:
1.2345678
(at 23:00 this looks like the original string, and I start yelling here on SO)
- lookahead:
1
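The two results above can be reproduced with a short sketch. The class name `LookaheadDemo` and its helper methods are hypothetical names chosen for illustration; only `Regex.Match`, `Match.Value`, and `Match.Groups` come from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Whole matched text, i.e. the same value as Groups[0].
    public static string WholeMatch(string input, string pattern)
        => Regex.Match(input, pattern).Value;

    // Text captured by the first pair of parentheses.
    public static string FirstGroup(string input, string pattern)
        => Regex.Match(input, pattern).Groups[1].Value;

    public static void Main()
    {
        const string input = "1.234567890";
        const string consuming = @"(.)\.2345678"; // PCRE-like: the tail is consumed
        const string lookahead = @"(.)(?=\.\d)";  // the tail is only checked, not consumed

        Console.WriteLine(WholeMatch(input, consuming)); // 1.2345678
        Console.WriteLine(FirstGroup(input, consuming)); // 1
        Console.WriteLine(WholeMatch(input, lookahead)); // 1
        Console.WriteLine(FirstGroup(input, lookahead)); // 1
    }
}
```

In both cases Groups[1] is the same. Only the whole match differs, because text matched inside a lookahead is never part of it.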