UB: C#'s Regex.Match returns the whole string instead of a part when matching


Question

Note: this is not related to the usual regex problem of a pattern matching the whole string instead of a part.

Hi all. I am trying to do:

Match y = Regex.Match(someHebrewContainingLine, @"^.{0,9} - \[(.*)?\s\d{1,3}");



Aside from the other VS Hebrew quirks (how do you like it swapping ] for [ while you edit the string?), it occasionally returns crazy results:

Match.Captures.Count = 1;
Match.Captures[0] = whole string! (not expected)
Match.Groups.Count = 2; (not expected)
Match.Groups[0] = whole string again! (not expected)
Match.Groups[1] = (.*)? value (expected).



Regex.Matches() behaves the same way.

What could be a general reason for such behaviour? Note: it does not act this way on simple test strings like Regex.Match("-היי45--", "-(.{1,5})-") (the sample is displayed incorrectly; please look at the page's source code), so there must be something in the regex that makes it greedy. The matched string contains [ .... ], but simply adding them to the test string does not cause the same effect.
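For reference, here is a minimal sketch of what the Match object exposes for the pattern above. The input line is a made-up ASCII stand-in (the original Hebrew line is not shown), and the class and method names are hypothetical. As far as I know, Groups[0] in .NET always holds the entire matched text and numbered groups start at 1, so whenever the pattern happens to cover the full line, Groups[0] looks like "the whole string".

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Pattern copied from the question.
    public const string Pattern = @"^.{0,9} - \[(.*)?\s\d{1,3}";

    // Hypothetical helper: runs the question's pattern on one line.
    public static Match Run(string line)
    {
        return Regex.Match(line, Pattern);
    }

    public static void Main()
    {
        // Assumed sample input, not taken from the original post.
        Match m = Run("abc - [hello 12] tail");

        Console.WriteLine(m.Groups.Count);       // 2: group 0 plus one capturing group
        Console.WriteLine(m.Groups[0].Value);    // abc - [hello 12  (entire match)
        Console.WriteLine(m.Groups[1].Value);    // hello            (the parenthesized part)
        Console.WriteLine(m.Captures[0].Value);  // abc - [hello 12  (same text as group 0)
    }
}
```

With the trailing "] tail" removed from the input, Groups[0] would equal the full line, which matches the symptom listed above.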

Recommended answer

My test regex was different from all the others in the project's scope (that's what happens when a Perl guy comes to C#), as it had no lookaheads/lookbehinds. So this discovery took some time.

Now, why we should call this Regex behaviour undocumented rather than undefined:

Let's run a match against the string 1.234567890 using:

  • PCRE-like syntax: (.)\.2345678
  • lookahead syntax: (.)(?=\.\d)

When you do a normal match, the result is the whole matched part of the line, no matter where you put the parentheses. When lookaheads are present, the text they check is not part of the match, so only what lies outside them is returned.

So the match returns:


  • PCRE: 1.2345678 (at 2300 this looks like the original string, and I start yelling here on SO)
  • lookahead: 1
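The two results can be reproduced with a short sketch like the one below. The class and method names are hypothetical; the patterns and the input are the ones listed above. It also prints Groups[1], which holds the parenthesized part in both cases, so that is probably the member to read when only the captured piece is wanted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    public const string Input = "1.234567890";

    // Hypothetical helper: entire matched text (same as Groups[0].Value).
    public static string Whole(string pattern)
    {
        return Regex.Match(Input, pattern).Value;
    }

    // Hypothetical helper: text of the first capturing group.
    public static string FirstGroup(string pattern)
    {
        return Regex.Match(Input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        string plain = @"(.)\.2345678";     // consumes the dot and the digits
        string lookahead = @"(.)(?=\.\d)";  // only checks them, zero-width

        Console.WriteLine(Whole(plain));           // 1.2345678
        Console.WriteLine(FirstGroup(plain));      // 1
        Console.WriteLine(Whole(lookahead));       // 1
        Console.WriteLine(FirstGroup(lookahead));  // 1
    }
}
```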
