Regular expression is being too greedy
Question
I am trying to write a regular expression, but it is being too greedy. The input string could be in either of the following formats:
STUFF_12_1234 or STUFF_1234
What I want to do is create a regular expression that grabs the characters after the last _. In the above examples, that would be "1234". The number of characters after the last _ varies, and they could be a combination of letters and numbers. I have tried the following expression:
_(.*?)\Z
This works for "STUFF_1234" by returning "1234", but when I use it against "STUFF_12_1234" it returns "12_1234".
Can anyone advise on how the expression should be changed to fix this?
Recommended answer
There are at least 3 ways to grab the text appearing after the last underscore _:
- Keep the current regex, but specify RegexOptions.RightToLeft. Since the regex then searches from right to left, starting at the end of the string, the lazy quantifier matches as few characters as possible and stops at the last _ in the string.
- Modify the regex to disallow the underscore _ in the text you want to match:
_([^_]*)\Z
- Split the input string by _ and pick the last item. For this, String.Split is sufficient; there is no need for Regex.Split.
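The three approaches can be compared side by side. The following is a minimal sketch, assuming a C# console program; the class name LastSegment and its method names are made up for illustration and do not come from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
static class LastSegment
{
    // Approach 1: keep the original pattern, but match from right to left.
    public static string ViaRightToLeft(string input) =>
        Regex.Match(input, @"_(.*?)\Z", RegexOptions.RightToLeft).Groups[1].Value;

    // Approach 2: the captured text is not allowed to contain an underscore.
    public static string ViaNegatedClass(string input) =>
        Regex.Match(input, @"_([^_]*)\Z").Groups[1].Value;

    // Approach 3: no regex at all, split on '_' and take the last piece.
    public static string ViaSplit(string input) =>
        input.Split('_').Last();
}

static class Program
{
    static void Main()
    {
        // The two sample inputs from the question.
        foreach (string input in new[] { "STUFF_12_1234", "STUFF_1234" })
        {
            Console.WriteLine(LastSegment.ViaRightToLeft(input));
            Console.WriteLine(LastSegment.ViaNegatedClass(input));
            Console.WriteLine(LastSegment.ViaSplit(input));
        }
    }
}
```

For both sample inputs, each method should yield 1234. The variants differ when the input contains no underscore at all: String.Split returns the whole string as its only item, whereas the two regex versions find no match.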