Regular expression is being too greedy


Question

I am trying to write a regular expression, but it is being too greedy. The input string can be in either of the following formats:

STUFF_12_1234 or STUFF_1234

What I want is a regular expression that grabs the characters after the last _. In the examples above, that would be "1234". The number of characters after the last _ varies, and they can be a combination of letters and digits. I have tried the following expression:

_(.*?)\Z

This works for "STUFF_1234", returning "1234", but when I use it against "STUFF_12_1234" it returns "12_1234".

Can anyone advise on how the expression should be changed to fix this?

Recommended answer

There are at least three ways to grab the text that appears after the last underscore _:


  • Keep the current regex, but specify RegexOptions.RightToLeft. Because the regex is then searched from right to left, the match starts at the end of the string and extends leftward, so the lazy quantifier takes as few characters as possible and stops just after the last _ in the string.

  • Modify the regex to disallow underscore _ in the text you want to match:

_([^_]*)\Z


  • Split the input string on _ and pick the last item. String.Split is sufficient for this; there is no need for Regex.Split.
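As a rough illustration of the second and third options, here is a minimal sketch in Python. Python is an assumption made for this example only: the answer names .NET APIs, and the helper names `tail_by_regex` and `tail_by_split` are hypothetical. The right-to-left option is not shown because Python's standard `re` module has no equivalent setting.

```python
import re

# Pattern from the question. The engine scans left to right, so the match
# begins at the first "_" and the lazy group must stretch to reach the end.
lazy = re.compile(r"_(.*?)\Z")

# Pattern from the answer. The group cannot contain "_", so only the last
# underscore can begin a match that reaches the end of the string.
no_underscore = re.compile(r"_([^_]*)\Z")


def tail_by_regex(text):
    """Return the text after the last "_", or None if there is no "_"."""
    match = no_underscore.search(text)
    return match.group(1) if match else None


def tail_by_split(text):
    """Split on "_" and keep the last piece (the whole string if no "_")."""
    return text.split("_")[-1]


for sample in ("STUFF_12_1234", "STUFF_1234"):
    print(sample, lazy.search(sample).group(1),
          tail_by_regex(sample), tail_by_split(sample))
```

The two helpers differ when the input contains no underscore at all: the regex version finds no match, while the split version returns the whole string. Which behavior is preferable depends on how such input should be treated.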
