RegExp: Last occurrence of pattern that occurs before another pattern

Question

I want to extract the text pattern that occurs last before another text pattern.

For example, I have this text:

code 4ab6-7b5
Another lorem ipsum
Random commentary.

code f6ee-304
Lorem ipsum text 
Dummy text

code: ebf6-649
Other random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e

code: abcd-ebf
Random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e

I want to get the last code that occurs before the first occurrence of id-x (that is, I want to get the code ebf6-649).

How can I do that with a regular expression?

Recommended answer

If your regex flavor supports lookaheads, you can use a solution like this:

^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x

You can find your result in capture group 1.
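
As a quick check, here is a minimal sketch that runs the pattern against the sample text from the question. It assumes Python's re module, which is only one possible flavor; the names TEXT, PATTERN and last_code_before_id are made up for this illustration.

```python
import re

# Sample text from the question (note that only the last two blocks
# use "code: " with a colon).
TEXT = """\
code 4ab6-7b5
Another lorem ipsum
Random commentary.

code f6ee-304
Lorem ipsum text
Dummy text

code: ebf6-649
Other random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e

code: abcd-ebf
Random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
"""

# re.MULTILINE lets ^ match at the start of every line.
PATTERN = re.compile(
    r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x",
    re.MULTILINE,
)


def last_code_before_id(text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(last_code_before_id(TEXT))  # prints ebf6-649
```

search() returns the leftmost match, and the leftmost block that can reach an id-x without crossing a new "code: " line is exactly the last code before the first id-x.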

它如何工作?

^code:[ ]           # match "code: " at the beginning of a line, the square 
                    # brackets are just to aid readability. I recommend always
                    # using them for literal spaces.

(                   # capturing group 1, your key
  [0-9a-f-]+        # match one or more hex-digits or hyphens
)                   # end of group 1

(?:                 # start a non-capturing group; each "instance" of this group
                    # will match a single arbitrary character that does not start
                    # a new "code: " (hence this cannot go beyond the current
                    # block)

  (?!               # negative lookahead; this does not consume any characters,
                    # but causes the pattern to fail, if its subpattern could
                    # match here

    ^code:[ ]       # match the beginning of a new block (i.e. "code: " at the
                    # beginning of another line)

  )                 # end of negative lookahead; if we've reached the beginning
                    # of a new block, this will cause the non-capturing group to
                    # fail. Otherwise just ignore this.

  [\s\S]            # match one arbitrary character
)*                  # end of non-capturing group, repeat 0 or more times
id-x                # match "id-x" literally
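
Some flavors can run a commented layout like the one above directly in a free-spacing mode. The following is a sketch assuming Python's re.VERBOSE flag; the sample string and variable names are invented for illustration. In this mode unescaped whitespace in the pattern is ignored, which is one practical reason for writing the literal space as [ ].

```python
import re

# Free-spacing version of the pattern; comments and layout whitespace
# are ignored by the regex engine because of re.VERBOSE.
VERBOSE_PATTERN = re.compile(
    r"""
    ^code:[ ]            # "code: " at the start of a line
    ([0-9a-f-]+)         # group 1: the key
    (?:                  # consume one character at a time...
      (?!^code:[ ])      # ...unless a new block starts here
      [\s\S]
    )*
    id-x                 # literal "id-x"
    """,
    re.MULTILINE | re.VERBOSE,
)

# Hypothetical input: the first block has no id-x, the second one does.
sample = (
    "code: aaaa-111\n"
    "no marker here\n"
    "\n"
    "code: bbbb-222\n"
    "some text\n"
    "id-x: 1234\n"
)

match = VERBOSE_PATTERN.search(sample)
key = match.group(1) if match else None
print(key)  # prints bbbb-222
```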

The (?:(?!stopword)[\s\S])* pattern lets you match as much as possible without going beyond another occurrence of stopword.
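
The same building block can be generated for any literal stop word. This is a small sketch in Python; the helper name tempered and the begin/end sample are hypothetical and not part of the original answer.

```python
import re


def tempered(stop_word):
    """Sub-pattern matching any run of characters that never crosses
    the start of stop_word (treated as a literal string)."""
    return r"(?:(?!" + re.escape(stop_word) + r")[\s\S])*"


# Capture what lies between "begin" and "end" without crossing
# another "begin".
pattern = re.compile(r"begin(" + tempered("begin") + r")end")

text = "begin a begin b end"
inner = pattern.search(text).group(1)
print(repr(inner))  # prints ' b '
```

The attempt starting at the first "begin" fails because it cannot reach "end" without running into the second "begin", so the match starts at the second one.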

Note that you might have to enable some form of multi-line mode for ^ to match at the beginning of each line. The ^ anchor is important: without it, a "code: " that happens to appear inside your random text would be treated as the start of a new block, so the correct block could be rejected.
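
To see both effects, here is a sketch assuming Python's re module, with an invented sample in which the random text of the second block mentions "code: " in the middle of a line.

```python
import re

RAW = r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x"

# Hypothetical input: the second block contains a stray inline "code: ".
text = (
    "code: 1111-aaa\n"
    "no marker in this block\n"
    "\n"
    "code: 2222-bbb\n"
    "a note mentioning code: 9999-fff inline\n"
    "id-x: 42\n"
)

# Without multi-line mode, ^ only matches at the very start of the
# string, so the lookahead never stops the scan at later blocks.
no_flag = re.search(RAW, text)

# With multi-line mode, ^ matches at the start of every line.
with_flag = re.search(RAW, text, re.MULTILINE)

# With both ^ anchors removed, the stray inline "code: " counts as a
# block start.
unanchored = re.search(RAW.replace("^", ""), text)

print(no_flag.group(1))     # prints 1111-aaa (wrong block)
print(with_flag.group(1))   # prints 2222-bbb (intended result)
print(unanchored.group(1))  # prints 9999-fff (stray inline text)
```

In Ruby, ^ matches at the start of every line by default, so no flag is needed there; in flavors such as Python's, the flag has to be set explicitly.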

Working demo (using Ruby's regex flavor, as I'm not sure which one you are ultimately going to use).
