RegExp: Last occurrence of a pattern that occurs before another pattern
Problem description
I want to capture the last occurrence of one text pattern before another text pattern.
For example, I have this text:
code 4ab6-7b5
Another lorem ipsum
Random commentary.
code f6ee-304
Lorem ipsum text
Dummy text
code: ebf6-649
Other random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
code: abcd-ebf
Random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
I want to take the last code that occurs before the first occurrence of id-x (which means I want to get the code ebf6-649).
How can I do that with a regexp?
Recommended answer
If your regex flavor supports lookaheads, you can use a solution like this:
^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x
You can find your result in capture group 1.
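As a quick check, here is a minimal Python sketch that applies the pattern to the sample text from the question. The names TEXT, PATTERN and last_code_before_id are made up for this illustration, and Python's re module is only one possible flavor; the multi-line flag is what lets ^ match at the start of each line.

```python
import re

# Sample text copied from the question as shown (the first two "code" lines have no colon).
TEXT = """\
code 4ab6-7b5
Another lorem ipsum
Random commentary.
code f6ee-304
Lorem ipsum text
Dummy text
code: ebf6-649
Other random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
code: abcd-ebf
Random text
id-x: 7662dd41-29b5-9646-a4bc-1f6e16e8095e
"""

# The pattern from the answer; re.MULTILINE makes ^ match at every line start.
PATTERN = re.compile(
    r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x",
    re.MULTILINE,
)


def last_code_before_id(text):
    """Return the code of the first block that contains id-x, or None (hypothetical helper)."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(last_code_before_id(TEXT))  # ebf6-649
```

Because the search runs left to right, the first block that contains id-x wins, and that block's code is by construction the last code before the first id-x.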
How does it work?
^code:[ ]       # match "code: " at the beginning of a line; the square
                # brackets are just to aid readability. I recommend always
                # using them for literal spaces.
(               # capturing group 1, your key
  [0-9a-f-]+    # match one or more hex digits or hyphens
)               # end of group 1
(?:             # start a non-capturing group; each "instance" of this group
                # will match a single arbitrary character that does not start
                # a new "code: " (hence this cannot go beyond the current
                # block)
  (?!           # negative lookahead; this does not consume any characters,
                # but causes the pattern to fail if its subpattern could
                # match here
    ^code:[ ]   # match the beginning of a new block (i.e. "code: " at the
                # beginning of another line)
  )             # end of negative lookahead; if we've reached the beginning
                # of a new block, this will cause the non-capturing group to
                # fail. Otherwise just ignore this.
  [\s\S]        # match one arbitrary character
)*              # end of non-capturing group, repeat 0 or more times
id-x            # match "id-x" literally
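The commented layout above can be used almost verbatim in flavors that offer a free-spacing mode. The following is a sketch in Python, assuming re.VERBOSE as that mode; VERBOSE_PATTERN and the sample string are invented for the illustration. It also shows why the bracketed space matters: in free-spacing mode a bare space would be ignored, while [ ] still matches a literal space.

```python
import re

# Free-spacing version of the pattern; whitespace and # comments are ignored
# by re.VERBOSE, except inside character classes such as [ ].
VERBOSE_PATTERN = re.compile(
    r"""
    ^code:[ ]            # "code: " at the start of a line
    ([0-9a-f-]+)         # group 1: the key
    (?:                  # consume one character per iteration ...
      (?!^code:[ ])      # ... unless a new "code: " line starts here
      [\s\S]
    )*
    id-x                 # the literal marker
    """,
    re.VERBOSE | re.MULTILINE,
)

# Made-up sample: three blocks, the second one is the first to contain id-x.
sample = (
    "code: aaaa-111\nfoo\n"
    "code: bbbb-222\nbar\nid-x: 1\n"
    "code: cccc-333\nid-x: 2\n"
)

match = VERBOSE_PATTERN.search(sample)
print(match.group(1))  # bbbb-222
```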
The (?:(?!stopword)[\s\S])* pattern lets you match as much as possible without going beyond another occurrence of stopword.
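To see the stopword idiom on its own, here is a small Python sketch. The helper name tempered and the BEGIN/END markers are hypothetical and chosen only for this example; the helper just builds the sub-pattern described above for a literal stopword.

```python
import re


def tempered(stopword):
    """Build a sub-pattern matching any run of characters that never starts `stopword`."""
    return r"(?:(?!" + re.escape(stopword) + r")[\s\S])*"


text = "BEGIN a BEGIN b END c END"

# Plain greedy matching runs from the first BEGIN to the last END.
plain = re.search(r"BEGIN[\s\S]*END", text).group(0)

# With the lookahead guard, a match may not cross another BEGIN, so the
# attempt starting at the first BEGIN fails and the second BEGIN is used.
guarded = re.search("BEGIN" + tempered("BEGIN") + "END", text).group(0)

print(plain)    # BEGIN a BEGIN b END c END
print(guarded)  # BEGIN b END c END
```

The guarded match is still greedy: it extends to the last END it can reach without crossing another BEGIN.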
Note that you might have to use some form of multi-line mode for ^ to match at the beginning of a line. The ^ is important to avoid incorrect matches if your random text contains "code: " somewhere in the middle of a line.
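Both points can be checked with a short Python sketch. The sample text and variable names are made up for the illustration, and re.MULTILINE stands in for whatever multi-line switch your flavor provides.

```python
import re

# Made-up sample: the real block starts on line 2, and the random text on
# line 3 happens to contain "code: " in the middle of the line.
text = (
    "Dummy text\n"
    "code: ebf6-649\n"
    "see code: beef-000 here\n"
    "id-x: 7662dd41\n"
)

anchored = r"^code:[ ]([0-9a-f-]+)(?:(?!^code:[ ])[\s\S])*id-x"
unanchored = r"code:[ ]([0-9a-f-]+)(?:(?!code:[ ])[\s\S])*id-x"

# Without multi-line mode, ^ only matches at the very start of the text.
no_multiline = re.search(anchored, text)

# With multi-line mode, ^ matches at the start of every line.
with_multiline = re.search(anchored, text, re.MULTILINE)

# Without ^, the mid-line "code: " is treated as the start of a block.
without_anchor = re.search(unanchored, text)

print(no_multiline)             # None
print(with_multiline.group(1))  # ebf6-649
print(without_anchor.group(1))  # beef-000
```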
Working demo (using Ruby's regex flavor, as I'm not sure which one you are ultimately going to use)