Lazy (ungreedy) matching multiple groups using regex


Problem description

I would like to grab the content between each pair of <tag></tag> tags.

<tag>
This is one block of text
</tag>

<tag>
This is another one
</tag>

The regex that I have come up with is

/<tag>(.*)<\/tag>/m

However, it appears to be greedy and captures everything inside the parentheses up to the very last </tag>. I would like it to be as lazy as possible, so that every time it sees a closing tag, it treats that as the end of a match group and starts over.

How can I write the regex so that I get multiple matches in the given scenario?

I have included a sample of what I am describing at the following link:

http://rubular.com/r/JW5M3rnqIE

Note: This is not XML, nor is it based on any existing standard format. I don't need anything sophisticated like a full-fledged library that comes with a nice parser.

Recommended answer

Use this regex pattern:

/<tag>(.*?)<\/tag>/im

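The lazy (non-greedy) quantifier is .*?, not .*. The lazy form stops at the first </tag> it can reach, whereas the greedy form runs on to the last one. In Ruby, the m flag is still needed so that . also matches the newlines inside each block.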
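
To make the difference concrete, here is a minimal Ruby sketch that runs both quantifiers against the sample input from the question. The variable names (string, greedy, lazy) are illustrative assumptions, and the i flag is left out because the sample tags are all lowercase.

```ruby
# Sample input mirroring the two blocks from the question.
string = <<~TEXT
  <tag>
  This is one block of text
  </tag>

  <tag>
  This is another one
  </tag>
TEXT

# Greedy: .* consumes as much as possible, so the capture ends at the LAST </tag>.
greedy = string[/<tag>(.*)<\/tag>/m, 1]

# Lazy: .*? consumes as little as possible, so the capture ends at the FIRST </tag>.
lazy = string[/<tag>(.*?)<\/tag>/m, 1]

puts greedy.include?("This is another one") # true: both blocks were swallowed
puts lazy.include?("This is another one")   # false: only the first block
```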

To find multiple occurrences, use:

string.scan(/<tag>(.*?)<\/tag>/im) 
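
As a rough usage sketch, assuming string holds the sample input from the question: when the pattern contains a capture group, scan returns one inner array per match, so flattening and stripping the surrounding newlines gives a plain list of block contents. The blocks variable is an illustrative name, not part of the original answer.

```ruby
# The sample input from the question, written as a single string.
string = "<tag>\nThis is one block of text\n</tag>\n\n<tag>\nThis is another one\n</tag>\n"

# scan yields [["\nThis is one block of text\n"], ["\nThis is another one\n"]];
# flatten removes the per-match nesting, strip drops the surrounding newlines.
blocks = string.scan(/<tag>(.*?)<\/tag>/im).flatten.map(&:strip)

p blocks # ["This is one block of text", "This is another one"]
```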
