Regular expression with multiple results

Question

What's wrong with my regex?

"/Blabla\(2\)&nbsp;:.*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uis"

....

<tr>
<td class="aaa">Blabla(1)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1</td><td class="generic">word2 </td><td class="generic">word3</td></tr>
<tr><td class="generic">word4</td><td class="generic">word5 </td><td class="generic">word6</td></tr>
</tbody></table>
</td>
</tr>

<tr>
<td class="aaa">Blabla(2)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1b</td><td class="generic">word2b </td><td class="generic">word3b</td></tr>
<tr><td class="generic">word4b</td><td class="generic">word5b </td><td class="generic">word6b</td></tr>
</tbody></table>
</td>
</tr

What I want to do is get the content of the first TD of each TR in the block beginning with Blabla(2).

So the expected result is word1b and word4b, but only the first is returned.

Thank you for your help. Please don't tell me to use a DOM navigator; that is not possible in my case.

Recommended answer

That's an interesting regex; I learned about the ungreedy flag from it, nice!
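For readers who have not met the ungreedy flag either: in PCRE, the U modifier inverts the default greediness, so an unmarked `.*` behaves like `.*?`. The sketch below illustrates the difference in Python, whose built-in `re` module has no U flag, so the lazy quantifiers are written out explicitly. The `HTML` string is an abridged copy of the question's sample, and the variable names are made up for the illustration.

```python
import re

# Abridged sample from the question: the Blabla(2) label, then two data rows.
HTML = (
    '<td class="aaa">Blabla(2)&nbsp;:</td>\n'
    '<tr><td class="generic">word1b</td><td class="generic">word2b </td></tr>\n'
    '<tr><td class="generic">word4b</td><td class="generic">word5b </td></tr>\n'
)

# Greedy: (.*) runs on to the LAST </td> in the text.
greedy = re.search(r'<td class="generic">(.*)</td>', HTML, re.S)

# Lazy: (.*?) stops at the FIRST </td>; this is what /U does to a plain (.*).
lazy = re.search(r'<td class="generic">(.*?)</td>', HTML, re.S)
print(lazy.group(1))  # word1b

# The question's pattern with the U flag spelled out as lazy quantifiers.
ORIGINAL = re.compile(
    r'Blabla\(2\)&nbsp;:.*?<tr><td class="generic">(.*?)</td>.+?</tr>',
    re.I | re.S,
)

# Every match must begin with the literal Blabla(2) label, which occurs once,
# so a global search still finds a single match.
found = ORIGINAL.findall(HTML)
print(found)  # ['word1b']
```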

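As for your problem: every match of your pattern has to start with the literal Blabla(2) text, and that text occurs only once, so there can only be one match. You can use \G, which matches immediately after the end of the previous match, together with the g flag, assuming a PCRE engine: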

/(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uisg

regex101 demo

Or, a bit shorter with different delimiters:

'~(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class="generic">(.*)</td>.+</tr>~Uisg'
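To make the role of \G more concrete, here is a minimal sketch of the same idea in Python. It is not a port of the pattern above: Python's built-in `re` module supports neither \G nor the U flag, so the sketch uses explicit lazy quantifiers and emulates \G with `Pattern.match(text, pos)`, which only succeeds when the match starts exactly at `pos`. The names `ANCHOR`, `ROW`, and `first_cells` are hypothetical, and the markup is abridged from the question.

```python
import re

# Abridged sample from the question: two labelled blocks, two rows each.
HTML = (
    '<tr><td class="aaa">Blabla(1)&nbsp;:</td><td><table class="bbb"><tbody>\n'
    '<tr><td class="generic">word1</td><td class="generic">word2 </td></tr>\n'
    '<tr><td class="generic">word4</td><td class="generic">word5 </td></tr>\n'
    '</tbody></table></td></tr>\n'
    '<tr><td class="aaa">Blabla(2)&nbsp;:</td><td><table class="bbb"><tbody>\n'
    '<tr><td class="generic">word1b</td><td class="generic">word2b </td></tr>\n'
    '<tr><td class="generic">word4b</td><td class="generic">word5b </td></tr>\n'
    '</tbody></table></td></tr>\n'
)

# First alternative of the answer's pattern: the literal block label.
ANCHOR = re.compile(r'Blabla\(2\)&nbsp;:', re.I)

# The part shared by both alternatives, with lazy quantifiers instead of /U.
ROW = re.compile(r'.*?<tr><td class="generic">(.*?)</td>.+?</tr>', re.I | re.S)


def first_cells(html):
    """Return the first cell of each row found after the Blabla(2) label."""
    start = ANCHOR.search(html)
    if start is None:
        return []
    cells = []
    pos = start.end()
    while True:
        # Emulates \G: the match must begin where the previous one ended.
        m = ROW.match(html, pos)
        if m is None:
            break
        cells.append(m.group(1))
        pos = m.end()
    return cells


print(first_cells(HTML))  # ['word1b', 'word4b']
```

One caveat applies to the sketch and, as far as I can tell, to the \G pattern as well: the lazy `.*` skips forward to the next matching row wherever it is, so if another table followed the Blabla(2) block, its rows would be picked up too.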
