Regex matching table rows in HTML


Problem description



Possible Duplicate:
Best methods to parse HTML with PHP

I'm having a bit of trouble matching table rows with preg. Here is my expression:

<TR[a-z\=\"a-z0-9 ]*>([\{\}\(\)\^\=\$\&\.\_\%\#\!\@\=\<\>\:\;\,\~\`\'\*\?\/\+\|\[\]\|\-a-zA-Z0-9À-ÿ\n\r ]*)<\/TR>

As you can see, it tries to match everything in between the TR tags (including all symbols). That part works great; however, when dealing with multiple table rows, it often takes multiple table rows as ONE match, rather than one match per table row:

<TR>
 <TD>test</TD>
</TR>
<TR>
 <TD>test2</TD>
</TR>

yields:

Array
    (
        [0] => <TD>test</TD>
               <TD>test2</TD>
    )

rather than what I want it to:

Array
    (
        [0] => <TD>test</TD>
        [1] => <TD>test2</TD>
    )

I realize the reason for this is that the character class also matches the symbols used in the tags themselves, so the search naturally consumes the rest of the rows until it hits the last </TR>.

So basically, I'm wondering if someone can help me add to the expression so that it excludes anything containing "TR" between the TR tags, to prevent it from matching multiple rows.

Solution

Use lazy matching in your regex: <tr.*?</tr>
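
To make the difference concrete, here is a minimal sketch of greedy versus lazy matching on the two-row sample from the question. It uses Python's `re` module purely for illustration (an assumption; the question uses PHP's preg functions, where `preg_match_all` with the `i` and `s` modifiers plays the same role). The capture group and the `<tr\b[^>]*>` opening are additions for the demo, not part of the pattern above. Note that `.` does not cross newlines by default, so multi-line rows need the dot-all flag.

```python
import re

html = """<TR>
 <TD>test</TD>
</TR>
<TR>
 <TD>test2</TD>
</TR>"""

# IGNORECASE so <TR> and <tr> both match; DOTALL so "." also matches newlines.
flags = re.IGNORECASE | re.DOTALL

# Greedy: .* runs to the LAST </TR>, so both rows collapse into one match.
greedy = re.findall(r"<tr\b[^>]*>(.*)</tr>", html, flags=flags)

# Lazy: .*? stops at the FIRST </TR> it can reach, giving one match per row.
lazy = re.findall(r"<tr\b[^>]*>(.*?)</tr>", html, flags=flags)

# Trim the surrounding whitespace/newlines captured inside each row.
rows = [row.strip() for row in lazy]

print(len(greedy))  # 1
print(rows)         # ['<TD>test</TD>', '<TD>test2</TD>']
```

The lazy version still breaks on nested tables, since the first `</tr>` it finds may belong to an inner row.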

But as others have mentioned, it's more robust to use a proper parser if you can.
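
For comparison, a rough sketch of the parser route, again in Python as an illustrative assumption (in PHP, `DOMDocument` would be the usual starting point). `RowCollector` is a hypothetical helper name, built on the standard library's `html.parser`; it collects the text of each cell grouped by row and does not try to handle nested tables.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the current row, or None outside <tr>
        self._cell = None  # text chunks of the current cell, or None outside <td>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> and <tr> both arrive as "tr".
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


parser = RowCollector()
parser.feed("<TR>\n <TD>test</TD>\n</TR>\n<TR>\n <TD>test2</TD>\n</TR>")
parser.close()
print(parser.rows)  # [['test'], ['test2']]
```

Because the parser tracks tag boundaries itself, attributes, odd whitespace, and mixed-case tags need no special handling in the pattern.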
