Need a regular expression to parse HTML tags
Problem description
Regular expressions are not my forte, and I could really do with assistance on matching and replacing the following.
In an HTML file I have many instances of markup like this:
<font class=font8>text text text</font>
The content of the font tag varies: it can be a single word or several words separated by spaces, and it may include numbers.
I need to find all instances of this and replace each with:
<span class="bold">(text that was there)</span>
Thanks, James
PS: the HTML was generated from Word, which is why it is so bad :o)
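Solution
Load the markup with the DOMDocument::loadHTML method and collect the font elements with getElementsByTagName('font'). Iterate through the resulting node list using its ->length property. For each font element, call createElement('span'), use setAttribute to set the class attribute to bold, move the font element's child nodes into the new span, and then do a replaceChild on the parent node to swap the font element for the span.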
Reference for DOM: http://php.net/manual/en/book.dom.php
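The steps in the answer could be sketched roughly as follows. This is a minimal illustration rather than the answerer's own code: the function name replaceFontWithSpan and its $newClass parameter are hypothetical, and it assumes PHP 7 or later with the DOM extension enabled. Two details are added beyond the answer text: libxml warnings are suppressed because Word-generated markup tends to trigger them, and the node list is walked backwards because the list returned by getElementsByTagName is live and shrinks as font elements are replaced.

```php
<?php
// Hypothetical helper (name and parameters are assumptions for illustration).
function replaceFontWithSpan(string $html, string $newClass = 'bold'): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fonts = $doc->getElementsByTagName('font');

    // The node list is live, so iterate from the end while replacing nodes.
    for ($i = $fonts->length - 1; $i >= 0; $i--) {
        $font = $fonts->item($i);

        $span = $doc->createElement('span');
        $span->setAttribute('class', $newClass);

        // Move every child (text and nested tags) from <font> into <span>.
        while ($font->firstChild !== null) {
            $span->appendChild($font->firstChild);
        }

        $font->parentNode->replaceChild($span, $font);
    }

    // saveHTML() returns a full document, including the wrappers loadHTML adds.
    return $doc->saveHTML();
}

echo replaceFontWithSpan('<p><font class=font8>text text text</font></p>');
```

As written, the sketch replaces every font element. If only the font8 instances should change, a check such as $font->getAttribute('class') === 'font8' inside the loop would be one way to restrict it.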