Need a regular expression to parse HTML tags


Problem description

Regular expressions are not my forte, and I could really do with some assistance on matching and replacing the following:

In an HTML file I have many instances of HTML like this:

<font class=font8>text text text</font>

The content of the font tag varies: it can be a single word or several words separated by spaces, and it may include numbers.

I need to find all instances of this and replace each one with:

<span class="bold">(text that was there)</span>

Thanks James

PS: the HTML was generated from Word, which is why it is so bad :o)

Solution

Load the file with the DOMDocument::loadHTML method, call getElementsByTagName('font'), and iterate through the resulting node list using its ->length. For each font element, createElement('span'), use setAttribute to give it the class name value bold, move the original content into it, and do a replaceChild to swap it in for the font element.
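
The steps above can be sketched roughly as follows. This is a minimal illustration and not code from the original answer: the function name replaceFontWithSpan and its parameters are hypothetical, the input is assumed to be an ASCII HTML fragment (other encodings need extra handling before loadHTML), and the fragment is wrapped in a div purely so it can be pulled back out after parsing.

```php
<?php
// Hypothetical helper (name and parameters are assumptions for illustration):
// replaces every <font class="$fontClass"> with <span class="$spanClass">.
function replaceFontWithSpan(string $html, string $fontClass = 'font8', string $spanClass = 'bold'): string
{
    $doc = new DOMDocument();

    // Word-generated markup is rarely valid, so collect parser warnings
    // instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a div so its contents can be extracted again.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fonts = $doc->getElementsByTagName('font');

    // The node list is live: walk it backwards so that replacing a node
    // does not shift the items that still have to be visited.
    for ($i = $fonts->length - 1; $i >= 0; $i--) {
        $font = $fonts->item($i);
        if ($font->getAttribute('class') !== $fontClass) {
            continue; // leave other font tags untouched
        }

        $span = $doc->createElement('span');
        $span->setAttribute('class', $spanClass);

        // Move the children (text and any nested markup) into the span.
        while ($font->firstChild !== null) {
            $span->appendChild($font->firstChild);
        }

        $font->parentNode->replaceChild($span, $font);
    }

    // Serialize only the children of the wrapper div, not the
    // <html>/<body> elements that loadHTML adds around the fragment.
    $wrapper = $doc->getElementsByTagName('body')->item(0)->firstChild;
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage with the markup from the question.
echo replaceFontWithSpan('<font class=font8>text text text</font>'), PHP_EOL;
```

Going through the DOM rather than a regular expression means unquoted attributes, extra whitespace, and nested tags inside the font element are handled by the parser instead of by the pattern.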

Reference for DOM: http://php.net/manual/en/book.dom.php
