How can I use PHP to remove tags with an empty text node?

Problem description

How can I use PHP to remove tags whose text node is empty?

For instance,

<div class="box"></div> remove

<a href="#"></a> remove

<p><a href="#"></a></p> remove

<span style="..."></span> remove

But I want to keep tags that contain a text node, like this:

<a href="#">link</a> keep

Edit:

I also want to remove messy markup like this:

<p><strong><a href="http://xx.org.uk/dartmoor-arts"></a></strong></p>
<p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
<p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>

I tested both of the regular expressions below:

$content = preg_replace('!<(.*?)[^>]*>\s*</\1>!','',$content);
$content = preg_replace('%<(.*?)[^>]*>\\s*</\\1>%', '', $content);

But they leave something like this behind:

<p><strong></strong></p>
<p><strong></strong></p>
<p><strong></strong></p>
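
Note: the leftovers above are what a single pass produces. preg_replace() scans the string once, so it removes the innermost empty <a> but never revisits the <strong> and <p> wrappers that only become empty after that removal. A minimal sketch of repeating the replacement until nothing changes is shown below. The function name stripEmptyTagsRegex and the stricter pattern are assumptions for illustration, not part of the original question, and regex-based HTML handling stays fragile (for example, a ">" inside an attribute value will break it).

```php
<?php
// Hypothetical helper: repeat the replacement until a pass removes nothing.
function stripEmptyTagsRegex(string $content): string
{
    // Assumed pattern: a tag name, optional attributes, optional whitespace,
    // then the closing tag with the same name.
    $pattern = '~<([a-z][a-z0-9]*)\b[^>]*>\s*</\1\s*>~i';

    do {
        $result = preg_replace($pattern, '', $content, -1, $count);
        if ($result === null) {
            // preg_replace() returns null on a PCRE error; keep the last good string.
            break;
        }
        $content = $result;
    } while ($count > 0);

    return $content;
}
```

Each pass strips one level of nesting, so three levels of empty wrappers need three passes plus a final pass that finds nothing.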

Solution

One way could be:

$dom = new DOMDocument();
$dom->loadHTML(
    '<p><strong><a href="http://xx.org.uk/dartmoor-arts">test</a></strong></p>
    <p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
    <p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>'
);

$xpath = new DOMXPath($dom);

// Repeat until no childless elements remain: removing an empty <a> can leave
// its parent <strong> empty, which only the next pass will catch.
// not(node()) already implies not(text()), so a single test is enough.
while (($nodeList = $xpath->query('//*[not(node())]')) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}

echo $dom->saveHTML();

You will probably have to adapt this a bit to your needs.
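
As one possible adaptation, the loop above removes every element with no child nodes. That includes void elements such as <br> and <img>, while an element that holds only whitespace is kept. The sketch below takes a different approach: a single XPath query using normalize-space(), which looks at the text of all descendants, so no repeat loop is needed. The function name removeEmptyElements and the keep-list of void elements are assumptions for illustration. Character-encoding handling is left out.

```php
<?php
// Hypothetical helper, not part of the original answer.
function removeEmptyElements(string $html): string
{
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Assumed keep-list: void elements that carry meaning without any text.
    $keep = 'self::br or self::hr or self::img or self::input';

    // Elements under <body> whose text (including descendants) is empty or
    // whitespace-only, skipping kept elements and anything containing one.
    $query = "//body//*[not(normalize-space()) and not({$keep}) and not(.//*[{$keep}])]";

    foreach ($xpath->query($query) as $node) {
        // A node may already sit inside a subtree removed earlier; removing it
        // from its (detached) parent is harmless.
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // loadHTML() wraps fragments in <html><body>; return only the body's children.
    $out = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }

    return $out;
}
```

Calling removeEmptyElements($content) on the markup from the question should drop the empty <p><strong><a> chains and keep the paragraph whose link has text. Extend the keep-list if other empty elements (for example <iframe> or <td>) matter in your markup.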
