PHP: Using DOMXPath to strip tags and remove nodes

Question

I am trying to work with DOMDocument, but I am running into some problems. I have a string like this:

Some Content to keep
<span class="ice-cts-1 ice-del" data-changedata="" data-cid="5" data-time="1414514760583" data-userid="1" data-username="Site Administrator" undefined="Site Administrator">
     This content should remain, but span around it should be stripped
</span> 
     Keep this content too
<span>
     <span class="ice-cts-1 ice-ins" data-changedata="" data-cid="2" data-time="1414512278297" data-userid="1" data-username="Site Administrator" undefined="Site Administrator">
         This whole node should be deleted
     </span>
</span>

What I want to do is this: if a span has the class ice-del, keep its inner content but remove the span tags. If it has the class ice-ins, remove the whole node.

If it is just an empty span (<span></span>), remove it as well. This is the code I have:

// this gets the above-mentioned string
$getVal = $array['body'][0][$a];
$dom = new DOMDocument;
$dom->loadHTML($getVal );
$xPath = new DOMXPath($dom);
$delNodes = $xPath->query('//span[@class="ice-cts-1 ice-del"]');
$insNodes = $xPath->query('//span[@class="ice-cts-1 ice-ins"]');

foreach($insNodes as $span){
    //reject these changes, so remove whole node
    $span->parentNode->removeChild($span);
}

foreach($delNodes as $span){
    //accept these changes, so just strip out the tags but keep the content
}

$newString = $dom->saveHTML();

So, my code works for deleting the entire span node, but how do I take a node and strip out its tags while keeping its content?

Also, how would I delete an empty span? I'm sure I could do this with a regex or a string replace, but I would rather do it using the DOM.

Thanks

Recommended answer

No, I wouldn't recommend regex. I strongly recommend building on what you have right now, using the HTML parser you are already working with. You could use ->replaceChild in this case:

$dom = new DOMDocument;
$dom->loadHTML($getVal);
$xPath = new DOMXPath($dom);

// visit every span and decide what to do based on its class attribute
$spans = $xPath->query('//span');
foreach ($spans as $span) {
    $class = $xPath->evaluate('string(./@class)', $span);
    if (strpos($class, 'ice-ins') !== false || $class == '') {
        // rejected insertion, or a span without a class: remove the whole node
        $span->parentNode->removeChild($span);
    } elseif (strpos($class, 'ice-del') !== false) {
        // accepted deletion: keep the text, drop the span tags around it
        $span->parentNode->replaceChild(new DOMText($span->nodeValue), $span);
    }
}

$newString = $dom->saveHTML();
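
One thing to be aware of with the loop above: replacing a span with new DOMText($span->nodeValue) keeps only the text, so any markup nested inside an ice-del span (links, bold text and so on) would be flattened, and the $class == '' test removes every span without a class, whether or not it is empty. If that matters, a possible variation is sketched below. It is only a sketch under some assumptions: the function names unwrapNode and cleanTrackedChanges and the gr-wrap container id are made up for this example, "empty" is taken to mean a span with no child elements and only whitespace text, and the input is assumed to be an HTML fragment like the one in the question.

```php
<?php
// Hypothetical helper (not part of the original answer): move all children
// of $node in front of it, then remove the now-empty node.
function unwrapNode(DOMNode $node): void
{
    $parent = $node->parentNode;
    if ($parent === null) {
        return; // node is detached, nothing to unwrap
    }
    while ($node->firstChild !== null) {
        // insertBefore() moves the child out of $node, so the loop ends
        $parent->insertBefore($node->firstChild, $node);
    }
    $parent->removeChild($node);
}

// Hypothetical entry point: takes an HTML fragment, returns the cleaned fragment.
function cleanTrackedChanges(string $html): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    // Wrap the fragment in a known container (assumed id) so it can be found again
    $dom->loadHTML('<div id="gr-wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($dom);
    // Match a whole class token, so "ice-ins" does not match "ice-insert"
    $byClass = static function (string $token): string {
        return '//span[contains(concat(" ", normalize-space(@class), " "), " ' . $token . ' ")]';
    };

    // 1. Rejected insertions: drop the whole node
    foreach (iterator_to_array($xPath->query($byClass('ice-ins'))) as $span) {
        if ($span->parentNode !== null) {
            $span->parentNode->removeChild($span);
        }
    }

    // 2. Accepted deletions: keep the children, drop the span tags
    foreach (iterator_to_array($xPath->query($byClass('ice-del'))) as $span) {
        unwrapNode($span);
    }

    // 3. Remove spans that are (or have become) empty; repeat for nested cases
    do {
        $empties = iterator_to_array(
            $xPath->query('//span[not(*) and not(normalize-space())]')
        );
        foreach ($empties as $span) {
            $span->parentNode->removeChild($span);
        }
    } while (count($empties) > 0);

    // Serialize only the container's children, without <html>/<body> wrappers
    $wrapper = $xPath->query('//div[@id="gr-wrap"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

With this in place, something like $newString = cleanTrackedChanges($getVal); would stand in for the loadHTML / loop / saveHTML sequence, and the result is the cleaned fragment rather than a full HTML document.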
