Str_replace with regex

Question

Suppose I have the following link:

<li class="hook">
      <a href="i_have_underscores">I_have_underscores</a>
</li>

How would I remove the underscores only in the link text and not in the href? I have used str_replace, but this removes all underscores, which isn't ideal.

So basically I would be left with this output:

<li class="hook">
      <a href="i_have_underscores">I have underscores</a>
</li>

Any help is much appreciated.

Recommended answer

It's safer to parse HTML with DOMDocument than with a regex. Try this code:

<?php

function replaceInAnchors($html)
{
    $dom = new DOMDocument();
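    // loadHTML() does not assume UTF-8 input, so non-ASCII characters are converted to HTML entities first.
    // Note: the 'HTML-ENTITIES' target of mb_convert_encoding() is deprecated as of PHP 8.2.
    $dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8"));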

    $xpath = new DOMXPath($dom);

    foreach($xpath->query('//text()[(ancestor::a)]') as $node)
    {
        // A text node holds no markup, so its value can be rewritten directly.
        // This also avoids appendXML(), which fails on text containing a bare "&" or "<".
        $node->nodeValue = str_replace('_', ' ', $node->nodeValue);
    }

    // get only the body tag with its contents, then trim the body tag itself to get only the original content
    return mb_substr($dom->saveXML($xpath->query('//body')->item(0)), 6, -7, "UTF-8");
}

$html = '<li class="hook">
      <a href="i_have_underscores">I_have_underscores</a>
</li>';
echo replaceInAnchors($html);
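
Since the question asks about a regex specifically, here is a minimal sketch of that route using preg_replace_callback, for comparison only. The function name replaceInAnchorText is hypothetical and not part of the answer above. The pattern assumes simple markup: no ">" inside attribute values and no nested tags inside the link text. Anything more complex is better handled by the DOMDocument approach.

```php
<?php

// Hypothetical helper, shown only to illustrate the regex alternative.
function replaceInAnchorText(string $html): string
{
    // Group 1: the opening <a ...> tag (left untouched, so the href keeps its underscores).
    // Group 2: the link text, assumed to contain no nested tags.
    // Group 3: the closing </a> tag.
    $result = preg_replace_callback(
        '~(<a\b[^>]*>)([^<]*)(</a>)~i',
        function (array $m): string {
            return $m[1] . str_replace('_', ' ', $m[2]) . $m[3];
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<li class="hook">
      <a href="i_have_underscores">I_have_underscores</a>
</li>';
echo replaceInAnchorText($html);
```

For the sample input, only the text between the anchor tags changes, so the link line comes out as `<a href="i_have_underscores">I have underscores</a>`. Links whose text contains child elements (for example a `<span>`) are not matched by this pattern and are left as they are.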
