Replace all but certain HTML tags with htmlspecialchars() in PHP?

Problem description

I would like to process my user input to allow only certain HTML tags, replacing all other tags with their HTML entities and escaping special characters that are not part of a tag. For example, if I only wanted to allow the <b> and <a> tags, then

allow_only("This is <b>bold</b> and this is <i>italic</i>.
            Moreover 2<3 and <a href='google.com'>this is a link</a>.","<b><a>");

should produce

This is <b>bold</b> and this is &lt;i&gt;italic&lt;/i&gt;.
Moreover 2&lt;3 and <a href='google.com'>this is a link</a>.

How can I do this in PHP? I am aware of strip_tags(), which removes unwanted tags completely, and of htmlspecialchars(), which replaces all tags with their HTML entities, but I know of no function that escapes only specific tags.

And if there is no 'common' way to do this, how should I generally process user input that may contain valid regular HTML, but may also contain < signs and potentially dangerous HTML code?
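
For reference, here is a small sketch of how the two built-ins mentioned above behave on the same input. The variable names are only illustrative; neither call gives the mixed result asked for.

```php
<?php
$input = 'This is <b>bold</b> and this is <i>italic</i>.';

// strip_tags() drops every tag that is not listed in the second argument.
$stripped = strip_tags($input, '<b>');

// htmlspecialchars() escapes every tag, whether it should be allowed or not.
$escaped = htmlspecialchars($input);

echo $stripped, "\n"; // This is <b>bold</b> and this is italic.
echo $escaped, "\n";  // This is &lt;b&gt;bold&lt;/b&gt; and this is &lt;i&gt;italic&lt;/i&gt;.
```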

Solution

Apply htmlspecialchars() and then, for each tag name in a given array, turn the encoded tag back into a real tag:

function allow_only($str, $allowed){
    $str = htmlspecialchars($str);
    foreach( $allowed as $a ){
        $str = str_replace("&lt;".$a."&gt;", "<".$a.">", $str);
        $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
    }
    return $str;
}
echo allow_only("This is <b>bold</b> and this is <i>italic</i>.", array("b"));

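That works for simple tags, returning "This is <b>bold</b> and this is &lt;i&gt;italic&lt;/i&gt;.", which a browser renders with "bold" in bold and the <i> tags shown as literal text.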

As was pointed out, that doesn't work for tags with attributes, but this does:

function fix_attributes($match){
    return "<".$match[1].str_replace('&quot;','"',$match[2]).">";
}
function allow_only($str, $allowed){
    $str = htmlspecialchars($str);
    foreach( $allowed as $a ){
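        // The callback name must be a quoted string; \b stops "b" from also matching <br> or <body>.
        $str = preg_replace_callback('/&lt;(' . preg_quote($a, '/') . ')\b([\s\/\.\w=&;:#]*?)&gt;/', 'fix_attributes', $str);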
        $str = str_replace("&lt;/".$a."&gt;", "</".$a.">", $str);
    }
    return $str;
}
echo allow_only('This is <b>bold</b> and <a href="http://www.#links">this</a> is <i>italic</i>.', array("b","a"));

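That handles more complex tags with attributes; only the characters listed between [] in the pattern may appear in the attribute part. Unfortunately &quot; must be allowed within attributes or it won't work, and with it all other entities are allowed too; however, only &quot; in attributes will be decoded. Note that the pattern checks characters only, not attribute names or URL schemes, so something like an unquoted onmouseover=... handler or a javascript: link can still get through.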
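
One way to tighten the attribute handling, as a sketch only: restore bare tags as before, but restore <a> only when its single attribute is an href pointing at an absolute http(s) URL. The function name allow_only_strict() and the URL rule (no "&", so no query strings with several parameters) are assumptions made for this illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: escape everything, then restore only whitelisted tags.
function allow_only_strict(string $str, array $allowed): string
{
    // Escape <, >, & and both quote styles (' becomes &#039; with ENT_HTML401).
    $str = htmlspecialchars($str, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    foreach ($allowed as $tag) {
        // Restore the bare opening and closing tag, e.g. <b> and </b>.
        $str = str_replace(
            ['&lt;' . $tag . '&gt;', '&lt;/' . $tag . '&gt;'],
            ['<' . $tag . '>', '</' . $tag . '>'],
            $str
        );
    }

    if (in_array('a', $allowed, true)) {
        // Restore <a href="..."> only for absolute http(s) URLs without "&".
        // Any other attribute or scheme leaves the whole tag escaped.
        $str = preg_replace(
            '/&lt;a\s+href=(?:&quot;|&#039;)(https?:\/\/[^\s&<>]+)(?:&quot;|&#039;)\s*&gt;/',
            '<a href="$1">',
            $str
        );
    }

    return $str;
}

echo allow_only_strict('This is <b>bold</b>, <i>italic</i> and <a href="http://example.com/">a link</a>.', ['b', 'a']);
```

This does not balance tags: a rejected <a ...> stays escaped while its </a> is still restored, which is one more reason to prefer a real HTML parser for anything beyond a toy filter.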

As was suggested, a much better (safer, cleaner) way to solve problems like this is to use a library like HTML Purifier (http://htmlpurifier.org/demo.php).
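
A minimal sketch of what that could look like with HTML Purifier, assuming the library was installed through Composer (package ezyang/htmlpurifier) so that vendor/autoload.php exists; the autoload path and the whitelist string are assumptions for this example.

```php
<?php
// Assumption: HTML Purifier is available through Composer's autoloader.
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Whitelist: <b> without attributes, <a> with only an href attribute.
$config->set('HTML.Allowed', 'b,a[href]');

$purifier = new HTMLPurifier($config);

$dirty = "This is <b>bold</b> and this is <i>italic</i>. "
       . "Moreover 2<3 and <a href='http://google.com'>this is a link</a>.";

echo $purifier->purify($dirty);
```

By default HTML Purifier removes disallowed tags while keeping their text, which is closer to strip_tags() than to the escaping asked for in the question, so check its configuration documentation if the unwanted tags must stay visible as text.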
