HTML Purifier: Removing an element conditionally based on its attributes
Problem description

As per the HTML Purifier smoketest, 'malformed' URIs are occasionally discarded to leave behind an attribute-less anchor tag, e.g.

`<a href="javascript:document.location='http://www.google.com/'">XSS</a>` becomes `<a>XSS</a>`

...as well as occasionally being stripped down to the protocol, e.g.

`<a href="http://1113982867/">XSS</a>` becomes `<a href="http:/">XSS</a>`

While that's unproblematic, per se, it's a bit ugly. Instead of trying to strip these out with regular expressions, I was hoping to use HTML Purifier's own library capabilities / injectors / plug-ins / whathaveyou.

Point of reference: Handling attributes

Conditionally removing an attribute in HTMLPurifier is easy. Here the library offers the class `HTMLPurifier_AttrTransform` with the method `confiscateAttr()`.

While I don't personally use the functionality of `confiscateAttr()`, I do use an `HTMLPurifier_AttrTransform` as per this thread to add `target="_blank"` to all anchors.

    // more configuration stuff up here
    $htmlDef = $htmlPurifierConfiguration->getHTMLDefinition(true);
    $anchor = $htmlDef->addBlankElement('a');
    $anchor->attr_transform_post[] = new HTMLPurifier_AttrTransform_Target();
    // purify down here

`HTMLPurifier_AttrTransform_Target` is a very simple class, of course.

    class HTMLPurifier_AttrTransform_Target extends HTMLPurifier_AttrTransform
    {
        public function transform($attr, $config, $context) {
            // I could call $this->confiscateAttr() here to throw away an
            // undesired attribute
            $attr['target'] = '_blank';
            return $attr;
        }
    }

That part works like a charm, naturally.

Handling elements

Perhaps I'm not squinting hard enough at `HTMLPurifier_TagTransform`, or am looking in the wrong place(s), or generally am not understanding it, but I can't seem to figure out a way to conditionally remove elements.

Say, something to the effect of:

    // more configuration stuff up here
    // (wished-for API: addElementHandler() and elem_transform_post are made up)
    $htmlDef = $htmlPurifierConfiguration->getHTMLDefinition(true);
    $anchor = $htmlDef->addElementHandler('a');
    $anchor->elem_transform_post[] = new HTMLPurifier_ElementTransform_Cull();
    // add target as per 'point of reference' here
    // purify down here

With the Cull class extending something that has a `confiscateElement()` ability, or comparable, wherein I could check for a missing `href` attribute or an `href` attribute with the content `http:/`.

HTMLPurifier_Filter

I understand I could create a filter, but the examples (Youtube.php and ExtractStyleBlocks.php) suggest I'd be using regular expressions in that, which I'd really rather avoid, if it is at all possible. I'm hoping for an onboard or quasi-onboard solution that makes use of HTML Purifier's excellent parsing capabilities.

Returning `null` in a child-class of `HTMLPurifier_AttrTransform` unfortunately doesn't cut it.

Anyone have any smart ideas, or am I stuck with regexes? :)

Success! Thanks to Ambush Commander and mcgrailm in another question, I am now using a hilariously simple solution:

    // a bit of context
    $htmlDef = $this->configuration->getHTMLDefinition(true);
    $anchor = $htmlDef->addBlankElement('a');

    // HTMLPurifier_AttrTransform_RemoveLoneHttp strips 'href="http:/"' from
    // all anchor tags (see first post for class detail)
    $anchor->attr_transform_post[] = new HTMLPurifier_AttrTransform_RemoveLoneHttp();

    // this is the magic! We're making 'href' a required attribute (note the
    // asterisk) - now HTML Purifier removes <a></a>, as well as
    // <a href="http:/"></a> after HTMLPurifier_AttrTransform_RemoveLoneHttp
    // is through with it!
    $htmlDef->addAttribute('a', 'href*', new HTMLPurifier_AttrDef_URI());

It works, it works, bahahahaHAHAHAHA...!

* manic laughter, gurgling noises, keels over with a smile on her face *
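The solution refers to `HTMLPurifier_AttrTransform_RemoveLoneHttp` without showing its body. Below is a minimal sketch of what such a class might look like, wired into the same configuration. The class body, the Composer autoload path, and the `Cache.DefinitionImpl` setting are assumptions made for illustration, not the original author's code; only `confiscateAttr()`, `addBlankElement()`, `attr_transform_post` and `addAttribute()` come from the post itself.

```php
<?php
// Assumption: HTML Purifier was installed with Composer (ezyang/htmlpurifier),
// so its classes are available through the Composer autoloader.
require_once __DIR__ . '/vendor/autoload.php';

/**
 * Hypothetical body for the transform named in the solution:
 * drops an href that was reduced to the bare protocol "http:/".
 */
class HTMLPurifier_AttrTransform_RemoveLoneHttp extends HTMLPurifier_AttrTransform
{
    public function transform($attr, $config, $context)
    {
        if (isset($attr['href']) && $attr['href'] === 'http:/') {
            // confiscateAttr() unsets the key and returns its old value
            $this->confiscateAttr($attr, 'href');
        }
        return $attr;
    }
}

$config = HTMLPurifier_Config::createDefault();
// Assumption: no definition cache is wanted for this small demo
$config->set('Cache.DefinitionImpl', null);

$htmlDef = $config->getHTMLDefinition(true);
$anchor = $htmlDef->addBlankElement('a');
$anchor->attr_transform_post[] = new HTMLPurifier_AttrTransform_RemoveLoneHttp();

// The trailing asterisk marks href as required, as described in the post
$htmlDef->addAttribute('a', 'href*', new HTMLPurifier_AttrDef_URI());

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<a href="http://1113982867/">XSS</a>'), "\n";
```

The exact output depends on the HTML Purifier version and how it normalizes the URI, so it is worth comparing the result against the smoketest for the inputs you care about.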