HTML Purifier: Removing an element conditionally based on its attributes


Question


As per the HTML Purifier smoketest, 'malformed' URIs are occasionally discarded to leave behind an attribute-less anchor tag, e.g.

<a href="javascript:document.location='http://www.google.com/'">XSS</a> becomes <a>XSS</a>

...as well as occasionally being stripped down to the protocol, e.g.

<a href="http://1113982867/">XSS</a> becomes <a href="http:/">XSS</a>

While that's unproblematic, per se, it's a bit ugly. Instead of trying to strip these out with regular expressions, I was hoping to use HTML Purifier's own library capabilities / injectors / plug-ins / whathaveyou.

Point of reference: Handling attributes

Conditionally removing an attribute in HTMLPurifier is easy. Here the library offers the class HTMLPurifier_AttrTransform with the method confiscateAttr().

While I don't personally use the functionality of confiscateAttr(), I do use an HTMLPurifier_AttrTransform as per this thread to add target="_blank" to all anchors.

// more configuration stuff up here
$htmlDef = $htmlPurifierConfiguration->getHTMLDefinition(true);
$anchor  = $htmlDef->addBlankElement('a');
$anchor->attr_transform_post[] = new HTMLPurifier_AttrTransform_Target();
// purify down here

HTMLPurifier_AttrTransform_Target is a very simple class, of course.

class HTMLPurifier_AttrTransform_Target extends HTMLPurifier_AttrTransform
{
    public function transform($attr, $config, $context) {
        // I could call $this->confiscateAttr() here to throw away an
        // undesired attribute
        $attr['target'] = '_blank';
        return $attr;
    }
}
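
To make the confiscateAttr() remark concrete, here is a minimal sketch of a transform that conditionally drops an attribute. The class name HTMLPurifier_AttrTransform_DropEmptyTitle and the Composer autoload path are assumptions for illustration and do not come from the original post. confiscateAttr() takes the attribute array by reference, removes the given key, and returns the value it held.

```php
<?php
// Assumption: HTML Purifier was installed via Composer (ezyang/htmlpurifier).
require_once __DIR__ . '/vendor/autoload.php';

// Hypothetical transform: throws away a title attribute that is empty
// or contains only whitespace, and leaves every other attribute alone.
class HTMLPurifier_AttrTransform_DropEmptyTitle extends HTMLPurifier_AttrTransform
{
    public function transform($attr, $config, $context)
    {
        if (isset($attr['title']) && trim($attr['title']) === '') {
            // Removes $attr['title'] in place; the return value is ignored here.
            $this->confiscateAttr($attr, 'title');
        }
        return $attr;
    }
}
```

Such a class would be registered the same way as the Target transform, by appending an instance to attr_transform_post[] on the element.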

That part works like a charm, naturally.

Handling elements

Perhaps I'm not squinting hard enough at HTMLPurifier_TagTransform, or am looking in the wrong place(s), or just don't understand it, but I can't seem to figure out a way to conditionally remove elements.

Say, something to the effect of:

// more configuration stuff up here
// (addElementHandler() and elem_transform_post are wishful names, not real HTML Purifier API)
$htmlDef = $htmlPurifierConfiguration->getHTMLDefinition(true);
$anchor  = $htmlDef->addElementHandler('a');
$anchor->elem_transform_post[] = new HTMLPurifier_ElementTransform_Cull();
// add target as per 'point of reference' here
// purify down here

Here the Cull class would extend something that has a confiscateElement() ability, or comparable, in which I could check for a missing href attribute or an href attribute with the content http:/.

HTMLPurifier_Filter

I understand I could create a filter, but the examples (Youtube.php and ExtractStyleBlocks.php) suggest I'd be using regular expressions in that, which I'd really rather avoid, if it is at all possible. I'm hoping for an onboard or quasi-onboard solution that makes use of HTML Purifier's excellent parsing capabilities.

Returning null in a child-class of HTMLPurifier_AttrTransform unfortunately doesn't cut it.

Anyone have any smart ideas, or am I stuck with regexes? :)

Solution

Success! Thanks to Ambush Commander and mcgrailm in another question, I am now using a hilariously simple solution:

// a bit of context
$htmlDef = $this->configuration->getHTMLDefinition(true);
$anchor  = $htmlDef->addBlankElement('a');

// HTMLPurifier_AttrTransform_RemoveLoneHttp strips 'href="http:/"' from
// all anchor tags (see first post for class detail)
$anchor->attr_transform_post[] = new HTMLPurifier_AttrTransform_RemoveLoneHttp();

// this is the magic! We're making 'href' a required attribute (note the
// asterisk) - now HTML Purifier removes <a></a>, as well as
// <a href="http:/"></a> after HTMLPurifier_AttrTransform_RemoveLoneHttp
// is through with it!
$htmlDef->addAttribute('a', 'href*', new HTMLPurifier_AttrDef_URI());
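
The HTMLPurifier_AttrTransform_RemoveLoneHttp class itself is not shown in this post. The following is a minimal sketch of what such a transform could look like, assuming it only needs to drop an href whose value is exactly http:/ and that HTML Purifier is loaded through Composer (both are assumptions, not the author's actual code).

```php
<?php
// Assumption: HTML Purifier was installed via Composer (ezyang/htmlpurifier).
require_once __DIR__ . '/vendor/autoload.php';

// Sketch of the transform named in the answer: removes href when all
// that is left of the URI is the bare "http:/" stub.
class HTMLPurifier_AttrTransform_RemoveLoneHttp extends HTMLPurifier_AttrTransform
{
    public function transform($attr, $config, $context)
    {
        if (isset($attr['href']) && $attr['href'] === 'http:/') {
            // confiscateAttr() unsets the key on the by-reference array.
            $this->confiscateAttr($attr, 'href');
        }
        return $attr;
    }
}

// Calling the transform directly, outside the purifier, to show its effect.
$transform = new HTMLPurifier_AttrTransform_RemoveLoneHttp();
var_dump($transform->transform(['href' => 'http:/'], null, null));
```

Called directly like this, the transform returns the attribute array without the href key. Inside the purifier, the 'href*' requirement above is what then turns an anchor without href into a removed element, as the answer describes.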

It works, it works, bahahahaHAHAHAHA...! *manic laughter, gurgling noises, keels over with a smile on her face*
