HtmlPurifier - allow data attribute


Question

I'm trying to allow some data attributes on all my span elements with HTML Purifier, but I can't get it to work.

I have this string:

<p>
    <span data-time-start="1" data-time-end="5" id="5">
       <word class="word">My</word>
       <word class="word">Name</word>
    </span>
    <span data-time-start="6" data-time-end="15" id="88">
       <word class="word">Is</word>
       <word class="word">Zooboo</word>
    </span>
</p>

My htmlpurifier config:

$this->HTMLpurifierConfigInverseTransform = \HTMLPurifier_Config::createDefault();
$this->HTMLpurifierConfigInverseTransform->set('HTML.Allowed', 'span,u,strong,em');
$this->HTMLpurifierConfigInverseTransform->set('HTML.ForbiddenElements', 'word,p');
$this->HTMLpurifierConfigInverseTransform->set('CSS.AllowedProperties', 'font-weight, font-style, text-decoration');
$this->HTMLpurifierConfigInverseTransform->set('AutoFormat.RemoveEmpty', true);

I purify my $value like this:

$purifier = new \HTMLPurifier($this->HTMLpurifierConfigInverseTransform);
var_dump($purifier->purify($value));die;

And get this :

<span>My Name</span><span>Is Zooboo</span>

But how can I preserve the id, data-time-start and data-time-end attributes on my span elements?

I need this output:

<span data-time-start="1" data-time-end="5" id="5">My Name</span><span data-time-start="6" data-time-end="15" id="88">Is Zooboo</span>

I tried to test with this config:

$this->HTMLpurifierConfigInverseTransform->set('HTML.Allowed', 'span[data-time-start],u,strong,em');

but I get this error message:

User Warning: Attribute 'data-time-start' in element 'span' not supported (for information on implementing this, see the support forums)

Thanks for your help!

EDIT 1

I first tried to allow the id attribute with this line:

$this->HTMLpurifierConfigInverseTransform->set('Attr.EnableID', true);

It doesn't work for me.

EDIT 2

For the data-* attributes, I added these lines, but nothing changed either:

$def = $this->HTMLpurifierConfigInverseTransform->getHTMLDefinition(true);
$def->addAttribute('sub', 'data-time-start', 'CDATA');
$def->addAttribute('sub', 'data-time-end', 'CDATA');

Solution

HTML Purifier is aware of the structure of HTML and uses this knowledge as the basis of its whitelisting process. If you add a standard attribute to the whitelist, it doesn't allow arbitrary content for that attribute: it understands the attribute and will still reject content that makes no sense.

For example, if you had an attribute somewhere that took numeric values, HTML Purifier would still deny HTML that tried to enter the value 'foo' for that attribute.
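
As a minimal sketch of that behaviour, assuming HTML Purifier is installed through Composer (the vendor/autoload.php path is an assumption) and using the standard colspan attribute, which as far as I know HTML Purifier types as a number:

```php
<?php
require_once __DIR__ . '/vendor/autoload.php'; // assumed Composer install

$config = HTMLPurifier_Config::createDefault();
$config->set('Cache.DefinitionImpl', null); // skip the definition cache for this demo
$purifier = new HTMLPurifier($config);

// colspan expects digits, so the "foo" value should be stripped
// while the "2" value should survive purification.
$dirty = '<table><tr><td colspan="foo">a</td><td colspan="2">b</td></tr></table>';
echo $purifier->purify($dirty), PHP_EOL;
```

The attribute name is on the whitelist in both cells; only the value decides whether it is kept.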

If you add custom attributes, just adding them to the whitelist does not teach HTML Purifier how to handle them: What data can it expect in those attributes? What data is malicious?

There's extensive documentation on how to tell HTML Purifier about the structure of your custom attributes here: Customize

There's a code example for the 'target' attribute of the <a>-tag:

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML.DefinitionRev', 1);
$config->set('Cache.DefinitionImpl', null); // remove this later!
$def = $config->getHTMLDefinition(true);
$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');

That would add target as a field that accepts only the values "_blank", "_self", "_target" and "_top". That's a bit stricter than the actual HTML definition, but for most purposes entirely sufficient.

That's the general approach you will need to take for data-time-start and data-time-end. For the available attribute types, check the official HTML Purifier documentation (as linked above). My best guess from your example is that you don't want Enum#... but Number, like this:

$def->addAttribute('span', 'data-time-start', 'Number');
$def->addAttribute('span', 'data-time-end', 'Number');

...but check it out and see what suits your use-case best. (While you're implementing this, don't forget you also need to list the attributes in the whitelist as you're currently doing.)
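
Putting the pieces together, a sketch of the whole configuration might look like the following. It assumes a Composer install (the autoload path is an assumption), a made-up HTML.DefinitionID value, and a version of HTML Purifier that provides maybeGetRawHTMLDefinition(), which the Customize documentation recommends over getHTMLDefinition(true) when a DefinitionID is set. Note that the attributes are registered on 'span', whereas the attempt in EDIT 2 registered them on 'sub'.

```php
<?php
require_once __DIR__ . '/vendor/autoload.php'; // assumed Composer install

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'span-data-time'); // hypothetical ID for this definition
$config->set('HTML.DefinitionRev', 1);
$config->set('Cache.DefinitionImpl', null); // disable caching while developing
// The attributes still have to be whitelisted on span.
$config->set('HTML.Allowed', 'span[data-time-start|data-time-end],u,strong,em');

// Returns null when a cached definition is already available,
// in which case no raw edits are needed.
if ($def = $config->maybeGetRawHTMLDefinition()) {
    $def->addAttribute('span', 'data-time-start', 'Number');
    $def->addAttribute('span', 'data-time-end', 'Number');
}

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<span data-time-start="1" data-time-end="5">My Name</span>'), PHP_EOL;
```

With Number as the type, a non-numeric value such as data-time-start="foo" should be dropped, in line with the validation behaviour described earlier.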

For id, you should include Attr.EnableID = true as part of your configuration.
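
A sketch for the id part, under a few assumptions: the autoload path is hypothetical, id also has to be listed for span in HTML.Allowed, and, as far as I know, the default ID validation rejects values that do not start with a letter, which would explain why ids such as "5" and "88" were still stripped in EDIT 1. The Attr.ID.HTML5 directive, which I believe exists from HTML Purifier 4.8.0 onward, relaxes that rule.

```php
<?php
require_once __DIR__ . '/vendor/autoload.php'; // assumed Composer install

$config = HTMLPurifier_Config::createDefault();
$config->set('Cache.DefinitionImpl', null); // skip the definition cache for this demo
$config->set('HTML.Allowed', 'span[id],u,strong,em'); // id must be whitelisted on span too
$config->set('Attr.EnableID', true);
// Assumption: needed for ids that start with a digit; drop this line on
// older versions that do not know the directive.
$config->set('Attr.ID.HTML5', true);

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<span id="5">My Name</span>'), PHP_EOL;
```

If renaming is an option, ids that begin with a letter (for example "w5") avoid the need for the HTML5 directive.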

I hope that helps!
