HTML filter that is HTML5 compliant
Question
Is there a simple way to add an HTML5 ruleset to HTMLPurifier?
HTMLPurifier (HP) can be configured to recognize new tags like this:
// set up a configurable HP instance
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'html5 draft');
$config->set('HTML.DefinitionRev', 1);
$config->set('Cache.DefinitionImpl', null); // no caching
$def = $config->getHTMLDefinition(true);    // true = raw, editable definition
// add a new tag
$article = $def->addElement(
    'article', // name
    'Block',   // content set
    'Flow',    // allowed children
    'Common',  // attribute collection
    array()    // attributes
);
// add a new attribute to an existing tag
$def->addAttribute('a', 'contextmenu', 'ID');
However, this is clearly a fair amount of work, since a lot of new HTML5 tags and attributes have to be registered. New global attributes should also be combinable with existing HTML 4 tags, and it is difficult to judge from the docs how to augment the core rules. So, is there a more useful config format or array structure for feeding new and updated tag + attribute + context configuration (inline/block/empty/flow/..) into HTMLPurifier?
# mostly confused about how to extend existing tags:
$def->addAttribute('input', 'type', "...|...|...");
# or how to allow data-* attributes (if I actually wanted that):
$def->addAttribute("data-*", ...
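One way such an array-driven structure could look, as a rough sketch only: the rule tables, the helper name registerHtml5Rules() and the $enabled list are hypothetical and not part of HTMLPurifier; only addElement() and addAttribute() are taken from the code above. The table entries are illustrative, not a vetted HTML5 whitelist, and the sketch assumes PHP 7.1+ for array destructuring. It does not attempt data-* attributes.

```php
<?php
// Hypothetical rule table:
// element name => [type, allowed children, attribute collection, extra attributes]
$html5Elements = [
    'article' => ['Block',  'Flow',   'Common', []],
    'section' => ['Block',  'Flow',   'Common', []],
    'aside'   => ['Block',  'Flow',   'Common', []],
    'mark'    => ['Inline', 'Inline', 'Common', []],
];

// Hypothetical table of extra attributes for existing tags: [element, attribute, type]
$html5Attributes = [
    ['a', 'contextmenu', 'ID'],
];

/**
 * Hypothetical helper: feeds both tables into a raw HTML definition.
 * $def only needs the addElement() and addAttribute() methods used in the question.
 * $enabled lists the element names that should actually be registered.
 */
function registerHtml5Rules($def, array $elements, array $attributes, array $enabled): void
{
    foreach ($elements as $name => [$type, $contents, $attrCollection, $attrs]) {
        if (!in_array($name, $enabled, true)) {
            continue; // present in the table, but not switched on
        }
        $def->addElement($name, $type, $contents, $attrCollection, $attrs);
    }
    foreach ($attributes as [$element, $attribute, $attrType]) {
        $def->addAttribute($element, $attribute, $attrType);
    }
}

// Runs only when HTMLPurifier has already been loaded (assumption: via its autoloader).
if (class_exists('HTMLPurifier_Config')) {
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.DefinitionID', 'html5 draft');
    $config->set('HTML.DefinitionRev', 1);
    $config->set('Cache.DefinitionImpl', null); // no caching
    $def = $config->getHTMLDefinition(true);
    registerHtml5Rules($def, $html5Elements, $html5Attributes, ['article', 'section']);
}
```

Keeping the rules in plain arrays means a tag can be enabled or disabled by editing the $enabled list, without touching the registration code.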
And of course not all new HTML5 tags are fit for unrestricted allowance. HTMLPurifier is all about content filtering, and defining value constraints is where its strength lies. <canvas>, for example, might not be a big deal when it appears in user content, because it is useless at best without JavaScript (which HP already filters out). But other tags and attributes might be undesirable, so a flexible configuration structure is imperative for enabling and disabling tags and their associated attributes.
(I guess I should update my research.) But there is still no practical compendium or specification that suits an HP configuration (no, XML DTDs don't qualify).
- http://simon.html5.org/html-elements
- http://www.w3.org/TR/html5-diff/#new-elements
- http://www.w3.org/TR/html5-diff/#new-attributes
(Uh, and HTML5 is no longer a draft.)
Recommended answer
The PHP tidy extension can be configured to recognize HTML5 tags. See http://tidy.sourceforge.net/docs/quickref.html#new-blocklevel-tags
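A minimal sketch of what that configuration might look like, assuming the tidy extension is installed. The option names (new-blocklevel-tags, new-inline-tags, show-body-only) come from the Tidy quick reference; the tag lists and the sample markup are only illustrative. Note that Tidy repairs markup, so this sketch makes no claim that it filters untrusted input the way HTMLPurifier does.

```php
<?php
// Illustrative input using HTML5 elements.
$html = '<article><p>Hello <mark>world</mark></article>';

// Tidy options: declare the extra tags so they are not rejected as unknown.
$tidyConfig = [
    'new-blocklevel-tags' => 'article aside section header footer nav',
    'new-inline-tags'     => 'mark time',
    'show-body-only'      => true, // emit only the body fragment
];

// Guarded so the script still runs where the extension is missing.
if (extension_loaded('tidy')) {
    $tidy = new tidy();
    $tidy->parseString($html, $tidyConfig, 'utf8');
    $tidy->cleanRepair();
    echo tidy_get_output($tidy), "\n";
}
```

Which tags to declare depends on the content you expect; the lists above are assumptions, not a complete HTML5 set.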