HTML filter that is HTML5 compliant

Problem description

Is there a simple approach to add an HTML5 ruleset to HTMLPurifier?

HTMLPurifier (HP) can be configured to recognize new tags with:

// setup configurable HP instance
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'html5 draft');
$config->set('HTML.DefinitionRev', 1);
$config->set('Cache.DefinitionImpl', null); // no caching
$def = $config->getHTMLDefinition(true);

// add a new tag
$article = $def->addElement(
  'article',   // name
  'Block',     // content set
  'Flow',      // allowed children
  'Common',    // attribute collection
  array()      // attributes (none beyond the Common collection)
);

// add a new attribute
$def->addAttribute('a', 'contextmenu', "ID");

However, this is clearly a fair amount of work, since there are many new HTML5 tags and attributes to register, and the new global attributes should be combinable with existing HTML 4 tags as well. (It is difficult to judge from the docs how to augment the core rules.) So, is there a more useful config format or array structure for feeding new and updated tag, attribute, and context configuration (inline/block/empty/flow/...) into HTMLPurifier?

# mostly confused about how to extend existing tags:
$def->addAttribute('input', 'type', "...|...|...");

# or how to allow data-* attributes (if I actually wanted that):
$def->addAttribute("data-*", ...
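
One way to get the array-style configuration asked about here is to keep the rules in plain PHP arrays and loop over them. The following is a minimal sketch, not a complete HTML5 ruleset. The array layout, the variable names, and the `registerRules()` helper are hypothetical and not part of HTMLPurifier; only `addElement()` and `addAttribute()` are the library's own calls. The `Enum#...` and `Text` strings are HTMLPurifier attribute types, and the listed `type` values are just an illustrative subset. As far as I can tell, `addAttribute()` expects a concrete attribute name, so `data-*` attributes would have to be listed one by one (for example `data-id`).

```php
<?php
// Hypothetical ruleset format: one row per new element, in the same
// argument order as HTMLPurifier's addElement().
$html5Elements = [
    // name      content set  allowed children  attr collection  attributes
    ['article',  'Block',     'Flow',           'Common',        []],
    ['section',  'Block',     'Flow',           'Common',        []],
    ['mark',     'Inline',    'Inline',         'Common',        []],
    ['time',     'Inline',    'Inline',         'Common',        ['datetime' => 'Text']],
];

// Extra attributes for elements that already exist: [element, attribute, type].
$html5Attributes = [
    ['input', 'type',    'Enum#text,email,url,number,date'], // illustrative subset
    ['div',   'data-id', 'Text'],                            // one explicit data- attribute
];

// Hypothetical helper: replays the tables above against a definition object.
// $def is expected to be the raw HTML definition from the setup code
// (anything exposing addElement() and addAttribute() will do).
function registerRules($def, array $elements, array $attributes): void
{
    foreach ($elements as [$name, $type, $contents, $collection, $attr]) {
        $def->addElement($name, $type, $contents, $collection, $attr);
    }
    foreach ($attributes as [$element, $attribute, $valueType]) {
        $def->addAttribute($element, $attribute, $valueType);
    }
}
```

With the `$def` obtained in the setup code above, `registerRules($def, $html5Elements, $html5Attributes);` would apply the whole table in one call. Disabling a tag then amounts to removing or commenting out its row. Whether an element such as `input` is allowed at all still depends on the rest of the HTMLPurifier configuration.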

Of course, not all new HTML5 tags are fit for unrestricted allowance. HTMLPurifier is all about content filtering, and defining value constraints is where it's at. <canvas>, for example, might not be a big deal when it appears in user content, because it is useless at best without JavaScript (which HP already filters out). But other tags and attributes might be undesirable, so a flexible configuration structure is imperative for enabling or disabling tags and their associated attributes.

(Guess I should update some research...) But there is still no practical compendium or specification that suits an HP configuration (no, XML DTDs don't qualify).

  • http://simon.html5.org/html-elements
  • http://www.w3.org/TR/html5-diff/#new-elements
  • http://www.w3.org/TR/html5-diff/#new-attributes

(Uh, and HTML5 is no longer a draft.)

Recommended answer

The PHP tidy extension can be configured to recognize HTML5 tags: http://tidy.sourceforge.net/docs/quickref.html#new-blocklevel-tags
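
As a rough sketch of what that could look like, the snippet below passes the `new-blocklevel-tags` and `new-inline-tags` options to the tidy extension. The function name `tidyWithHtml5Tags()` and the particular tag lists are assumptions made for illustration, and the code assumes the tidy extension is installed. Newer libtidy builds may already know these tags. Note that tidy repairs markup rather than filtering it against an allowlist, so it does not replace HTMLPurifier's attribute and value checks.

```php
<?php
// Hypothetical helper: repairs an HTML fragment while telling tidy
// about a few HTML5 tags it might otherwise reject as unknown.
function tidyWithHtml5Tags(string $html): string
{
    if (!extension_loaded('tidy')) {
        throw new RuntimeException('The tidy extension is not loaded.');
    }

    $config = [
        // Space-separated lists; these tag choices are only an example.
        'new-blocklevel-tags' => 'article aside footer header nav section',
        'new-inline-tags'     => 'mark time',
        'show-body-only'      => true, // return the fragment, not a full document
    ];

    $tidy = new tidy();
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();

    return tidy_get_output($tidy);
}
```

A call such as `tidyWithHtml5Tags('<article><p>Hello <mark>world</mark></p></article>')` should then return the repaired fragment with the `article` and `mark` elements kept.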
