How to strip specific tags and specific attributes from a string?
Question
Here's the deal: I'm making a project to help teach HTML to people. Naturally, I'm afraid of that Scumbag Steve (see figure 1).
So I wanted to block ALL HTML tags, except those approved on a very specific whitelist.
Out of those approved HTML tags, I want to remove harmful attributes as well, such as onload and onmouseover. That should also be done according to a whitelist.
I've thought of regex, but I'm pretty sure it's evil and not very helpful for the job.
Could anyone give me a nudge in the right direction?
Thanks in advance.
Fig 1.
Solution
require_once 'library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
// this setting is needed because otherwise elements
// considered harmful, such as input, are removed automatically
$config->set('HTML.Trusted', true);
// this line says that only input, p and div will be accepted
$config->set('HTML.AllowedElements', 'input,p,div');
// set the allowed attributes for each tag
$config->set('HTML.AllowedAttributes', 'input.type,input.name,p.id,div.style');
// a more extensive way to manage attributes and elements; see the docs
// http://htmlpurifier.org/live/configdoc/plain.html
$def = $config->getHTMLDefinition(true);
$def->addAttribute('input', 'type', 'Enum#text');
$def->addAttribute('input', 'name', 'Text');
// create the purifier with this configuration
$purifier = new HTMLPurifier($config);
// purify the raw input; $html now holds the filtered markup
$html = $purifier->purify($raw_html);
NOTE: as you asked, this code works as a whitelist. Only input, p and div are accepted, and only certain attributes on each of them.
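To make the whitelist idea concrete without installing anything, here is a rough sketch of the same filtering done with PHP's built-in DOM extension instead of HTML Purifier. This is a different technique from the answer above, not HTML Purifier's API: whitelist_html and clean_node are hypothetical helper names, and the sample input is made up for illustration. The $allowed map mirrors the 'input.type,input.name,p.id,div.style' setting. Unlike HTML Purifier, this sketch does not validate attribute values (a style value or a URL passes through unchanged), so treat it as an illustration of the principle only.

```php
<?php
// Hypothetical helper: recursively drop every element that is not on the
// whitelist and every attribute that is not allowed for its element.
function clean_node(DOMNode $node, array $allowed): void
{
    // Copy the live child list first, because removing nodes mutates it.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMElement) {
            $tag = strtolower($child->tagName);
            if (!isset($allowed[$tag])) {
                // Tag is not whitelisted: remove it together with its content.
                $node->removeChild($child);
                continue;
            }
            foreach (iterator_to_array($child->attributes) as $attr) {
                if (!in_array(strtolower($attr->name), $allowed[$tag], true)) {
                    $child->removeAttributeNode($attr);
                }
            }
            clean_node($child, $allowed);
        } elseif (!($child instanceof DOMText)) {
            // Comments, processing instructions and similar nodes are dropped.
            $node->removeChild($child);
        }
    }
}

// Hypothetical helper: parse a fragment, filter it, and serialize it again.
function whitelist_html(string $rawHtml, array $allowed): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML('<html><body>' . $rawHtml . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    clean_node($body, $allowed);

    // Serialize only the children of <body>, not the wrapper itself.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Same whitelist as in the HTML Purifier configuration above.
$allowed = [
    'input' => ['type', 'name'],
    'p'     => ['id'],
    'div'   => ['style'],
];

// Made-up sample input with an event handler and a script element.
$raw_html = '<p id="intro" onmouseover="evil()">Hello</p><script>alert(1)</script>';
echo whitelist_html($raw_html, $allowed), "\n";
```

Here the script element is removed entirely and the onmouseover attribute is stripped, while the p element and its id survive. For anything user-facing, HTML Purifier remains the safer route, because it also checks attribute values and URI schemes.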