How to check if string of HTML is safe?


Problem description

In my application I need to send and receive HTML in string form. I'd like to keep things safe, so I need to check that the DOM elements in the string match the allowed tags, that the style declarations are valid, and that there are no injected scripts. The first thing that comes to mind is of course regexing the string, but this is tedious, might be buggy, and is surely inefficient. A second idea is to use XPath, but even though I've read some materials on the MDN site, I still have no idea how to implement this sample code:

const XPathResult           = Components.interfaces.nsIDOMXPathResult;

const ALLOWED_TAGS          = ['div', 'span', 'b', 'i', 'u', 'br', 'font', 'img'];
const ALLOWED_STYLES        = ['font-weight', 'font-size', 'font-family', 'text-decoration', 'color', 'background-color'];
const ALLOWED_ATTRIBUTES    = ['style', 'name'];

const XPATH_PART_TAGS = ALLOWED_TAGS.map(function (v) {
    return "name() != '" + v + "' and name() != '" + v.toUpperCase() + "'"; // case insensitive
}).join(' and ');

const XPATH_PART_ATTRS = ALLOWED_ATTRIBUTES.map(function (v) {
    return "name() != '" + v + "' and name() != '" + v.toUpperCase() + "'"; // case insensitive
}).join(' and ');


const XPATH_BAD_TAGS        = "//*[(namespace-uri() != 'http://www.w3.org/1999/xhtml') or (" + XPATH_PART_TAGS + ")]";
const XPATH_BAD_ATTRIBUTES  = "//@*[((namespace-uri() != 'http://www.w3.org/1999/xhtml') and (namespace-uri() != '')) or (" + XPATH_PART_ATTRS+ ")]";
const XPATH_STYLE           = "//@*[name() = 'style']";


/**
 * Checks if inline style definition is considered secure
 *
 * @param {String} styleValue value of style attribute
 * @return bool
 */
function isStyleSecure(styleValue) {
    var styles = styleValue.split(';'),
        style,
        name, value,
        i, l;
    for (i = 0, l = styles.length; i < l; i++) {
        style = styles[i].trim();
        if (style === '') {
            continue;
        }
        style = style.split(':', 2);
        if (style.length !== 2) {
            return false;
        }
        name = style[0].trim().toLowerCase();
        value = style[1].trim();

        if (ALLOWED_STYLES.indexOf(name) === -1) {
            return false;
        }
    }
    return true;
}

/**
 * Singleton that verifies if given XHTML document fragment is considered secure.
 * Uses whitelist-based checks on tag names, attribute names and document namespaces.
 *
 * @class
 * @namespace core.SecurityFilter.MessageSecurityFilter
 */
var MessageSecurityFilter = {
    /**
     * Checks if given document fragment is safe
     *
     * @param {nsIDOMElement} element root element of the XHTML document fragment to analyze
     * @return {bool} true if fragment is safe, false otherwise
     */
    isSecure: function SecurityFilter_isSecure(element) {
        var document = element.ownerDocument,
            result,
            attr;

        result = document.evaluate('//*', element, null, XPathResult.ANY_TYPE, null);

        result = document.evaluate(XPATH_BAD_TAGS, element, null, XPathResult.ANY_TYPE, null);
        if (result.iterateNext()) {
            return false;
        }
        result = document.evaluate(XPATH_BAD_ATTRIBUTES, element, null, XPathResult.ANY_TYPE, null);
        if ((attr = result.iterateNext())) {
            return false;
        }

        result = document.evaluate(XPATH_STYLE, element, null, XPathResult.ANY_TYPE, null);
        while ((attr = result.iterateNext())) {
            if (!isStyleSecure(attr.nodeValue)) {
                return false;
            }
        }

        return true;
    }

};

My first idea was to create a DocumentFragment and then check its nodes, either with a TreeWalker or by just following the DOM tree with .firstChild etc. But I guess this solution is unsafe, as it would leave me open to all injected scripts. Am I right?

Is there any other way?

Solution

Don't roll your own sanitizer. Use one that has been written by someone who knows the dark ugly corners of HTML, CSS, and JS.

See http://code.google.com/p/google-caja/wiki/JsHtmlSanitizer for a JavaScript sanitizer.
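
For illustration only, here is a minimal sketch of what a whitelist check over an already-parsed tree could look like without XPath. It is not taken from the sanitizer linked above and is not a substitute for one. The whitelists are copied from the question; the function names `isStyleAllowed` and `isTreeAllowed`, the extra check on style values, and the decision to reject comments and other non-text nodes are all assumptions made for this sketch.

```javascript
// Whitelists copied from the question; everything else here is hypothetical.
const ALLOWED_TAGS = new Set(['div', 'span', 'b', 'i', 'u', 'br', 'font', 'img']);
const ALLOWED_ATTRIBUTES = new Set(['style', 'name']);
const ALLOWED_STYLES = new Set(['font-weight', 'font-size', 'font-family',
                                'text-decoration', 'color', 'background-color']);

const XHTML_NS = 'http://www.w3.org/1999/xhtml';
const ELEMENT_NODE = 1; // same numeric value as Node.ELEMENT_NODE
const TEXT_NODE = 3;    // same numeric value as Node.TEXT_NODE

// Accept only "name: value" declarations whose property name is whitelisted.
// Assumption (not in the question's code): values containing url(...),
// expression(...) or backslash escapes are rejected as well.
function isStyleAllowed(styleValue) {
  for (const declaration of styleValue.split(';')) {
    if (declaration.trim() === '') continue;
    const colon = declaration.indexOf(':');
    if (colon === -1) return false;
    const name = declaration.slice(0, colon).trim().toLowerCase();
    const value = declaration.slice(colon + 1).trim().toLowerCase();
    if (!ALLOWED_STYLES.has(name)) return false;
    if (/url\s*\(|expression\s*\(|\\/.test(value)) return false;
  }
  return true;
}

// Walk every descendant of root (root itself is treated as a container and
// is not checked) and reject anything that is not explicitly whitelisted.
function isTreeAllowed(root) {
  const stack = Array.from(root.childNodes);
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) continue;
    if (node.nodeType !== ELEMENT_NODE) return false; // comments, PIs, ...
    if (node.namespaceURI !== XHTML_NS) return false; // e.g. SVG, MathML
    if (!ALLOWED_TAGS.has(node.localName)) return false;
    for (const attr of Array.from(node.attributes)) {
      const name = attr.name.toLowerCase();
      if (!ALLOWED_ATTRIBUTES.has(name)) return false;
      if (name === 'style' && !isStyleAllowed(attr.value)) return false;
    }
    stack.push(...Array.from(node.childNodes));
  }
  return true;
}
```

In a browser, the root could come from `new DOMParser().parseFromString(html, 'text/html').body`; as far as I know, a document produced this way has scripting disabled, so walking it should not run injected scripts, but treat that as something to verify for your target browsers. The sketch only answers "does this tree contain anything outside the whitelist"; it does not cover parser quirks or differences between what was checked and what is later re-serialized and inserted, which is a large part of why a vetted sanitizer is the recommendation.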
