JavaScript-based X/HTML & CSS sanitization

Question

Before everyone tells me that I shouldn't do client-side sanitization (I do in fact intend to do it on a client, though it could work in SSJS as well), let me clarify what I'm trying to do.

I'd like something akin to Google Caja or HTMLPurifier, but for JavaScript: a whitelist-based security approach that processes HTML and CSS (not already inserted into the DOM, of course, which would not be safe, but first obtained in string form) and then selectively filters out unsafe tags or attributes, either ignoring them, optionally including them as escaped text, or reporting them to the application for further processing, ideally in context. It would be cool if it could also reduce any JavaScript to a safe subset, as Google Caja does, but I know that would be asking a lot.
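For illustration only, the whitelist idea described above can be sketched over a plain object tree instead of a real DOM. The node shape (`tag`, `attrs`, `children`), the `ALLOWED_TAGS` table, the `SAFE_URL` pattern, and the `sanitizeTree` function are all hypothetical names invented for this sketch and do not come from any library.

```javascript
// Hypothetical whitelist: lower-case tag name -> set of allowed attribute names.
const ALLOWED_TAGS = new Map([
  ['p', new Set()],
  ['b', new Set()],
  ['a', new Set(['href'])],
]);

// Only absolute URLs with these schemes pass; everything else is rejected.
const SAFE_URL = /^(?:https?|mailto):/i;

// Returns a sanitized copy of `node`, or null if the node itself is rejected.
// Every rejection is appended to `report` so the application can inspect it.
function sanitizeTree(node, report, path = '') {
  // Text nodes are plain data here; they must still be escaped when serialized.
  if (typeof node === 'string') return node;

  const tag = String(node.tag).toLowerCase();
  const here = `${path}/${tag}`;
  const allowedAttrs = ALLOWED_TAGS.get(tag);
  if (!allowedAttrs) {
    report.push({ kind: 'tag', name: tag, path: here });
    return null; // drop the element together with its subtree
  }

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const key = name.toLowerCase();
    const ok = allowedAttrs.has(key) &&
      (key !== 'href' || SAFE_URL.test(String(value).trim()));
    if (ok) attrs[key] = String(value);
    else report.push({ kind: 'attribute', name: key, path: here });
  }

  const children = (node.children || [])
    .map((child) => sanitizeTree(child, report, here))
    .filter((child) => child !== null);
  return { tag, attrs, children };
}

const report = [];
const clean = sanitizeTree({
  tag: 'p',
  attrs: { onclick: 'steal()' },
  children: [
    'Hello ',
    { tag: 'a', attrs: { href: 'javascript:alert(1)' }, children: ['link'] },
    { tag: 'script', children: ['alert(2)'] },
  ],
}, report);
```

Here `clean` keeps the `p` and `a` elements without the `onclick` and `href` attributes, the `script` element is dropped, and `report` records each of the three rejections with its path.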

My use case is accessing untrusted XML/XHTML data obtained via JSONP (data from MediaWiki wikis before wiki processing, which allows raw but untrusted XML/HTML input) and allowing the user to run queries and transformations on that data (XQuery, jQuery, XSLT, etc.), taking advantage of HTML5 for offline use, IndexedDB storage, etc. The results can then be previewed on the same page where the user has viewed the input source and built or imported their queries.

The user can produce whatever output they want, so I won't sanitize what they are doing--if they want to inject JavaScript into the page, all power to them. But I do want to protect users who want to have confidence that they can add code which safely copies over targeted elements from the untrusted input, while disallowing them from copying unsafe input.

This should definitely be doable, but I am wondering if there are any libraries which already do this.

And if I am stuck implementing this on my own (though I'm curious in either case), I'd like proof of whether using innerHTML or DOM creation/appending BEFORE insertion into the document is safe in every way. For example, can events be accidentally triggered if I first run DOMParser, or rely on browser HTML parsing by using innerHTML to append raw HTML to a non-inserted div? I believe it should be safe, but I'm not sure whether exploitable DOM manipulation events could somehow occur before insertion.

Of course, the constructed DOM would need to be sanitized after that point, but I just want to verify I can safely build the DOM object itself for easier traversal and then worry about filtering out unwanted elements, attributes, and attribute values.

Thanks!

Recommended answer

The purpose of the ESAPI is to provide a simple interface that provides all the security functions a developer is likely to need in a clear, consistent, and easy to use way. The ESAPI architecture is very simple, just a collection of classes that encapsulate the key security operations most applications need.

JavaScript version of OWASP ESAPI: http://code.google.com/p/owasp-esapi-js

Input validation is extremely difficult to do effectively. HTML is easily the worst mashup of code and data of all time, as there are so many possible places to put code and so many different valid encodings. HTML is particularly difficult because it is not only hierarchical but also involves many different parsers (XML, HTML, JavaScript, VBScript, CSS, URL, etc.). While input validation is important and should always be performed, it is not a complete solution for injection attacks. It's better to use escaping as your primary defense. I haven't used HTML Purifier before, but it looks good and they have certainly put a lot of time and thought into it. Why not use their solution server side first, then apply any additional rules you'd like after that? I've seen some hacks that use nothing but combinations of [ ] ( ) to write code. There are hundreds more examples in the XSS (Cross Site Scripting) Cheat Sheet and at The Open Web Application Security Project (OWASP). For some things to watch out for, see the DOM based XSS Prevention Cheat Sheet.
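As a minimal sketch of what "escaping as your primary defense" can look like for untrusted text placed in HTML element content or a quoted attribute value: the `escapeHtml` helper and `HTML_ESCAPES` table below are illustrative names, not part of ESAPI or HTML Purifier, and this escaping does not make data safe inside script, CSS, or URL contexts, which need their own encoders.

```javascript
// Characters that can break out of HTML text or a quoted attribute value.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace each significant character with its entity so the browser
// treats the whole string as data rather than markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const escaped = escapeHtml('<img src=x onerror="alert(1)">');
console.log(escaped); // &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```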

HTML Purifier catches this mixed-encoding hack:

<A HREF="h
tt  p://6&#9;6.000146.0x7.147/">XSS</A>
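To see why a link like this is hard to validate by pattern matching, it can help to run it through a standards-based URL parser. The sketch below assumes the gaps in the snippet stand for a newline and tab characters (with `&#9;` being a tab once HTML-decoded) and uses the WHATWG `URL` class available in browsers and Node.js. The variable names are illustrative.

```javascript
// The link target with its whitespace written as explicit escapes:
// a newline and a tab inside "http", and a tab inside the host.
const obfuscated = 'h\ntt\tp://6\t6.000146.0x7.147/';

// The URL parser strips ASCII tabs and newlines, then normalizes the
// host, whose parts mix decimal (66, 147), octal (000146) and hex (0x7).
const parsed = new URL(obfuscated);

console.log(parsed.protocol); // "http:"
console.log(parsed.hostname); // "66.102.7.147"
```

A filter that only looks for the literal text `http://` or for a dotted-decimal address would miss this value, whereas comparing the parsed and normalized form does not depend on how the attacker spelled it.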

And this DIV background-image with a Unicode-escaped XSS exploit:

<DIV STYLE="background-image:\0075\0072\006C\0028'\006a\0061\0076\0061\0073\0063\0072\0069\0070\0074\003a\0061\006c\0065\0072\0074\0028.1027\0058.1053\0053\0027\0029'\0029">
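The style value above hides its payload behind CSS hex escapes (a backslash followed by up to six hex digits). A rough sketch of undoing that encoding before inspecting the value follows. `decodeCssEscapes` is a hypothetical helper that handles only hex escapes, not other backslash escapes such as `\(`.

```javascript
// Decode CSS hex escapes: "\" + 1-6 hex digits, optionally followed by
// a single whitespace character that terminates the escape.
function decodeCssEscapes(css) {
  return css.replace(
    /\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\n\r\f])?/g,
    (_, hex) => {
      const cp = parseInt(hex, 16);
      // Zero, surrogates and out-of-range values become U+FFFD.
      const invalid =
        cp === 0 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff);
      return invalid ? '\uFFFD' : String.fromCodePoint(cp);
    }
  );
}

// The start of the payload above, with backslashes doubled for a JS string.
const start =
  "\\0075\\0072\\006C\\0028'\\006a\\0061\\0076\\0061" +
  '\\0073\\0063\\0072\\0069\\0070\\0074\\003a';

const decoded = decodeCssEscapes(start);
console.log(decoded); // url('javascript:
```

Only after this kind of decoding does a check for `url(` or `javascript:` in a style value have a chance of matching.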

A bit of what you're up against: all 70 possible encodings of the character "<" in HTML and JavaScript:

<
%3C
&lt
&lt;
&LT
&LT;
&#60
&#060
&#0060
&#00060
&#000060
&#0000060
&#60;
&#060;
&#0060;
&#00060;
&#000060;
&#0000060;
&#x3c
&#x03c
&#x003c
&#x0003c
&#x00003c
&#x000003c
&#x3c;
&#x03c;
&#x003c;
&#x0003c;
&#x00003c;
&#x000003c;
&#X3c
&#X03c
&#X003c
&#X0003c
&#X00003c
&#X000003c
&#X3c;
&#X03c;
&#X003c;
&#X0003c;
&#X00003c;
&#X000003c;
&#x3C
&#x03C
&#x003C
&#x0003C
&#x00003C
&#x000003C
&#x3C;
&#x03C;
&#x003C;
&#x0003C;
&#x00003C;
&#x000003C;
&#X3C
&#X03C
&#X003C
&#X0003C
&#X00003C
&#X000003C
&#X3C;
&#X03C;
&#X003C;
&#X0003C;
&#X00003C;
&#X000003C;
\x3c
\x3C
\u003c
\u003C
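The list above follows a regular pattern, so it can be generated and checked mechanically. The sketch below rebuilds the 70 spellings and decodes each one back to `<`. Both `lessThanVariants` and `decodeToken` are hypothetical helpers written for this illustration, and `decodeToken` only understands a single token in one of the styles listed.

```javascript
// Rebuild the list: literal, percent-encoded, named, numeric, and JS escapes.
function lessThanVariants() {
  const out = ['<', '%3C', '&lt', '&lt;', '&LT', '&LT;'];
  const pads = ['', '0', '00', '000', '0000', '00000'];
  const semis = ['', ';'];
  for (const semi of semis) {
    for (const pad of pads) out.push(`&#${pad}60${semi}`);
  }
  for (const x of ['x', 'X']) {
    for (const c of ['c', 'C']) {
      for (const semi of semis) {
        for (const pad of pads) out.push(`&#${x}${pad}3${c}${semi}`);
      }
    }
  }
  out.push('\\x3c', '\\x3C', '\\u003c', '\\u003C');
  return out;
}

// Decode one token; anything unrecognized is returned unchanged.
function decodeToken(token) {
  let m;
  if ((m = /^%([0-9a-fA-F]{2})$/.exec(token))) {
    return String.fromCharCode(parseInt(m[1], 16)); // URL percent-encoding
  }
  // Named entity, semicolon optional. Only "lt" and "LT" mean "<".
  if (/^&(?:lt|LT);?$/.test(token)) return '<';
  if ((m = /^&#[xX]([0-9a-fA-F]+);?$/.exec(token))) {
    return String.fromCodePoint(parseInt(m[1], 16)); // hex numeric entity
  }
  if ((m = /^&#([0-9]+);?$/.exec(token))) {
    return String.fromCodePoint(parseInt(m[1], 10)); // decimal numeric entity
  }
  if ((m = /^\\(?:x([0-9a-fA-F]{2})|u([0-9a-fA-F]{4}))$/.exec(token))) {
    return String.fromCharCode(parseInt(m[1] || m[2], 16)); // JS string escape
  }
  return token;
}

const variants = lessThanVariants();
const allDecode = variants.every((v) => decodeToken(v) === '<');
console.log(variants.length, allDecode); // 70 true
```

The point is the same one the list makes: a filter has to decode input in the same way the browser will before it can decide whether a dangerous character is present.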
