Sanitize all scripts from html string


Question



The HTML5 clipboard is awesome, but I am looking for a way to make it safe.

The user is pasting text/html into my webpage. This allows them to paste images, tables, etc.

I am looking for a way to remove all scripts from the pasted content, before I add it to the page.

I need to remove <script> elements, as well as other ways of executing scripts like

<img src="x" onerror="alert('Hacked!')">

(and any others)

I do not want to remove style elements, or any other sorts of elements. (They are actually pasting into an iframe, so styles won't affect anything else.)

Solution

You could use a sanitizer like Google Caja to remove malicious JavaScript - you could even use it to strip all JavaScript content if desired.
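To make concrete what "stripping all JavaScript" involves, here is a minimal sketch using only Python's standard `html.parser`. It is not Caja and not a vetted sanitizer: the names `ScriptStripper`, `strip_scripts`, `URL_ATTRS` and `BAD_SCHEMES` are made up for illustration, and the attribute and scheme lists are deliberately short and incomplete. It drops `<script>` elements, `on*` event-handler attributes, and `javascript:`/`vbscript:` URLs, while leaving `<style>` and other elements in place as the question asks. Real-world input has many more vectors, so treat this as a picture of the idea rather than a defense to deploy.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: an illustrative, incomplete list of URL-carrying attributes.
URL_ATTRS = {"href", "src", "action", "formaction", "xlink:href"}
# Assumption: an illustrative, incomplete list of script-capable URL schemes.
BAD_SCHEMES = ("javascript:", "vbscript:")


class ScriptStripper(HTMLParser):
    """Re-serializes HTML while dropping the script vectors listed above.

    Comments and doctype declarations are dropped too, because their
    handlers are left as the parser's default no-ops.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._in_script = False
        self._in_style = False

    def _render(self, tag, attrs, self_closing=False):
        kept = []
        for name, value in attrs:
            if name.startswith("on"):  # onerror, onclick, onload, ...
                continue
            if name in URL_ATTRS and value is not None:
                # Browsers ignore whitespace/control chars inside a scheme.
                compact = "".join(ch for ch in value if ch > " ").lower()
                if compact.startswith(BAD_SCHEMES):
                    continue
            kept.append(name if value is None else f'{name}="{escape(value)}"')
        inner = " ".join([tag] + kept)
        return f"<{inner}{' /' if self_closing else ''}>"

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            return
        if tag == "style":
            self._in_style = True
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag != "script":  # a self-closed <script/> is dropped as well
            self.out.append(self._render(tag, attrs, self_closing=True))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False
            return
        if tag == "style":
            self._in_style = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._in_script:
            return
        # CSS is emitted untouched; all other text is re-escaped.
        self.out.append(data if self._in_style else escape(data, quote=False))


def strip_scripts(html_text: str) -> str:
    parser = ScriptStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `strip_scripts` turns the `<img src="x" onerror="alert('Hacked!')">` from the question into `<img src="x">`. A maintained, well-tested sanitizer library is still the safer choice, because differences between this parser and a browser's parser are exactly where bypasses tend to hide.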

However, I question your goals. Is your aim to prevent self-XSS? Unless you output the HTML somewhere, there is no danger to the user. If you output the HTML to the same user, and there are ways of entering the content other than pasting, then you should make sure you protect the page against CSRF. This would stop an attacker from inserting their own malicious JavaScript under the authorisation of the current user.
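One common way to get that CSRF protection is a per-session token that every content-changing request must echo back. The sketch below assumes a dict-like `session` object; the function names `issue_csrf_token` and `is_valid_csrf_token` and the `"csrf_token"` key are hypothetical, and most web frameworks ship their own equivalent that should be preferred.

```python
import hmac
import secrets
from typing import Optional


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it in the session, and return it
    so it can be embedded in the form or request that submits content."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token  # hypothetical session key
    return token


def is_valid_csrf_token(session: dict, submitted: Optional[str]) -> bool:
    """Accept the request only if it carries the token issued to this session."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison; bytes are used so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A request forged from another site cannot read the token, so it fails the check even though the browser attaches the user's cookies to it.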

If you output the HTML to other users, you may wish to sanitize the content server side. If HTML content isn't allowed at all, then you should HTML encode it on output, so that a <script> tag is emitted as &lt;script&gt; and shows up in the browser as the literal text <script> rather than being interpreted as a code block.
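A minimal sketch of that encoding step, using `html.escape` from Python's standard library (the wrapper name `render_as_text` is hypothetical; a template engine with auto-escaping does the same job):

```python
from html import escape


def render_as_text(untrusted: str) -> str:
    """HTML-encode untrusted input so the browser displays it as plain text."""
    return escape(untrusted, quote=True)


print(render_as_text("<script>alert('Hacked!')</script>"))
# prints: &lt;script&gt;alert(&#x27;Hacked!&#x27;)&lt;/script&gt;
```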

If you need to output HTML but without scripts, you should sanitize it server side, and you should also implement a Content Security Policy (CSP). With the correct policy you can prevent inline scripts from running at all in modern browsers. The CSP then acts as a second layer: if a bug is later found in your chosen sanitizer, the policy still keeps injected inline script from running.
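As a rough sketch, such a policy is a single response header. The helper below is hypothetical (`build_csp_header` is not part of any framework), and the directive values are one plausible starting point rather than a recommended policy for every site. Because neither `script-src` nor, by default, `style-src` includes `'unsafe-inline'`, inline `<script>` blocks, `on*` handler attributes, and inline styles are all refused by browsers that enforce CSP.

```python
from typing import Tuple


def build_csp_header(allow_inline_styles: bool = False) -> Tuple[str, str]:
    """Return a (header name, header value) pair for a restrictive policy."""
    style_src = "'self' 'unsafe-inline'" if allow_inline_styles else "'self'"
    directives = {
        "default-src": "'self'",
        "script-src": "'self'",   # no 'unsafe-inline': inline scripts are blocked
        "style-src": style_src,
        "img-src": "'self' data:",  # assumption: pasted images may be data: URLs
        "object-src": "'none'",
        "base-uri": "'none'",
    }
    value = "; ".join(f"{name} {sources}" for name, sources in directives.items())
    return "Content-Security-Policy", value


name, value = build_csp_header()
print(f"{name}: {value}")
```

If the page that displays pasted content never needs script at all, `script-src 'none'` is a stricter alternative for that page.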

You mention that you want to support styles - note that CSS stylesheets can also contain code. Internet Explorer supported this (for example through CSS expressions), as did old versions of Firefox. However, your CSP should prevent this if you disallow inline styles.
