.NET HTML whitelisting (anti-XSS / Cross-Site Scripting)

Question

I've got the common situation where user input is allowed to use a subset of HTML (entered through TinyMCE). I need some server-side protection against XSS attacks and am looking for a well-tested tool that people actually use for this. On the PHP side I see lots of libraries like HTMLPurifier that do the job, but I can't seem to find anything for .NET.

I'm basically looking for a library that filters input down to a whitelist of tags and of attributes on those tags, and that does the right thing with "difficult" attributes like a:href and img:src.

I've seen Jeff Atwood's post at http://refactormycode.com/codes/333-sanitize-html, but I don't know how up-to-date it is. Does it have any bearing at all to what the site is currently using? And in any case I'm not sure I'm comfortable with that strategy of trying to regexp out valid input.

This blog post lays out what seems to be a much more compelling strategy:

http://blog.bvsoftware.com/post/2009/01/08/How-to-filter-Html-Input-to-Prevent-Cross-Site-Scripting-but-Still-Allow-Design.aspx

This method is to actually parse the HTML into a DOM, validate that, then rebuild valid HTML from it. If the HTML parsing can handle malformed HTML sensibly, then great. If not, no big deal -- I can demand well-formed HTML since the users should be using the tinyMCE editor. In either case I'm rewriting what I know is safe, well-formed HTML.

The problem is that's just a description, without a link to any library that actually executes that algorithm.

Does such a library exist? If not, what would be a good .NET HTML parsing engine? And what regular expressions should be used to perform extra validation on a:href and img:src? Am I missing something else important here?
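For the a:href and img:src case, one option that avoids regular expressions is to look only at the URL's scheme and compare it against a short allowlist. Below is a minimal sketch of that idea, written in Python purely to keep it self-contained; the same logic should carry over to .NET. The names `is_safe_url` and `ALLOWED_SCHEMES`, and the choice of allowed schemes, are assumptions made for illustration and do not come from any library.

```python
from html import unescape

# Hypothetical allowlist; adjust to what the application actually needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(value):
    """Return True if value is a relative URL or uses an allowlisted scheme."""
    # Decode character references defensively, in case the caller passes a raw
    # attribute value instead of one already decoded by an HTML parser.
    decoded = unescape(value)
    # Browsers ignore tabs and newlines inside a URL and strip leading
    # whitespace and control characters, so drop those before inspecting it.
    # (Plain spaces are dropped too; that only makes the check stricter.)
    cleaned = "".join(ch for ch in decoded if ch > " " and ch != "\x7f")
    for i, ch in enumerate(cleaned):
        if ch in "/?#":
            # Path, query or fragment started before any colon: relative URL.
            return True
        if ch == ":":
            # Everything before the first colon is the scheme.
            return cleaned[:i].lower() in ALLOWED_SCHEMES
    return True  # no colon at all, so this is a relative URL
```

With this sketch, `is_safe_url("https://example.com/a?b=c")` and `is_safe_url("/images/logo.png")` are accepted, while `javascript:` and `data:` URLs are rejected, including mixed-case and tab-obfuscated variants. Anything not explicitly allowed is refused, which is the whitelist stance applied to attribute values.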

I don't want to re-implement a buggy wheel here. Surely there are some commonly used libraries out there. Any ideas?

Recommended answer

Well, if you want to parse, and you're worried about invalid (X)HTML coming in, then the HTML Agility Pack is probably the best thing to use for parsing. Remember, though, that it's not just elements you need to allow, but also the attributes on those allowed elements. (Of course you should work from a whitelist of allowed elements and their attributes, rather than trying to strip out things that might be dodgy via a blacklist.)
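To make the "whitelist of elements and their attributes" idea concrete, here is a rough sketch of a parse, filter and rebuild pass. It uses Python's standard-library `html.parser` as a stand-in for a parser such as the HTML Agility Pack, so it shows the shape of the approach and is not that library's API. The contents of `ALLOWED`, and the names `WhitelistSanitizer` and `sanitize`, are hypothetical. URL-valued attributes such as href and src are deliberately left off the whitelist here, because their values need checking of their own.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attribute names allowed on that tag.
ALLOWED = {
    "p": set(), "b": set(), "i": set(), "em": set(), "strong": set(),
    "ul": set(), "ol": set(), "li": set(), "br": set(),
    "a": {"title"},
}
VOID_TAGS = {"br"}                       # elements that take no closing tag
DROP_WITH_CONTENT = {"script", "style"}  # discard these along with their text


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # pieces of the rebuilt document
        self.open_tags = []    # whitelisted tags currently open in the output
        self.skip_depth = 0    # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # unknown tag: drop the tag itself, keep its text content
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return
        # Close anything still open inside this tag so the output nests properly.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))  # re-encode all text

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:  # close whatever the input left open
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    return parser.result()
```

The key design choice is that nothing from the input is copied through verbatim: every tag and attribute in the output is re-emitted from the whitelist, and all text is re-escaped. An `onclick` attribute or an unknown element therefore disappears by default, with no need to recognise it as dangerous.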

There's also the OWASP AntiSamy Project, which is still a work in progress. They also have a test site you can try to XSS.

Regex for this is probably too risky IMO.
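As a small illustration of why, consider a blacklist-style filter that deletes anything resembling a script tag in a single pass. The function `naive_strip_scripts` below is a made-up example, written in Python only for brevity; the failure does not depend on the language.

```python
import re


def naive_strip_scripts(html):
    # Blacklist approach: remove every opening or closing <script> tag.
    return re.sub(r"</?script[^>]*>", "", html, flags=re.IGNORECASE)


# Removing the inner tags splices the outer fragments back into a real tag.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
print(naive_strip_scripts(payload))  # prints <script>alert(1)</script>

# Markup the pattern never anticipated passes through untouched.
print(naive_strip_scripts("<img src=x onerror=alert(1)>"))
```

Repeating the substitution until nothing changes would handle the first payload, but the second shows the deeper problem: a blacklist only blocks what its author thought of.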
