Writing an XSS Filter for (X)HTML Based on a White List


Question

    I need to implement a simple and efficient XSS filter in C++ for CppCMS. I can't use existing high-quality filters written in PHP because CppCMS is a high-performance framework that uses C++.

    The basic idea is to provide a filter that has a white list of HTML tags and a white list of options for these tags. For example, typical HTML input can consist of <b> and <i> tags and an <a> tag with href. But a straightforward implementation is not good enough, because even allowed simple links may include XSS:

    <a href="javascript:alert('XSS')">Click On Me</a>
    

    There are many other examples to be found. So I also thought about creating a white list of prefixes for attributes like href/src, so that I always check whether the value starts with (https?|ftp)://
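The prefix check described above could be sketched roughly as follows. This is only an illustration, not CppCMS code: the function name has_allowed_scheme is hypothetical, and it assumes the attribute value has already been extracted and entity-decoded by the parser. Relative URLs are rejected as well, which may or may not be what you want.

```cpp
#include <cassert>   // only needed for the checks mentioned below
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true only if `url` begins with one of the
// whitelisted prefixes http://, https:// or ftp:// (compared case-insensitively).
// Anything else, including javascript: URLs, leading whitespace and relative
// URLs, is rejected.
bool has_allowed_scheme(const std::string& url)
{
    static const char* const prefixes[] = { "http://", "https://", "ftp://" };
    for (const char* p : prefixes) {
        const std::string prefix(p);
        if (url.size() < prefix.size())
            continue;
        bool match = true;
        for (std::size_t i = 0; i < prefix.size(); ++i) {
            const unsigned char c = static_cast<unsigned char>(url[i]);
            if (std::tolower(c) != prefix[i]) {
                match = false;
                break;
            }
        }
        if (match)
            return true;
    }
    return false;
}
```

With this sketch, has_allowed_scheme("https://example.com/") is accepted, while has_allowed_scheme("javascript:alert('XSS')") is rejected because the value does not start with any whitelisted prefix.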

    Questions:

    • Are these assumptions good enough for most purposes? That is, if I do not allow options for style tags and check src/href against a white list of prefixes, does that solve the XSS problems? Are there problems that can't be fixed this way?
    • Is there a good reference for the formal grammar of HTML/XHTML, in order to write a simple parser that would clean up all incorrect or forbidden tags like <script>?

    Answer

    You can take a look at the AntiSamy project, which tries to accomplish the same thing. It's Java and .NET, though.

    Edit 1, a bit extra:

    You can potentially come up with a very strict white list. It should be well structured, pretty tight and not very flexible. When you combine flexibility with so many tags, attributes and different browsers, you generally end up with an XSS vulnerability.

    I don't know what your requirements are, but I'd go with strict and simple tag support (only b, li, h1, etc.), then strict attribute support based on the tag (for example, src is only valid on an img tag and href only on an a tag), and then whitelisting of the attribute values, as you stated: http|https|ftp for URLs, or style="color|background-color", etc.
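One possible way to represent such a per-tag attribute whitelist is a map from tag name to the set of attributes permitted on it. The sketch below is an assumption for illustration only: the type Whitelist, the functions default_whitelist, tag_allowed and attribute_allowed, and the particular tags listed are all hypothetical, and tag and attribute names are assumed to be lowercased by the parser before lookup.

```cpp
#include <cassert>   // only needed for the checks mentioned below
#include <map>
#include <set>
#include <string>

// Hypothetical whitelist: tag name -> attributes permitted on that tag.
// A tag that is not a key is forbidden; an empty set means "no attributes".
using Whitelist = std::map<std::string, std::set<std::string>>;

const Whitelist& default_whitelist()
{
    static const Whitelist wl = {
        { "b",   std::set<std::string>{} },
        { "i",   std::set<std::string>{} },
        { "li",  std::set<std::string>{} },
        { "h1",  std::set<std::string>{} },
        { "a",   std::set<std::string>{ "href" } },
        { "img", std::set<std::string>{ "src" } },
    };
    return wl;
}

// True if the (lowercased) tag name is whitelisted at all.
bool tag_allowed(const Whitelist& wl, const std::string& tag)
{
    return wl.find(tag) != wl.end();
}

// True only if the tag is whitelisted and the attribute is listed for it.
bool attribute_allowed(const Whitelist& wl,
                       const std::string& tag,
                       const std::string& attr)
{
    const auto it = wl.find(tag);
    return it != wl.end() && it->second.count(attr) != 0;
}
```

Because everything not listed is rejected, an unknown tag such as x or an attribute such as style on b fails the lookup without needing a dedicated rule. The attribute values that pass still need their own checks.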

    Consider this one:

    <x style="express/**/ion:(alert(/bah!/))">
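The example above slips past a filter that merely searches for the word "expression". A whitelist turns the problem around: only known property names with very plain values are accepted. The following is a rough sketch under stated assumptions, not a complete CSS parser: the function names style_allowed and trim_lower are hypothetical, the property list is just the color|background-color pair mentioned earlier, and values are restricted to ASCII letters, digits and '#', so even legitimate values such as rgb(...) are rejected.

```cpp
#include <cassert>   // only needed for the checks mentioned below
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Strip surrounding whitespace and lowercase the rest.
std::string trim_lower(const std::string& s)
{
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    std::string out = s.substr(b, e - b);
    for (char& c : out)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return out;
}

// Hypothetical check: every declaration in `style` must use a whitelisted
// property and a value made only of ASCII letters, digits or '#'.
bool style_allowed(const std::string& style)
{
    static const std::set<std::string> props = { "color", "background-color" };
    std::istringstream in(style);
    std::string decl;
    while (std::getline(in, decl, ';')) {
        if (trim_lower(decl).empty())
            continue;                       // tolerate empty declarations
        const std::size_t colon = decl.find(':');
        if (colon == std::string::npos)
            return false;                   // no "property: value" shape
        const std::string prop  = trim_lower(decl.substr(0, colon));
        const std::string value = trim_lower(decl.substr(colon + 1));
        if (props.count(prop) == 0 || value.empty())
            return false;
        for (char ch : value) {
            const bool ok = (ch >= 'a' && ch <= 'z') ||
                            (ch >= '0' && ch <= '9') || ch == '#';
            if (!ok)
                return false;               // rejects '(', '/', '*', spaces...
        }
    }
    return true;
}
```

Under these rules "color: red" passes, while the express/**/ion example is rejected twice over: its property name is not on the list and its value contains parentheses.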

    You also need to think about some character whitelisting or UTF-8 normalization, because different encodings can cause awkward issues, such as new lines in attributes or invalid UTF-8 sequences.
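As a sketch of the kind of check meant here, the function below rejects attribute values that are not well-formed UTF-8 or that contain control characters such as new lines. The name is_clean_utf8_attribute is hypothetical, and rejecting the input outright (rather than normalizing it) is an assumption chosen for simplicity.

```cpp
#include <cassert>   // only needed for the checks mentioned below
#include <cstddef>
#include <string>

// Hypothetical check for an attribute value: true only if `s` is well-formed
// UTF-8 (no stray, truncated or overlong sequences, no surrogates, nothing
// above U+10FFFF) and contains no ASCII control characters.
bool is_clean_utf8_attribute(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp, min;
        if (c < 0x80)                { len = 1; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return false;                    // stray continuation or bad lead byte
        if (n - i < len)
            return false;                     // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;                 // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min)                      return false; // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF)  return false; // UTF-16 surrogate
        if (cp > 0x10FFFF)                 return false; // outside Unicode
        if (cp < 0x20 || cp == 0x7F)       return false; // control character
        i += len;
    }
    return true;
}
```

The overlong check matters because a sequence like 0xC0 0xBC decodes to '<' in a sloppy decoder, which would let markup through a filter that only looks at raw bytes.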
