What is the best way to sanitize user inputs?


Question

I need to prevent XSS attacks as far as possible, and in a centralized way, so that I don't have to explicitly sanitize each input.

My question: is it better to sanitize all inputs at the URL/request processing level, to encode/sanitize inputs before serving them, or to do it at the presentation level (output sanitization)? Which one is better, and why?

Recommended answer

There are two areas you need to watch:

  1. Anywhere you use input as part of a script in any language, most notably SQL. For SQL in particular, the only recommended approach is to use parameterized queries (which leaves unescaped content in the database, but purely as strings: that's ideal). Anything that relies on magic quoting of characters before substituting them directly into the SQL string is inferior, because it is so easy to get wrong. Anything that can't be done with a parameterized query is something a service secured against SQL injection should never allow a user to specify.

  2. Anywhere you present something that was input as output. The source of the input can be direct (including via a cookie) or indirect (via the database or a file). In this case, your default approach should be to make the text the user sees be exactly the text that was input. That's very easy to implement correctly, since the only characters you actually have to quote are < and &, and you can wrap it all in <pre> for display.
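As a rough illustration of both areas, here is a minimal Python sketch. It assumes an SQLite database and a `comments` table, and the helper names `save_comment` and `render_comment` are made up for this example; none of these come from the original answer. The input is stored untouched through a parameterized query and is only escaped at the moment it is written into HTML.

```python
import html
import sqlite3


def save_comment(conn, author, body):
    # Hypothetical helper: the "?" placeholders keep the values out of the
    # SQL text, so the input is stored as a plain string, unescaped.
    conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)",
        (author, body),
    )


def render_comment(body):
    # Hypothetical helper: escape at output time. html.escape replaces
    # &, < and > (and quotes too unless quote=False is passed).
    return "<pre>" + html.escape(body, quote=False) + "</pre>"


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, body TEXT)")

hostile = "Robert'); DROP TABLE comments;-- <script>alert(1)</script>"
save_comment(conn, "bob", hostile)

stored = conn.execute("SELECT body FROM comments").fetchone()[0]
page = render_comment(stored)
```

The database ends up holding exactly what the user typed, and the markup in it only becomes harmless text when `render_comment` builds the page.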

But that's often not enough. For example, you might want to allow users to do some sort of formatting. This is where it is all too easy to go wrong. The simplest approach in this case is to parse the input and detect all the formatting instructions; everything else needs to be quoted properly. You should store the formatted version in the database as an extra column so that you don't need to do much work when returning it to the user, but you should also store the original version that the user entered so that you can search over it. Do not mix them up! Really! Audit your application to make totally sure that you get this right (or, better yet, get someone else to do the audit).
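A small sketch of that idea in Python follows. The `**bold**` marker, the function names, and the two-field record are assumptions chosen for illustration, not something the answer specifies. The input is split on the one recognized formatting instruction, every piece of text is escaped, and only the recognized instruction is turned into a tag.

```python
import html
import re

# Assumed formatting syntax for this sketch: **text** means bold.
BOLD = re.compile(r"\*\*(.+?)\*\*")


def render_formatted(raw):
    # re.split with one capture group alternates between plain text
    # (even indexes) and the captured bold content (odd indexes).
    parts = BOLD.split(raw)
    out = []
    for i, part in enumerate(parts):
        text = html.escape(part, quote=False)  # quote everything else
        out.append(f"<b>{text}</b>" if i % 2 else text)
    return "".join(out)


def make_row(raw):
    # Keep both versions, in separate fields, and never mix them up:
    # "raw" is for searching and editing, "rendered" is for output.
    return {"raw": raw, "rendered": render_formatted(raw)}


row = make_row("a & **b** <script>")
```

Here `row["raw"]` is what would go in the original-text column and `row["rendered"]` in the extra formatted column; only the latter is ever written into a page.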

But everything said about being careful with SQL still applies, and there are many HTML tags (for example,
