HTML/XSS escape on input vs output

Question

From everything I've seen, the convention for escaping HTML in user-entered content (for the purposes of preventing XSS) is to do it when rendering content. Most templating languages seem to do it by default, and I've come across things like this Stack Overflow answer arguing that this logic is the job of the presentation layer.

So my question is, why is this the case? To me it seems cleaner to escape on input (i.e. form or model validation) so you can work under the assumption that anything in the database is safe to display on a page, for the following reasons:

  1. Variety of output formats - for a modern web app, you may be using a combination of server-side HTML rendering, a JavaScript web app using AJAX/JSON, and a mobile app that receives JSON (and which may or may not have some webviews, which may be JavaScript apps or server-rendered HTML). So you have to deal with HTML escaping all over the place. But input will always get instantiated as a model (and validated) before being saved to the database, and your models can all inherit from the same base class.

You already have to be careful about input to prevent code-injection attacks (granted, this is usually abstracted to the ORM or db cursor, but still), so why not also handle HTML escaping here, so you don't have to worry about anything security-related on output?

I would love to hear the arguments as to why HTML escaping on page render is preferred.

Recommended answer

The original misconception

Do not confuse sanitization of output with validation.

While <script>alert(1);</script> is a perfectly valid username, it definitely must be escaped before showing on the website.
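As a rough illustration of that point, the sketch below accepts the raw username as valid and escapes it only when building HTML. It uses Python's standard `html.escape`; the function name `render_greeting` is a hypothetical helper made up for this example, not something from the original answer.

```python
import html


def render_greeting(username: str) -> str:
    """Build an HTML fragment, escaping the raw username at render time."""
    # The username is stored and passed around unmodified; only the
    # HTML view turns it into markup-safe text.
    return "<p>Hello, " + html.escape(username) + "!</p>"


raw_username = "<script>alert(1);</script>"  # valid as data, unsafe as markup
print(render_greeting(raw_username))
```

The stored value stays exactly what the user typed; the browser receives entity-encoded text and displays it literally instead of executing it.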

And yes, there is such a thing as "presentation logic", which is unrelated to "domain business logic". Presentation logic is what the presentation layer deals with, and the View instances in particular. In a well-written MVC, Views are full-blown objects (contrary to what RoR would try to tell you), which, when applied in a web context, juggle multiple templates.

Different output formats should be handled by different views. The rules and restrictions, which govern HTML, XML, JSON and other formats, are different in each case.
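One way to picture this is two small views over the same raw value, each applying the encoding its own format requires. This is only a sketch using Python's standard `html` and `json` modules; `html_view` and `json_view` are hypothetical names chosen for the example.

```python
import html
import json


def html_view(comment: str) -> str:
    """HTML view: entity-encode the raw text for use inside markup."""
    return "<li>" + html.escape(comment) + "</li>"


def json_view(comment: str) -> str:
    """JSON view: let the JSON encoder apply JSON's own quoting rules."""
    return json.dumps({"comment": comment})


raw_comment = 'Tom & "Jerry" <b>'
print(html_view(raw_comment))
print(json_view(raw_comment))
```

Had the comment been HTML-escaped on input, the JSON consumer (for instance a native mobile app) would receive `&amp;` and `&lt;` as literal text and would have to undo an encoding it never asked for.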

You always need to store the original input (sanitized to avoid injections, if you are not using prepared statements), because someone might need to edit it at some point.
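A minimal sketch of that idea, assuming an in-memory SQLite database and a made-up `users` table: the parameterized query handles the injection concern, the original text is stored untouched, and HTML escaping happens only when the value is placed back into an edit form.

```python
import html
import sqlite3

# Assumption for illustration: a throwaway in-memory database and schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

raw_name = "Robert'); DROP TABLE users;-- <b>"

# The placeholder keeps the value out of the SQL text, so no manual
# sanitizing is needed to prevent SQL injection.
conn.execute("INSERT INTO users (name) VALUES (?)", (raw_name,))

# The database returns exactly what the user typed, ready for editing.
stored_name = conn.execute("SELECT name FROM users WHERE id = 1").fetchone()[0]

# Escaping is applied only when building the HTML edit form.
edit_field = '<input name="name" value="' + html.escape(stored_name) + '">'
print(edit_field)
```

If the escaped form had been stored instead, the edit form would show entities to the user, or the application would need to unescape before every edit.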

And storing both the original and an XSS-safe "public" version is wasteful. If you want to store sanitized output because sanitizing it on every request takes too many resources, then you are already barking up the wrong tree. This is a case where you use a cache instead of polluting the database.
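To make the caching suggestion concrete, here is a small sketch that memoizes the rendered fragment with `functools.lru_cache` instead of persisting an escaped copy. The in-process cache and the `render_comment` name are assumptions for the example; a real deployment might well use a shared cache instead.

```python
import html
from functools import lru_cache


@lru_cache(maxsize=1024)
def render_comment(raw_comment: str) -> str:
    """Return the escaped HTML fragment, computing it once per distinct input."""
    return "<p>" + html.escape(raw_comment) + "</p>"


raw_comment = "<script>alert(1);</script>"
first = render_comment(raw_comment)   # computed and cached
second = render_comment(raw_comment)  # served from the cache
print(second)
```

The database still holds only the raw comment, and the cached fragment can be discarded at any time without losing data.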
