Django templatetag for rendering a subset of html

Question

I have some html (in this case created via TinyMCE) that I would like to add to a page. However, for security reason, I don't want to just print everything the user has entered.

Does anyone know of a templatetag (a filter, preferably) that will allow only a safe subset of html to be rendered?

I realize that markdown and others do this. However, they also add additional markup syntax which could be confusing for my users, since they are using a rich text editor that doesn't know about markdown.

Recommended answer

There's removetags, but it's a blacklisting approach which fails to remove tags when they don't look exactly like the well-formed tags Django expects, and of course since it doesn't attempt to remove attributes it is totally vulnerable to the 1,000 other ways of script-injection that don't involve the <script> tag. It's a trap, offering the illusion of safety whilst actually providing no real security at all.
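
To make the weakness concrete, here is a minimal sketch of a blacklist-style tag stripper. It is not Django's actual removetags code; `naive_remove_tags` is a hypothetical stand-in that only assumes the same general idea (delete tags whose names appear on a list, leave everything else alone).

```python
import re


def naive_remove_tags(html, tags):
    """Hypothetical blacklist stripper: delete opening and closing tags
    whose names are listed in `tags`, and touch nothing else."""
    names = "|".join(re.escape(t) for t in tags)
    pattern = re.compile(r"</?(?:%s)(?:\s[^>]*)?>" % names, re.IGNORECASE)
    return pattern.sub("", html)


# The well-formed case works as hoped.
print(naive_remove_tags("<script>alert(1)</script>", ["script"]))
# alert(1)

# Attributes are never inspected, so an event handler passes straight through.
print(naive_remove_tags('<img src=x onerror="alert(1)">', ["script"]))
# <img src=x onerror="alert(1)">

# A tag split around a decoy is reassembled by the removal itself.
print(naive_remove_tags("<scr<script>ipt>alert(1)</scr</script>ipt>", ["script"]))
# <script>alert(1)</script>
```

Adding more names to the list does not help with the second and third inputs, which is the point of the paragraph above.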

HTML-sanitisation approaches based on regex hacking are almost inevitably a total fail. Using a real HTML parser to get an object model for the submitted content, then filtering and re-serialising in a known-good format, is generally the most reliable approach.

If your rich text editor outputs XHTML it's easy, just use minidom or etree to parse the document then walk over it removing all but known-good elements and attributes and finally convert back to safe XML. If, on the other hand, it spits out HTML, or allows the user to input raw HTML, you may need to use something like BeautifulSoup on it. See this question for some discussion.
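
For the XHTML case, the parse, filter and re-serialise steps could look roughly like the sketch below, using etree from the standard library. The names `sanitize_xhtml`, `ALLOWED_TAGS`, `ALLOWED_ATTRS` and `SAFE_URL_PREFIXES` are illustrative assumptions, and the whitelist contents are only an example; adjust them to what your editor is configured to produce. Input that is not well-formed XML is rejected outright instead of being repaired. That includes HTML-only entities such as `&nbsp;`, which the XML parser does not recognise.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Example whitelist (an assumption, not a recommendation for every site).
ALLOWED_TAGS = {"p", "br", "strong", "em", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:", "/")


def _attr_is_safe(name, value):
    # URLs are whitelisted by scheme, so javascript: and friends are dropped.
    if name == "href":
        return value.strip().lower().startswith(SAFE_URL_PREFIXES)
    return True


def _copy_allowed(src, dst):
    """Copy known-good children of src into dst, building a fresh tree."""
    dst.text = src.text
    last = None
    for child in src:
        if child.tag in ALLOWED_TAGS:
            new = ET.SubElement(dst, child.tag)
            for name, value in child.attrib.items():
                if name in ALLOWED_ATTRS.get(child.tag, ()) and _attr_is_safe(name, value):
                    new.set(name, value)
            _copy_allowed(child, new)
            new.tail = child.tail
            last = new
        elif child.tail:
            # Unknown element: drop it and its contents, keep the text after it.
            if last is None:
                dst.text = (dst.text or "") + child.tail
            else:
                last.tail = (last.tail or "") + child.tail


def sanitize_xhtml(fragment):
    """Return a whitelisted copy of an XHTML fragment, or "" if it won't parse."""
    try:
        root = ET.fromstring("<div>%s</div>" % fragment)
    except ET.ParseError:
        return ""
    clean = ET.Element("div")
    _copy_allowed(root, clean)
    # Serialise the children only, so the wrapper <div> is not emitted.
    parts = [escape(clean.text or "")]
    parts.extend(ET.tostring(child, encoding="unicode") for child in clean)
    return "".join(parts)
```

With this whitelist, `sanitize_xhtml('<p onclick="evil()">Hi <strong>there</strong><script>alert(1)</script>!</p>')` returns `<p>Hi <strong>there</strong>!</p>`. To use it as the filter the question asks for, one option would be to register a thin wrapper with Django's `template.Library().filter` and pass the result through `mark_safe`, on the assumption that the sanitiser is the only thing producing that string.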

Filtering HTML is a large and complicated topic, which is why many people prefer the text-with-restrictive-markup languages.
