How do I escape closing '/' in HTML tags in JSON with Python?

Question

Note: This question is very close to Embedding JSON objects in script tags, but the responses to that question provide what I already know (that in JSON, / == \/). I want to know how to do that escaping.

The HTML spec prohibits closing HTML tags anywhere within a <script> element. So, this causes parse errors:

<script>
var assets = [{
  "asset_created": null, 
  "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", 
  "body": "<script></script>"
}];
</script>

In my case, I'm generating the invalid situation by rendering a JSON string inside a Django template, i.e.:

<script>
var assets = {{ json_string }};
</script>

I know that JSON parses \/ the same as /, so if I can just escape my closing HTML tags in the JSON string, I'll be good. But, I'm not sure of the best way to do this.

My naive approach would just be this:

json_string = '[{"asset_created": null, "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "<script></script>"}]'
escaped_json_string = json_string.replace('</', r'<\/')
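
A quick sanity check of this approach, using only the standard library json module and a shortened sample string: the escaped text should no longer contain `</`, yet it should parse to the same data, since JSON treats `\/` as `/`. This is a sketch for illustration, not a statement that the replacement covers every case.

```python
import json

# Shortened sample for illustration; any JSON text containing "</" works.
json_string = '[{"body": "<script></script>"}]'

# The naive replacement from above: "</" becomes "<\/".
escaped_json_string = json_string.replace('</', r'<\/')

# No closing-tag sequence remains in the text sent to the browser...
assert '</' not in escaped_json_string

# ...and a JSON parser still recovers the original data.
assert json.loads(escaped_json_string) == json.loads(json_string)
```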

Is there a better way? Or any gotchas that I'm overlooking?

Recommended answer

Updated answer

Okay, I assumed a few things incorrectly. For escaping the JSON, the simplejson library provides an encoder class, JSONEncoderForHTML, that can be used. You may need to install simplejson via pip or easy_install if the import fails. Then you can do something like this:

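import simplejson
# Decode the already-encoded JSON string back into Python objects
assets_json = simplejson.loads(json_string)
# Re-encode with the HTML-safe encoder
encoded = simplejson.encoder.JSONEncoderForHTML().encode(assets_json)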

where encoded will give you:

'{"asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "\\u003cscript\\u003e\\u003c/script\\u003e", "asset_created": null}'

This is a more general solution than the slash replacement, as it handles other encoding caveats as well.

The loads step is only needed because the JSON has already been encoded. It can be avoided by generating the JSON with simplejson instead of Django, if possible:

simplejson.dumps(your_object_to_encode, cls=simplejson.encoder.JSONEncoderForHTML)
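
If installing simplejson is not an option, roughly the same effect can be sketched with the standard library. The helper below, `dumps_for_html`, is a hypothetical name and not part of json or simplejson. It assumes that `<`, `>` and `&` can only occur inside string literals in JSON output, so replacing them with `\uXXXX` escapes after serialization leaves the decoded data unchanged.

```python
import json

# Characters that matter to an HTML parser, mapped to JSON unicode escapes.
_HTML_UNSAFE = {
    "<": "\\u003c",
    ">": "\\u003e",
    "&": "\\u0026",
}


def dumps_for_html(obj):
    """Hypothetical helper: serialize obj to JSON, then escape <, > and &."""
    text = json.dumps(obj)
    for char, escape in _HTML_UNSAFE.items():
        text = text.replace(char, escape)
    return text


assets = [{
    "asset_created": None,
    "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
    "body": "<script></script>",
}]
encoded = dumps_for_html(assets)

# The output contains no raw angle brackets and still decodes to the same data.
assert "<" not in encoded and ">" not in encoded
assert json.loads(encoded) == assets
```

Because the escapes are ordinary JSON string escapes, the browser's JavaScript parser reads them back as the original characters.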

Old answer

Try wrapping the script in CDATA:

<script>
//<![CDATA[
var assets = [{
  "asset_created": null, 
  "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", 
  "body": "<script></script>"
}];
//]]>
</script>

The CDATA section is meant to tell the parser not to interpret this sort of content as markup. Otherwise you'll need to use the character escapes that have been mentioned.
