如何在HTML脚本标签中插入任意JSON [英] How to insert arbitrary JSON in HTML's script tag

查看:225
本文介绍了如何在HTML脚本标签中插入任意JSON的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想将JSON的内容存储在脚本标记内的HTML文档源中.

I would like to store a JSON's contents in a HTML document's source, inside a script tag.

JSON的内容确实取决于用户提交的输入,因此需要非常小心地为XSS清理该字符串.

The content of that JSON does depend on user submitted input, thus great care is needed to sanitise that string for XSS.

我在这里已经阅读了两个概念.

I've read two concept here on SO.

1.将所有出现的</script 标记替换为< \/script ,或将所有</替换为<\/服务器端.

1. Replace all occurrences of the </script tag into <\/script, or replace all </ into <\/ server side.

明智的代码如下所示(示例使用Python和jinja2):

Code wise it looks like the following (using Python and jinja2 for the example):

// view
data = {
    'test': 'asdas</script><b>as\'da</b><b>as"da</b>',
}

context_dict = {
    'data_json': json.dumps(data, ensure_ascii=False).replace('</script', r'<\/script'),
}

// template
<script>
    var data_json = {{ data_json | safe }};
</script>

// js
access it simply as window.data_json object

2.将数据编码为HTML实体编码的JSON字符串,然后在客户端进行unescape +解析.从这个答案中逃脱出来: https://stackoverflow.com/a/34064434/518169

2. Encode the data as a HTML entity encoded JSON string, and unescape + parse it in client side. Unescape is from this answer: https://stackoverflow.com/a/34064434/518169

// view
context_dict = {
    'data_json': json.dumps(data, ensure_ascii=False),
}

// template
<script>
    var data_json = '{{ data_json }}'; // encoded into HTML entities, like &lt; &gt; &amp;
</script>

// js
function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

var decoded = htmlDecode(window.data_json);
var data_json = JSON.parse(decoded);

此方法无效,因为脚本源中的 \"在JS变量中变为了" .而且,它会创建一个更大的HTML文档,而且也不是真正的人类可读文件,因此,如果这并不意味着巨大的安全风险,我将选择第一个.

This method doesn't work because \" in a script source becames " in a JS variable. Also, it creates a much bigger HTML document and also is not really human readable, so I'd go with the first one if it doesn't mean a huge security risk.

使用第一个版本是否存在安全风险?

Is there any security risk in using the first version? Is it enough to sanitise a JSON encoded string with .replace('</script', r'<\/script')?

关于SO的参考:
将JSON存储在HTML属性中的最佳方法?
为什么要拆分< script>使用document.write()编写标签时会使用该标签吗?
JavaScript字符串中的脚本标记
消毒< script>元素内容
在脚本标记内容中转义</

Reference on SO:
Best way to store JSON in an HTML attribute?
Why split the <script> tag when writing it with document.write()?
Script tag in JavaScript string
Sanitize <script> element contents
Escape </ in script tag contents

有关此问题的一些出色的外部资源:
Flask的 tojson 过滤器的实现
Rail的 json_escape 方法的帮助
在Django 门票

Some great external resources about this issue:
Flask's tojson filter's implementation source
Rail's json_escape method's help and source
A 5 year long discussion in Django ticket and proposed code

推荐答案

在这里,我将处理此问题中相对较小的部分,即在脚本元素中存储JSON的编码问题.简短的答案是,您必须转义< /,因为它们一起终止了脚本元素-即使在JSON字符串文字中也是如此.您无法HTML编码实体以获取脚本元素.您可以JavaScript反斜杠转义斜杠.我更喜欢用JavaScript十六进制转义小于小于的括号作为 \ u003C .

Here's how I dealt with the relatively minor part of this issue, the encoding problem with storing JSON in a script element. The short answer is you have to escape either < or / as together they terminate the script element -- even inside a JSON string literal. You can't HTML-encode entities for a script element. You could JavaScript-backslash-escape the slash. I preferred to JavaScript-hex-escape the less-than angle-bracket as \u003C.

我在尝试从嵌入结果传递json时遇到了这个问题.其中一些包含脚本关闭标记(未提及Twitter (按名称).

I ran into this problem trying to pass the json from oembed results. Some of them contain script close tags (without mentioning Twitter by name).

json_for_script = json.dumps(data).replace('<', r'\u003C');

这将使 data = {'test':'foo</script>bar'}; 放入

'{"test": "foo \\u003C/script> bar"}'

这是不会终止脚本元素的有效JSON.

which is valid JSON that won't terminate a script element.

我从里面的小宝石中得到了这个主意. Jinja 模板引擎.使用 {{data | tojson}}

I got the idea from this little gem inside the Jinja template engine. It's what's run when you use the {{data|tojson}} filter.

def htmlsafe_json_dumps(obj, dumper=None, **kwargs):
    """Works exactly like :func:`dumps` but is safe for use in ``<script>``
    tags.  It accepts the same arguments and returns a JSON string.  Note that
    this is available in templates through the ``|tojson`` filter which will
    also mark the result as safe.  Due to how this function escapes certain
    characters this is safe even if used outside of ``<script>`` tags.
    The following characters are escaped in strings:
    -   ``<``
    -   ``>``
    -   ``&``
    -   ``'``
    This makes it safe to embed such strings in any place in HTML with the
    notable exception of double quoted attributes.  In that case single
    quote your attributes or HTML escape it in addition.
    """
    if dumper is None:
        dumper = json.dumps
    rv = dumper(obj, **kwargs) \
        .replace(u'<', u'\\u003c') \
        .replace(u'>', u'\\u003e') \
        .replace(u'&', u'\\u0026') \
        .replace(u"'", u'\\u0027')
    return Markup(rv)

(您可以使用 \ x3C 而不是 \ xu003C ,由于它是有效的JavaScript,因此可以在脚本元素中使用.但是也可以坚持使用有效 JSON.)

(You could use \x3C instead of \xu003C and that would work in a script element because it's valid JavaScript. But might as well stick to valid JSON.)

这篇关于如何在HTML脚本标签中插入任意JSON的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆