Why does to_json escape unicode automatically in Rails 4?
Question
Rails 3:
{"a" => "<br/>"}.to_json
=> "{\"a\":\"<br/>\"}"
Rails 4:
{"a" => "<br/>"}.to_json
=> "{\"a\":\"\\u003Cbr/\\u003E\"}"
WHY???
It appears to be causing the error
Encoding::UndefinedConversionError: "\xC3" from ASCII-8BIT to UTF-8
when my Rails 3 app tries to parse JSON generated by my Rails 4 app.
Answer
WHY???
To defend against a common weakness in web applications. Suppose you write this in an HTML page:
<script type="text/javascript">
var something = <%= @something.to_json.html_safe %>;
</script>
then you might think you're fine because you've JSON-escaped the data you're injecting into JavaScript. But actually you're not safe: aside from JSON syntax you also have the surrounding HTML syntax, and in an HTML script block </ is in-band signalling. Practically, if @something contains the string </script>, you've got a cross-site scripting vulnerability, as this comes out:
<script type="text/javascript">
var something = {"attack": "abc</script><script>alert('XSS');//"};
</script>
The first script block ends halfway through the string (leaving an unclosed string literal syntax error), and the second <script> is treated as a new script block, so the potentially user-submitted content within it is executed.
Escaping the < character to \u003C is not required by JSON, but it is a perfectly valid alternative and it automatically avoids this class of problems. If a JSON parser rejects it, that is a severe bug in the reader.
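As a rough illustration of the point above, here is a sketch using only Ruby's standard json library. The helper script_safe_json and the constant HTML_UNSAFE are hypothetical names made up for this example; they imitate the kind of escaping Rails 4 applies and are not Rails' actual implementation.

```ruby
require "json"

# Hypothetical mapping for this sketch: characters that are significant to the
# HTML parser, paired with their JSON \uXXXX escapes. Single quotes keep the
# backslash literal, so each value is six characters long.
HTML_UNSAFE = { "<" => '\u003C', ">" => '\u003E', "&" => '\u0026' }.freeze

# Hypothetical helper: serialise to JSON, then escape HTML-significant characters.
def script_safe_json(value)
  JSON.generate(value).gsub(/[<>&]/, HTML_UNSAFE)
end

payload = { "attack" => "abc</script><script>alert('XSS');//" }

unsafe = JSON.generate(payload)    # still contains a literal </script>
safe   = script_safe_json(payload) # contains \u003C/script\u003E instead

puts unsafe
puts safe

# Both forms are valid JSON and decode to exactly the same data.
puts JSON.parse(safe) == JSON.parse(unsafe)
```

Because the escaped and unescaped forms parse to the same hash, a conforming parser on the receiving side should not care which one it is given.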
What is the code that is producing that error? I'm not convinced the error has anything to do with the < escaping, as it is talking about byte 0xC3 rather than 0x3C. That could be indicative of a string with UTF-8 encoded content not having been marked as UTF-8... maybe you need a force_encoding("UTF-8") on the input?
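A minimal sketch of that suspicion in plain Ruby follows. It assumes the JSON arrives as raw bytes tagged ASCII-8BIT (for example, an HTTP response body) and is then converted to UTF-8 somewhere along the way. The sample string is invented for illustration, and the asker's actual code path is unknown.

```ruby
# Bytes of a UTF-8 string containing an accented e (0xC3 0xA9), but tagged as
# binary (ASCII-8BIT). This is an assumed stand-in for a raw response body.
raw = "caf\xC3\xA9".b

error = nil
begin
  # Transcoding binary data to UTF-8 fails on the first non-ASCII byte.
  raw.encode("UTF-8")
rescue Encoding::UndefinedConversionError => e
  error = e
end
puts error.class

# force_encoding does not change any bytes; it only relabels the string.
fixed = raw.dup.force_encoding("UTF-8")
puts fixed.encoding
puts fixed.valid_encoding?
```

Note that force_encoding is only appropriate when the bytes really are UTF-8; it relabels the string rather than converting it.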