Escaping HTML entities in JavaScript string literals within the <script> block
Question
On the one hand, if I have
<script>
var s = 'Hello </script>';
console.log(s);
</script>
the browser will terminate the <script> block early, and the rest of the page breaks.
On the other hand, the value of the string may come from a user (say, via a previously submitted form, and now the string ends up being inserted into a <script> block as a literal), so you can expect anything in that string, including maliciously formed tags. Now, if I escape the string literal with htmlentities() when generating the page, the value of s will contain the escaped entities literally, i.e. s will output
Hello &lt;/script&gt;
which is not the desired behavior in this case.
One way of properly escaping JS strings within a <script> block is escaping the slash if it follows the left angle bracket, or just always escaping the slash, i.e.
var s = 'Hello <\/script>';
which seems to work fine.
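As a quick check of why this works: in a JavaScript string literal, `\/` is just an escaped `/`, so the runtime value is unchanged, while the HTML parser no longer sees the sequence `</script`. The sketch below is meant to run on the server side (for example under Node.js); `escapeScriptClose` is a hypothetical helper name, and it only addresses the `</` problem, not quotes or backslashes.

```javascript
// Hypothetical helper: rewrite every "</" as "<\/" so the HTML parser
// never finds a closing tag inside the string literal.
function escapeScriptClose(literalBody) {
  return literalBody.replace(/<\//g, '<\\/');
}

// The source text of the literal changes...
const escapedSource = escapeScriptClose('Hello </script>');

// ...but the value the JavaScript engine sees is identical.
const sameValue = 'Hello <\/script>' === 'Hello </script>';
```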
Then comes the question of JS code within HTML event handlers, which can be easily broken too, e.g.
<div onClick="alert('Hello ">')"></div>
looks valid at first but breaks in most (or all?) browsers. This obviously requires full HTML entity encoding.
My question is: what is the best/standard practice for properly covering all the situations above (i.e. JS within a script block, JS within event handlers) if your JS code can partly be generated on the server side and can potentially contain malicious data?
Recommended answer
The following characters could interfere with an HTML or JavaScript parser and should be escaped in string literals: <, >, ", ', \, and &.
In a script block, using the escape character works, as you found out. The concatenation method ('</scr' + 'ipt>') can be hard to read.
var s = 'Hello <\/script>';
For inline JavaScript in HTML, you can use entities:
<div onClick="alert('Hello &quot;>')">click me</div>
Demo: http://jsfiddle.net/ThinkingStiff/67RZH/
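When the handler is generated on the server, the entity encoding can be applied programmatically. The following is a minimal sketch, assuming a server-side JavaScript environment; `escapeHtmlAttr` is a hypothetical helper, not a library function, and the set of characters it encodes is one reasonable choice rather than a definitive list.

```javascript
// Hypothetical helper: encode characters that are significant inside
// an HTML attribute value.
function escapeHtmlAttr(text) {
  return text
    .replace(/&/g, '&amp;')   // first, so the entities added below are not re-encoded
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// The JavaScript code that should run on click.
const handler = "alert('Hello \">')";

// Encode it for the HTML attribute context, then build the markup.
const attr = escapeHtmlAttr(handler);
const html = '<div onClick="' + attr + '">click me</div>';
```

The browser decodes the entities before handing the attribute value to the JavaScript engine, so the handler runs as originally written. If user data is placed inside the string literal, it still needs JavaScript string escaping before this HTML encoding step.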
The method that works in both <script> blocks and inline JavaScript is \uxxxx, where xxxx is the hexadecimal character code.
< - \u003c
> - \u003e
" - \u0022
' - \u0027
\ - \u005c
& - \u0026
Demo: http://jsfiddle.net/ThinkingStiff/Vz8n7/
HTML:
<div onClick="alert('Hello \u0022>')">click me</div>
<script>
var s = 'Hello \u003c/script\u003e';
alert( s );
</script>
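The \uxxxx substitution can be automated when the string comes from the server. Below is a minimal sketch, assuming server-side JavaScript; `escapeForJsString` is a hypothetical name, and it covers only the six characters discussed in this answer. It does not handle newlines or other control characters, which a real implementation would also need to escape.

```javascript
// Hypothetical helper: replace <, >, ", ', \ and & with \uXXXX escapes
// so the result is safe in both <script> blocks and inline handlers.
function escapeForJsString(text) {
  return text.replace(/[<>"'\\&]/g, function (ch) {
    // charCodeAt returns the UTF-16 code unit; pad its hex form to four digits.
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

// Example: untrusted input that would otherwise break out of the script block.
const userInput = 'Hello </script> "quoted" & \\ done';

// Source text of a single-quoted JavaScript string literal holding that input.
const literal = "'" + escapeForJsString(userInput) + "'";
```

Because the escapes are resolved by the JavaScript engine rather than the HTML parser, the literal evaluates back to the original input.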