Which characters need to be escaped in HTML?
Question
Are they the same as XML, perhaps plus the space one (&nbsp;)?
I've found some huge lists of HTML escape characters, but I don't think all of them must be escaped. I want to know which ones actually need to be escaped.
Answer
If your document is Unicode, you only need to escape the same characters as for XML in your text [spec] [doc]:
& becomes &amp;
< becomes &lt;
> becomes &gt;
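The three replacements above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the function name `escape_text` and the sample string are made up for the example. Python's standard `html.escape` with `quote=False` performs the same three substitutions.

```python
import html


def escape_text(s: str) -> str:
    """Escape the three characters that are special in HTML text content.

    Hypothetical helper for illustration only.
    """
    # "&" must be replaced first; otherwise the ampersands introduced by
    # "&lt;" and "&gt;" would themselves be escaped a second time.
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


sample = "if a < b && b > c"
print(escape_text(sample))  # if a &lt; b &amp;&amp; b &gt; c

# The standard library gives the same result when quote escaping is off.
assert escape_text(sample) == html.escape(sample, quote=False)
```

The order of the replacements is the one detail that is easy to get wrong: handling `&` last would turn `&lt;` into `&amp;lt;`.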
In attribute values you must also escape the quote character [spec]:
" becomes &quot;
' becomes &#39;
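For attribute values, a minimal sketch in Python might look like the following. The helper names `escape_attribute` and `render_input` are hypothetical and exist only for this example. Note that Python's own `html.escape` (with its default `quote=True`) writes the apostrophe as `&#x27;`, which is the hexadecimal form of the same character reference as `&#39;`.

```python
import html


def escape_attribute(value: str) -> str:
    """Escape a string for use inside a quoted HTML attribute value.

    Hypothetical helper for illustration only.
    """
    # Same three replacements as for text content, "&" first.
    escaped = value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    # Additionally escape both quote characters, so the value is safe
    # whichever quote character delimits the attribute.
    return escaped.replace('"', "&quot;").replace("'", "&#39;")


def render_input(value: str) -> str:
    """Build an <input> tag with an escaped, double-quoted value (illustrative)."""
    return f'<input type="text" value="{escape_attribute(value)}">'


# Without escaping, this value would close the attribute and add a new one.
print(render_input('" onfocus="alert(1)'))
# <input type="text" value="&quot; onfocus=&quot;alert(1)">

# Unescaping restores the original string.
assert html.unescape(escape_attribute("Tom & \"Jerry\"'s")) == "Tom & \"Jerry\"'s"
```

The sketch assumes the attribute is always written with quotes around it; unquoted attribute values need more care and are not covered here.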
If your document uses ASCII or another non-Unicode encoding and contains characters that the encoding cannot represent, you'll need to escape those characters as well. Otherwise, you're fine (see note 1).
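As an illustration of the non-Unicode case, Python's built-in `xmlcharrefreplace` error handler replaces every character the target encoding cannot represent with a decimal numeric character reference. The function name `encode_for_charset` and the sample text are assumptions made for this sketch.

```python
def encode_for_charset(text: str, encoding: str = "ascii") -> bytes:
    """Encode already-escaped HTML text for a limited character set.

    Hypothetical helper for illustration only. Characters that the
    encoding cannot represent become numeric character references.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# Neither "é" (U+00E9) nor "€" (U+20AC) exists in ASCII.
print(encode_for_charset("café €"))             # b'caf&#233; &#8364;'

# Latin-1 can represent "é" directly, so only "€" is replaced.
print(encode_for_charset("café €", "latin-1"))  # b'caf\xe9 &#8364;'
```

This step belongs after the `&`, `<`, `>` escaping, not before it; otherwise the ampersands of the generated references would be escaped again.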
You usually do not want to escape spaces as &nbsp;. &nbsp; is not a normal space; it's a non-breaking space [wiki]. You can use these instead of normal spaces to prevent a line break from being inserted between two words, or to insert extra space without it being automatically collapsed, but you won't need to do this very often.
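For the occasional case where a non-breaking space is wanted, a small Python sketch could look like this. The helper name `keep_together` is hypothetical; it relies on the standard `html.escape` and `html.unescape` functions.

```python
import html


def keep_together(phrase: str) -> str:
    """Join the words of a short phrase with non-breaking spaces.

    Hypothetical helper for illustration only.
    """
    # Escape the markup characters first, then swap the spaces; doing it in
    # the other order would turn "&nbsp;" into "&amp;nbsp;".
    escaped = html.escape(phrase, quote=False)
    return escaped.replace(" ", "&nbsp;")


print(keep_together("100 km"))    # 100&nbsp;km
print(keep_together("R&D team"))  # R&amp;D&nbsp;team

# A browser (or html.unescape) decodes &nbsp; to U+00A0, not to U+0020.
assert html.unescape(keep_together("100 km")) == "100\u00a0km"
```

Applying this to whole paragraphs would stop the browser from wrapping lines at all, which is why ordinary spaces should be left alone.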
1 You're fine as long as you're inserting the escaped text somewhere that it makes sense to insert ordinary text (i.e. not inside a <style> or <script> tag, and not inside an attribute value). Otherwise you must take other precautions, as mentioned in daxelrod's answer and described by the Open Web Application Security Project.