Which characters need to be escaped in HTML?

Question

Are they the same as XML, perhaps plus the space one (&nbsp;)?

I've found some huge lists of HTML escape characters, but I don't think all of them must be escaped. I want to know what needs to be escaped.

Recommended answer

If you're inserting text content in your document in a location where text content is expected [1], you typically only need to escape the same characters as you would in XML. Inside of an element, this just includes the entity escape ampersand & and the element delimiter less-than and greater-than signs < >:

& becomes &amp;
< becomes &lt;
> becomes &gt;

Inside of attribute values you must also escape the quote character you're using:

" becomes &quot;
' becomes &#39;

In some cases it may be safe to skip escaping some of these characters, but I encourage you to escape all five in all cases to reduce the chance of making a mistake.
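
As a rough illustration of escaping all five characters, here is a minimal Python sketch. The function name `escape_html` and the sample input are made up for this example; they are not part of the original answer. Python's standard `html.escape` does the same job, except that it emits `&#x27;` rather than `&#39;` for the apostrophe.

```python
def escape_html(text: str) -> str:
    """Escape the five characters listed above (hypothetical helper).

    Intended only for element content and quoted attribute values,
    not for script/style blocks or element/attribute names.
    """
    return (
        text.replace("&", "&amp;")   # must run first, or later entities get double-escaped
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', "&quot;")
            .replace("'", "&#39;")
    )


user_input = 'Tom & "Jerry" <3'
print(f"<p>{escape_html(user_input)}</p>")
# prints: <p>Tom &amp; &quot;Jerry&quot; &lt;3</p>
```

Replacing the ampersand first matters: if it ran last, the `&` in an already-produced `&lt;` would itself be turned into `&amp;lt;`.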

If your document encoding does not support all of the characters that you're using, such as if you're trying to use emoji in an ASCII-encoded document, you also need to escape those. Most documents these days are encoded using the fully Unicode-supporting UTF-8 encoding where this won't be necessary.
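
For the ASCII case, one possible approach in Python is the built-in `xmlcharrefreplace` error handler, which turns every character the target encoding cannot represent into a numeric character reference. The helper name `to_ascii_html` is an assumption for this sketch. Because it introduces new `&` characters, it should run after the five-character escaping, not before.

```python
def to_ascii_html(text: str) -> str:
    """Replace non-ASCII characters with numeric character references (hypothetical helper)."""
    # Characters that ASCII cannot encode become &#NNN; references.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_ascii_html("caf\u00e9 \U0001F600"))
# prints: caf&#233; &#128512;
```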

In general, you should not escape spaces as &nbsp;. &nbsp; is not a normal space, it's a non-breaking space. You can use these instead of normal spaces to prevent a line break from being inserted between two words, or to insert          extra        space       without it being automatically collapsed, but this is usually a rare case. Don't do this unless you have a design constraint that requires it.
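
To make the distinction concrete, this small Python check (an illustrative sketch, not from the original answer) shows that `&nbsp;` decodes to U+00A0, a different character from the ordinary space U+0020:

```python
import html

# Decode the entity to see which character it actually stands for.
nbsp = html.unescape("&nbsp;")

print(nbsp == "\u00a0")  # True: NO-BREAK SPACE
print(nbsp == " ")       # False: not the ordinary space U+0020

# A deliberate use: keep a number and its unit on the same line.
label = "10&nbsp;km"
```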

[1] By "a location where text content is expected", I mean inside of an element or quoted attribute value where normal parsing rules apply. For example: <p>HERE</p> or <p title="HERE">...</p>. What I wrote above does not apply to content that has special parsing rules or meaning, such as inside of a script or style tag, or as an element or attribute name. For example: <NOT-HERE>...</NOT-HERE>, <script>NOT-HERE</script>, <style>NOT-HERE</style>, or <p NOT-HERE="...">...</p>.

In these contexts, the rules are more complicated and it's much easier to introduce a security vulnerability. I strongly discourage you from ever inserting dynamic content in any of these locations. I have seen teams of competent security-aware developers introduce vulnerabilities by assuming that they had encoded these values correctly, but missing an edge case. There's usually a safer alternative, such as putting the dynamic value in an attribute and then handling it with JavaScript.
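
The attribute-based alternative might look like the following Python sketch. The function `render_greeting`, the element id `greeting`, and the attribute `data-user-name` are all hypothetical names chosen for illustration. The dynamic value is only ever placed in a quoted attribute, where the five-character escaping applies, and the script itself is entirely static.

```python
import html


def render_greeting(user_name: str) -> str:
    """Render a fragment that passes a dynamic value to JavaScript via a data attribute."""
    # Escape for a double-quoted attribute value; quote=True also escapes " and '.
    safe = html.escape(user_name, quote=True)
    return (
        f'<div id="greeting" data-user-name="{safe}"></div>\n'
        "<script>\n"
        # The script contains no dynamic content; it reads the value from the DOM.
        '  const el = document.getElementById("greeting");\n'
        '  el.textContent = "Hello, " + el.dataset.userName;\n'
        "</script>"
    )


print(render_greeting('"><script>alert(1)</script>'))
```

Assigning through `textContent` rather than `innerHTML` keeps the value from being re-parsed as markup on the client side.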

If you must, please read the Open Web Application Security Project's XSS Prevention Rules to help understand some of the concerns you will need to keep in mind.
