Why does URL encoding exist for the ASCII character set?
Question
W3Schools clearly states:
URLs can only be sent over the Internet using the ASCII character-set.
Why does URL encoding exist for ASCII characters like a, b, c when they can be sent over the Internet without any URL encoding?
E.g.: why encode 'a' when it can be sent as 'a'?
What are the possible reasons to encode ASCII characters? The only reason I can think of is hackers trying to make their URLs as unreadable as possible in order to carry out XSS attacks.
Recommended answer
STD 66 (RFC 3986), Percent-Encoding:
A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.
So percent-encoding is a kind of escape mechanism: some characters have a special meaning in URI components (→ they are reserved). If you want to use such a character without its special meaning, you percent-encode it.
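As a minimal sketch of this escaping behaviour, the snippet below uses Python's standard urllib.parse module. The query key q and the value a&b=c are made up for illustration. Left unencoded, the reserved characters & and = are read as delimiters; percent-encoded, they are read as plain data.

```python
from urllib.parse import quote, parse_qs

# Hypothetical value that happens to contain the reserved characters "&" and "=".
value = "a&b=c"

# Unescaped: "&" and "=" keep their special meaning as query delimiters.
raw_query = "q=" + value
raw_parsed = parse_qs(raw_query)
print(raw_parsed)      # {'q': ['a'], 'b': ['c']}

# Escaped: the same characters are now just data inside the value of q.
escaped_query = "q=" + quote(value, safe="")
print(escaped_query)   # q=a%26b%3Dc
escaped_parsed = parse_qs(escaped_query)
print(escaped_parsed)  # {'q': ['a&b=c']}
```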
Unreserved characters like a, b, c, … can always be used directly, but it is also allowed to percent-encode them. Such URIs would be equivalent:
URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.
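To illustrate this equivalence, here is a rough sketch in Python that decodes only those percent-encoded octets which correspond to unreserved characters (letters, digits, "-", ".", "_", "~") and leaves everything else escaped. The helper name decode_unreserved and the example.com URIs are assumptions for illustration, and this is not a full URI normalizer.

```python
import re
import string

# Unreserved set per STD 66: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def decode_unreserved(uri: str) -> str:
    """Decode percent-encoded unreserved characters; keep other escapes."""
    def repl(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Other escapes stay encoded; only their hex digits are upper-cased.
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)

# Hypothetical URIs that differ only in how unreserved characters are written.
uri_a = "http://example.com/%61bc?x=%7Euser%2Fdocs"
uri_b = "http://example.com/abc?x=~user%2Fdocs"

print(decode_unreserved(uri_a))  # http://example.com/abc?x=~user%2Fdocs
print(decode_unreserved(uri_a) == decode_unreserved(uri_b))  # True
```

Note that %2F stays encoded: "/" is reserved, so decoding it could change how the URI is interpreted.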
Why is it allowed to percent-encode unreserved characters in the first place? The obsolete RFC 2396 contains (bold by me):
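Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.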
I can’t think of an example for such a "context", but this sentence suggests that there may be some.
Also, maybe some people/implementations like to simply percent-encode everything (except for delimiters etc.), so they don’t have to check if/which characters would need percent-encoding in the corresponding component.
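A hypothetical "encode everything" implementation of this kind could look like the sketch below. The function name encode_everything is an assumption for illustration, not something taken from a real library. It escapes every byte of a component value without checking which characters actually need it, and a standard decoder still recovers the original text.

```python
from urllib.parse import unquote

def encode_everything(text: str) -> str:
    """Percent-encode every UTF-8 byte of a component value, needed or not."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

encoded = encode_everything("abc")
print(encoded)           # %61%62%63
print(unquote(encoded))  # abc
```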