What do % signs mean in a URL?


Question

When I copy and paste the URL of this Wikipedia article, it looks like this:

http://en.wikipedia.org/wiki/Gruy%C3%A8re_%28cheese%29

However, if you paste this back into the address bar, the percent signs disappear, and what appear to be Unicode characters (and maybe special URL characters) take the place of the percent signs.

Are these abbreviations for Unicode and special URL characters?

I'm used to seeing \u00ff, etc. in JavaScript.

Answer

The reference you're looking for is RFC 3987: Internationalized Resource Identifiers, specifically the section on mapping IRIs to URIs.

RFC 3986: Uniform Resource Identifiers specifies that reserved characters must be percent-encoded when they are used as data rather than as delimiters, but the characters it permits in a URI are all drawn from US-ASCII, which does not include characters such as è.

RFC 3987 specifies that non-ASCII characters should first be encoded as UTF-8 so they can be percent-encoded as per RFC 3986. If you'll permit me to illustrate in Python:

>>> 'è'.encode('utf-8')
b'\xc3\xa8'

Here I've asked Python to encode the Unicode è to a string of bytes using UTF-8. The bytes returned are 0xc3 and 0xa8. Percent-encoded, this looks like %C3%A8.
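As a minimal sketch of that step (the helper name percent_encode_char is made up for illustration), each UTF-8 byte can be written as a percent sign followed by two hex digits. The standard library's urllib.parse.quote should give the same result for this character:

```python
from urllib.parse import quote


def percent_encode_char(ch: str) -> str:
    """Percent-encode every UTF-8 byte of ch (hypothetical helper)."""
    # Encode to UTF-8 first, then write each byte as %XX in uppercase hex.
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


manual = percent_encode_char("è")
library = quote("è")  # quote() encodes as UTF-8 by default

print(manual)   # %C3%A8
print(library)  # %C3%A8
```

Note that the helper encodes every byte unconditionally, whereas quote leaves unreserved ASCII characters such as letters and digits alone.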

The parentheses that also appear in your URL do fit in US-ASCII, so they are percent-escaped with their US-ASCII code points (%28 and %29), which are also valid UTF-8.
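To see both cases together, here is a small sketch that round-trips the URL from the question with Python's urllib.parse. It roughly mirrors what the browser does when it displays the pasted address, though a real browser applies its own display rules:

```python
from urllib.parse import quote, unquote

url = "http://en.wikipedia.org/wiki/Gruy%C3%A8re_%28cheese%29"

# Decoding: %C3%A8 becomes è (two UTF-8 bytes), %28 and %29 become ( and ).
decoded = unquote(url)
print(decoded)  # http://en.wikipedia.org/wiki/Gruyère_(cheese)

# Encoding the page title again reproduces the escaped path segment.
# safe="" asks quote() to escape everything except unreserved characters.
title = quote("Gruyère_(cheese)", safe="")
print(title)  # Gruy%C3%A8re_%28cheese%29

# The parentheses are plain ASCII, so their escapes are just their code points.
print(hex(ord("(")), hex(ord(")")))  # 0x28 0x29
```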

So, no, there is no simple 16×16 table; such a table could never represent the richness of Unicode. But there is a method to the apparent madness.
