Is encodeURIComponent really useful?
Question
Something I still don't understand about performing an HTTP GET request to the server is what the advantage is of using the JS function encodeURIComponent to encode each component of the request.
While testing, I saw that the server (using PHP) gets the values of the HTTP GET request properly even if I don't use encodeURIComponent! Obviously I still need to encode the special characters & ? = / : on the client, otherwise a value like "peace&love=virtue" would be treated as a new key-value pair of the request instead of a single value. But why does encodeURIComponent also encode many other characters, such as 'è', which is translated into %C3%A8 and must then be decoded on the PHP server using the utf8_decode function?
With encodeURIComponent, all values of the HTTP GET request are UTF-8 encoded, so when reading them in PHP I have to call utf8_decode on each $_GET value, which is quite annoying.
Why can't we just encode the & ? = / : characters?
See also: JS encodeURIComponent result different from the one created by FORM. It shows that encodeURIComponent does not even encode properly, because a simple browser FORM GET encodes characters like '€' in a different way. So I still wonder what encodeURIComponent is for.
Recommended answer
This is a character encoding issue (again). As Gaby stated, URIs are a sequence of ASCII characters (thus only bytes in the range 0–127). So any character that is not in ASCII needs to be encoded with percent-encoding.
And since UTF-8 is the new "universal character encoding", user agents nowadays interpret URIs as UTF-8 encoded. But these UTF-8 byte sequences are themselves encoded with percent-encoding, since URIs cannot contain any characters other than those in ASCII.
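The two layers described above (UTF-8 first, then percent-encoding of each byte) can be sketched as follows. This is only an illustration, not how any particular browser is implemented: percentEncodeUtf8 is a hypothetical helper name, and the sketch assumes a runtime where TextEncoder is available as a global (modern browsers and recent Node.js).

```javascript
// Hypothetical helper (not a built-in): percent-encode every UTF-8 byte of a string.
function percentEncodeUtf8(text) {
  // TextEncoder always produces UTF-8 bytes.
  const bytes = new TextEncoder().encode(text);
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

// "\u00E8" is the character è from the question.
const manual = percentEncodeUtf8("\u00E8");
const builtin = encodeURIComponent("\u00E8");

console.log(manual);  // %C3%A8
console.log(builtin); // %C3%A8
```

Unlike encodeURIComponent, the helper also encodes plain ASCII letters, so the two only agree for characters outside the unreserved ASCII set.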
That means that when you enter http://en.wikipedia.org/wiki/€ into your browser's address field, your browser looks up the UTF-8 encoding of € (0xE282AC) and applies percent-encoding to it (%E2%82%AC). So http://en.wikipedia.org/wiki/€ actually results in http://en.wikipedia.org/wiki/%E2%82%AC.
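As a rough stand-in for what happens to a typed address, the standard URL class can be used to normalize the same string. This is a sketch under the assumption that the URL global is available (browsers, recent Node.js); an actual address bar may apply additional rules of its own.

```javascript
// "\u20AC" is the euro sign, so this is the address from the text above.
const typed = "http://en.wikipedia.org/wiki/\u20AC";

// Parsing with URL percent-encodes the non-ASCII path character as UTF-8 bytes.
const normalized = new URL(typed).href;

console.log(normalized); // http://en.wikipedia.org/wiki/%E2%82%AC
```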
To see that this is true, just enter http://en.wikipedia.org/wiki/%E2%82%AC into your address field; your browser will probably turn it into http://en.wikipedia.org/wiki/€. That is because user agents nowadays interpret URIs as UTF-8 encoded.
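The reverse direction can be checked with decodeURIComponent, which interprets the percent-encoded bytes as UTF-8. The variable names here are only illustrative.

```javascript
const encodedPath = "/wiki/%E2%82%AC";

// The three escaped bytes E2 82 AC are decoded as one UTF-8 character.
const decodedPath = decodeURIComponent(encodedPath);

console.log(decodedPath); // /wiki/€
```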
Now back to your initial question of why you should apply percent-encoding explicitly: imagine you have a web page where you want to link to the Wikipedia article on the Euro sign. If you just write the URI with a plain €:
<a href="http://en.wikipedia.org/wiki/€">Euro sign</a>
Your browser will use the character encoding of the document for the € character. That means that if your document's encoding is Windows-1252 (as in your other question), the € will be encoded as 0x80 and the URI will be http://en.wikipedia.org/wiki/%80 (this actually works because Wikipedia is clever enough to guess Windows-1252, the most popular character encoding with a printable character at 0x80).
But if your document's encoding is ISO 8859-15, the € will be encoded as 0xA4, which represents the currency sign ¤ in ISO 8859-1 (Wikipedia will choose ISO 8859-1 because 0xA4 is an invalid byte sequence in UTF-8 and HTTP specifies ISO 8859-1 as the default character encoding).
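A small sketch of why the receiving side has to guess in these two cases: neither %80 nor %A4 is valid UTF-8 on its own, so a strict UTF-8 decoder such as decodeURIComponent rejects them. tryDecode is a hypothetical helper introduced only for this illustration.

```javascript
// Hypothetical helper: return the decoded string, or null if the
// percent-encoded bytes are not valid UTF-8.
function tryDecode(component) {
  try {
    return decodeURIComponent(component);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}

const results = {
  utf8: tryDecode("%E2%82%AC"), // € encoded as UTF-8
  windows1252: tryDecode("%80"), // € as encoded from a Windows-1252 document
  iso885915: tryDecode("%A4"),   // € as encoded from an ISO 8859-15 document
};

console.log(results);
```

Only the UTF-8 form decodes; for the other two the decoder returns null here, which is the point where a server would have to fall back to guessing a legacy encoding.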
So I recommend always using percent-encoding to avoid mistakes. Don't let user agents guess what you mean.
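One way to follow this recommendation in JavaScript is to build the link with encodeURIComponent, so the result is pure ASCII regardless of the document's encoding. This is a minimal sketch: wikiLink is a hypothetical helper, example.com is a placeholder host, and the label is inserted without HTML escaping, which real code would need to handle.

```javascript
// Hypothetical helper: build an anchor whose href contains only ASCII.
function wikiLink(title, label) {
  const href = "http://en.wikipedia.org/wiki/" + encodeURIComponent(title);
  return `<a href="${href}">${label}</a>`; // label is assumed to be safe HTML
}

const link = wikiLink("\u20AC", "Euro sign");
console.log(link);

// The same applies to query values such as the one from the question.
// Unencoded, the value is split into two parameters; encoded, it stays one.
const raw = new URL("http://example.com/search?q=peace&love=virtue");
const safe = new URL(
  "http://example.com/search?q=" + encodeURIComponent("peace&love=virtue")
);

console.log(raw.searchParams.get("q"));  // peace
console.log(safe.searchParams.get("q")); // peace&love=virtue
```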