urlencode() the 'asterisk' (star?) character
Question
I am testing PHP's urlencode() against Java's java.net.URLEncoder.encode().
Java
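// Requires: import java.io.UnsupportedEncodingException; import java.net.URLEncoder;
// Build a string containing every character from U+0020 to U+00FF.
StringBuilder sb = new StringBuilder();
for (int i = 32; i < 256; ++i) {
    sb.append((char) i);
}
String all = sb.toString();
System.out.println("All characters: -||" + all + "||-");
try {
    // Encode the whole string as UTF-8 form data.
    System.out.println("Encoded characters: -||" + URLEncoder.encode(all, "UTF-8") + "||-");
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}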
PHP
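// Build a string containing every byte from 0x20 to 0xFF.
$all = "";
for ($i = 32; $i < 256; ++$i) {
    $all .= chr($i);
}
echo $all . PHP_EOL;
// utf8_encode() converts ISO-8859-1 to UTF-8 (note: deprecated as of PHP 8.2).
echo urlencode(utf8_encode($all)) . PHP_EOL;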
All characters seem to be encoded in the same way with both functions, except for the 'asterisk' character that is not encoded by Java, and translated to %2A by PHP. Which behaviour is supposed to be the 'right' one, if any?
Note: I also tried rawurlencode(), with no luck.
Recommended answer
It is okay to have a * in a URL (but it is also okay to have it in its encoded form).
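As a quick check of that claim, here is a minimal Java sketch showing that the literal and the percent-encoded asterisk decode to the same string. The class name `AsteriskDecodeDemo` and the `decode` helper are made up for illustration, and the `Charset` overloads used here assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class AsteriskDecodeDemo {
    // Hypothetical helper: decodes a form-encoded string as UTF-8.
    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String literal = "a*b";   // the form Java's URLEncoder produces
        String escaped = "a%2Ab"; // the form PHP's urlencode() produces

        // Both forms decode to the same original string.
        System.out.println(decode(literal).equals(decode(escaped))); // prints "true"

        // URLEncoder itself leaves '*' untouched.
        System.out.println(URLEncoder.encode("a*b", StandardCharsets.UTF_8)); // prints "a*b"
    }
}
```

Any receiver that decodes properly should therefore treat the Java and PHP outputs as equivalent.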
RFC 1738, Uniform Resource Locators (URL), states the following:
Reserved:
[...]
Usually a URL has the same interpretation when an octet is represented by a character and when it encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
On the other hand, characters that are not required to be encoded (including alphanumerics) may be encoded within the scheme-specific part of a URL, as long as they are not being used for a reserved purpose.
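If byte-for-byte identical output on both sides matters (for example when comparing or signing encoded strings), one option is to post-process the Java result so the asterisk is escaped as well. This is only a sketch: `PhpCompatibleEncoder` and `encodeLikePhp` are hypothetical names, the `Charset` overload assumes Java 10 or later, and parity with PHP's urlencode() relies on the question's observation that the asterisk is the only character that differs.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PhpCompatibleEncoder {
    // Hypothetical helper: Java's form encoding with '*' additionally escaped.
    // Any '*' left in URLEncoder's output comes from a literal '*' in the input,
    // because percent escapes consist only of '%' and hex digits.
    static String encodeLikePhp(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("*", "%2A");
    }

    public static void main(String[] args) {
        String input = "a*b c";
        String encoded = encodeLikePhp(input);
        System.out.println(encoded); // prints "a%2Ab+c"

        // Decoding still round-trips to the original input.
        System.out.println(URLDecoder.decode(encoded, StandardCharsets.UTF_8).equals(input)); // prints "true"
    }
}
```

Going the other way (leaving the asterisk unencoded on the PHP side) is equally valid under the RFC text quoted above; which direction to normalize in is a matter of convenience.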