Java equivalent to JavaScript's encodeURIComponent that produces identical output?
Question
I've been experimenting with various bits of Java code trying to come up with something that will encode a string containing quotes, spaces and "exotic" Unicode characters and produce output that's identical to JavaScript's encodeURIComponent function.
My torture test string is: "A" B ± "
If I enter the following JavaScript statement in Firebug:
encodeURIComponent('"A" B ± "');
Then I get:
"%22A%22%20B%20%C2%B1%20%22"
Here's my little test Java program:
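import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodingTest
{
    public static void main(String[] args) throws UnsupportedEncodingException
    {
        // Inner double quotes must be escaped inside a Java string literal.
        String s = "\"A\" B ± \"";
        // URLEncoder performs application/x-www-form-urlencoded encoding.
        System.out.println("URLEncoder.encode returns "
            + URLEncoder.encode(s, "UTF-8"));
        // Reading the UTF-8 bytes back as ISO-8859-1 does not percent-encode anything.
        System.out.println("getBytes returns "
            + new String(s.getBytes("UTF-8"), "ISO-8859-1"));
    }
}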
This program outputs:
URLEncoder.encode returns %22A%22+B+%C2%B1+%22
getBytes returns "A" B Â± "
Close, but no cigar! What is the best way of encoding a UTF-8 string using Java so that it produces the same output as JavaScript's encodeURIComponent?
I'm using Java 1.4 and will be moving to Java 5 shortly.
Recommended answer
Looking at the implementation differences, I see that:
JavaScript's encodeURIComponent:
- literal characters (regex representation): [-a-zA-Z0-9._*~'()!]
Java's URLEncoder:
- literal characters (regex representation): [-a-zA-Z0-9._*]
- the space character " " is converted into a plus sign "+".
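The difference can be observed directly by feeding only the disputed characters to URLEncoder. The following is a small sketch for illustration; the class name UrlEncoderDifferences and the helper encode are hypothetical, and the unchecked-exception wrapping is just a convenience for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlEncoderDifferences {
    // Hypothetical convenience wrapper around URLEncoder.encode(s, "UTF-8").
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported");
        }
    }

    public static void main(String[] args) {
        // Characters both encoders leave untouched.
        System.out.println(encode("-._*"));   // prints -._*
        // Characters encodeURIComponent keeps literal but URLEncoder escapes,
        // followed by a space, which URLEncoder turns into a plus sign.
        System.out.println(encode("~'()! ")); // prints %7E%27%28%29%21+
    }
}
```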
So basically, to get the desired result, use URLEncoder.encode(s, "UTF-8") and then do some post-processing:
- replace all occurrences of "+" with "%20"
- replace all occurrences of "%xx" representing any of [~'()!] back to their literal counterparts
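The post-processing steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name EncodeUriComponent and the static method encodeURIComponent are hypothetical names chosen for the example, and String.replaceAll is used because it is available on Java 1.4.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeUriComponent {
    // Hypothetical helper (not part of the JDK): URLEncoder output adjusted
    // to match what JavaScript's encodeURIComponent would produce.
    public static String encodeURIComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                // A literal '+' in the input is already "%2B" at this point,
                // so every remaining '+' stands for an encoded space.
                .replaceAll("\\+", "%20")
                // Restore the characters encodeURIComponent leaves literal.
                .replaceAll("%21", "!")
                .replaceAll("%27", "'")
                .replaceAll("%28", "(")
                .replaceAll("%29", ")")
                .replaceAll("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported");
        }
    }

    public static void main(String[] args) {
        // \u00B1 is the plus-minus sign from the torture test string.
        System.out.println(encodeURIComponent("\"A\" B \u00B1 \""));
        // prints %22A%22%20B%20%C2%B1%20%22
    }
}
```

The replacements are safe to apply after encoding because a "%" in the input is itself escaped as "%25", so sequences such as "%21" in the encoded output can only come from the corresponding literal character. Edge cases such as unpaired surrogates are not covered by this sketch.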