How to properly encode this URL
Question
I am trying to fetch this URL using JSoup:
http://betatruebaonline.com/img/parte/330/CIGUEÑAL.JPG
Even with encoding, I get an exception. I don't understand why the encoding is wrong. It returns
http://betatruebaonline.com/img/parte/330/CIGUEN%C3%91AL.JPG
instead of the correct
http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
How can I fix this? Thanks.
private static void GetUrl()
{
    try
    {
        String url = "http://betatruebaonline.com/img/parte/330/";
        String encoded = URLEncoder.encode("CIGUEÑAL.JPG", "UTF-8");
        Response img = Jsoup
            .connect(url + encoded)
            .ignoreContentType(true)
            .execute();
        System.out.println(url);
        System.out.println("PASSED");
    }
    catch (Exception e)
    {
        System.out.println("Error getting url");
        System.out.println(e.getMessage());
    }
}
Recommended answer
The encoding is not wrong. The issue is that the character "Ñ" has two Unicode representations, precomposed and composite (decomposed). They look the same when displayed, but they are different code point sequences and therefore percent-encode differently:
precomposed unicode (NFC): Ñ (U+00D1) -> %C3%91
composite unicode (NFD): N followed by combining tilde (U+0303) -> N%CC%83
I emphasize that BOTH ARE CORRECT. Which one you need depends on the form of Unicode you want, here the form the server uses for the file name:
String normalize = Normalizer.normalize("Ñ", Normalizer.Form.NFD);
System.out.println(URLEncoder.encode("Ñ", "UTF-8")); //%C3%91
System.out.println(URLEncoder.encode(normalize, "UTF-8")); //N%CC%83
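Applied to the file name from the question, this could look like the sketch below. It is a minimal illustration, not a definitive fix: the class name NfdUrlEncoder and the helper encodeNfd are hypothetical names introduced here, and it assumes the server stores the file name in decomposed (NFD) form, as the expected URL in the question suggests. Because URLEncoder targets form data and turns spaces into "+", the sketch also maps "+" to "%20" so the result is usable as a path segment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.Normalizer;

public class NfdUrlEncoder {

    // Hypothetical helper: decompose the text to NFD, then percent-encode it as UTF-8.
    public static String encodeNfd(String pathSegment) {
        String decomposed = Normalizer.normalize(pathSegment, Normalizer.Form.NFD);
        try {
            // A literal '+' in the input is already encoded as %2B at this point,
            // so any remaining '+' stands for a space.
            return URLEncoder.encode(decomposed, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://betatruebaonline.com/img/parte/330/";
        String fileName = "CIGUE\u00D1AL.JPG"; // \u00D1 is the precomposed Ñ
        // prints http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
        System.out.println(base + encodeNfd(fileName));
    }
}
```

The resulting string can then be passed to Jsoup.connect(...) in place of url + encoded. If the server turned out to expect the precomposed form instead, Normalizer.Form.NFC would be the form to use.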