How to properly encode this URL


Problem description

I am trying to fetch this URL using JSoup:

http://betatruebaonline.com/img/parte/330/CIGUEÑAL.JPG

Even after encoding the file name, I get an exception. I don't understand why the encoding is wrong. It returns

http://betatruebaonline.com/img/parte/330/CIGUEN%C3%91AL.JPG

instead of the correct

http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG

How can I fix this? Thanks.

private static void GetUrl()
{
    try
    {
        String url = "http://betatruebaonline.com/img/parte/330/";
        String encoded = URLEncoder.encode("CIGUEÑAL.JPG","UTF-8");
        Response img = Jsoup
                            .connect(url + encoded)
                            .ignoreContentType(true)
                            .execute();

        System.out.println(url);
        System.out.println("PASSED");
    }
    catch(Exception e)
    {
        System.out.println("Error getting url");
        System.out.println(e.getMessage());
    }
}

Recommended answer

The encoding is not wrong. The issue is that the character "Ñ" has two Unicode representations, precomposed and decomposed (composite). They look the same when displayed, but they are different code point sequences and therefore percent-encode differently:

precomposed (NFC): Ñ (U+00D1)                           -> %C3%91
decomposed (NFD):  N (U+004E) + combining tilde (U+0303) -> N%CC%83

I emphasize that both are correct; which one you get depends on the Unicode normalization form you choose before encoding:

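import java.net.URLEncoder;
import java.text.Normalizer;
// NFD splits "Ñ" into "N" followed by the combining tilde U+0303
String normalize = Normalizer.normalize("Ñ", Normalizer.Form.NFD);
System.out.println(URLEncoder.encode("Ñ", "UTF-8"));       // %C3%91
System.out.println(URLEncoder.encode(normalize, "UTF-8")); // N%CC%83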
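Applied to the URL from the question, the same idea could look like the sketch below. It assumes the server stores the file name in decomposed (NFD) form, as the expected URL in the question suggests. The class name `NfdUrlEncoder` and the helper `encodeNfd` are hypothetical names chosen for illustration, not part of JSoup or the JDK. Note that `URLEncoder` performs form encoding (a space becomes `+`), which is harmless for this file name but may not suit every path segment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.Normalizer;

class NfdUrlEncoder {

    // Hypothetical helper: decompose the text to NFD, then percent-encode it as UTF-8.
    public static String encodeNfd(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        try {
            return URLEncoder.encode(decomposed, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }

    public static void main(String[] args) {
        String base = "http://betatruebaonline.com/img/parte/330/";
        // \u00D1 is the precomposed "Ñ"; the escape avoids source-file encoding ambiguity.
        String fileName = "CIGUE\u00D1AL.JPG";
        // prints http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
        System.out.println(base + encodeNfd(fileName));
    }
}
```

The resulting string can then be passed to `Jsoup.connect(...)` in place of `url + encoded` from the question. If a server expects the precomposed form instead, swapping `Normalizer.Form.NFD` for `Normalizer.Form.NFC` yields `%C3%91`.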
