How to get the complete URL address most efficiently?


Question

I'm using a Java program to expand short URLs into their full URLs. Given a Java URLConnection, which of these two approaches is the better way to get the expanded URL?

Connection.getHeaderField("Location");

vs

Connection.getURL();

I assumed both would give the same output. The first approach did not work well for me: only 1 out of 7 URLs was resolved. Would the second approach give better results?

Is there a better approach?

Recommended answer

I would use the following:

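import static org.junit.Assert.assertEquals;

import java.net.HttpURLConnection;
import java.net.URL;

import org.junit.Test; // assumes JUnit 4

public class LocationTest {
    @Test
    public void testLocation() throws Exception {
        final String link = "http://bit.ly/4Agih5";

        final URL url = new URL(link);
        final HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
        // Do not follow the redirect; we only want bit.ly's own response.
        urlConnection.setInstanceFollowRedirects(false);

        // The Location header of the redirect response holds the expanded URL.
        final String location = urlConnection.getHeaderField("location");
        assertEquals("http://stackoverflow.com/", location);
        // getURL() still returns the short link because no redirect was followed.
        assertEquals(link, urlConnection.getURL().toString());
    }
}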

With setInstanceFollowRedirects(false), the HttpURLConnection does not follow redirects, so the destination page (stackoverflow.com in the example above) is not downloaded; only the redirect page from bit.ly is.

One drawback: when a bit.ly URL resolves to another short URL, for example one on tinyurl.com, you get the tinyurl.com link, not the URL that tinyurl.com redirects to.
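
One possible way around this drawback is to repeat the single-hop lookup until a response no longer redirects. The following is a minimal sketch, not part of the original answer: the class name RedirectResolver, its method names, and the hop limit of 10 are assumptions chosen for illustration. The lookup is passed in as a function so the loop can be exercised without network access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.function.Function;

// Hypothetical helper, not from the original answer.
public class RedirectResolver {

    /** Performs one request without following redirects.
     *  Returns the absolute redirect target, or null if the response is not a redirect. */
    static String fetchLocation(String link) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(link).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (status < 300 || status >= 400 || location == null) {
                return null;
            }
            // A Location value may be relative; resolve it against the current URL.
            return URI.create(link).resolve(location).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Follows redirects one hop at a time, giving up after maxHops hops. */
    static String resolve(String link, int maxHops, Function<String, String> lookup) {
        String current = link;
        for (int i = 0; i < maxHops; i++) {
            String next = lookup.apply(current);
            if (next == null) {
                return current; // no further redirect
            }
            current = next;
        }
        return current; // hop limit reached, possibly a redirect loop
    }

    /** Convenience overload: real HTTP lookups, at most 10 hops (an arbitrary limit). */
    public static String resolve(String link) {
        return resolve(link, 10, RedirectResolver::fetchLocation);
    }
}
```

Calling RedirectResolver.resolve("http://bit.ly/4Agih5") would then return the last URL in the chain. The hop limit is there so that two short URLs pointing at each other cannot keep the loop running forever.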

Edit:

To see bit.ly's response, use curl:

$ curl --dump-header /tmp/headers http://bit.ly/4Agih5
<html>
<head>
<title>bit.ly</title>
</head>
<body>
<a href="http://stackoverflow.com/">moved here</a>
</body>
</html>

As you can see, bit.ly sends only a short redirect page. Now check the HTTP headers:

$ cat /tmp/headers
HTTP/1.0 301 Moved Permanently
Server: nginx
Date: Wed, 06 Nov 2013 08:48:59 GMT
Content-Type: text/html; charset=utf-8
Cache-Control: private; max-age=90
Location: http://stackoverflow.com/
Mime-Version: 1.0
Content-Length: 117
X-Cache: MISS from cam
X-Cache-Lookup: MISS from cam:3128
Via: 1.1 cam:3128 (squid/2.7.STABLE7)
Connection: close

The response is a 301 Moved Permanently with a Location header pointing to http://stackoverflow.com/. Modern browsers do not show you the HTML page above; they automatically redirect you to the URL in the Location header.
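
HttpURLConnection also follows redirects by default, much like a browser, and that is where the two calls from the question differ. The sketch below is an illustration, not part of the original answer: the class LocationVsGetUrl, its methods, and the /short and /long paths are hypothetical. It uses the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in for a URL shortener so that it runs without network access.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

// Hypothetical demo class, not from the original answer.
public class LocationVsGetUrl {

    /** Starts a local server on a free port: /short answers 301 -> /long, /long answers 200. */
    static HttpServer startServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/short", exchange -> {
                String host = exchange.getRequestHeaders().getFirst("Host");
                exchange.getResponseHeaders().add("Location", "http://" + host + "/long");
                exchange.sendResponseHeaders(301, -1); // -1: no response body
                exchange.close();
            });
            server.createContext("/long", exchange -> {
                exchange.sendResponseHeaders(200, -1);
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns { Location header, getURL() } as seen after the request has completed. */
    static String[] probe(String link, boolean followRedirects) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(link).toURL().openConnection();
            conn.setInstanceFollowRedirects(followRedirects);
            conn.getResponseCode(); // forces the request to be sent
            String[] result = { conn.getHeaderField("Location"), conn.getURL().toString() };
            conn.disconnect();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With redirects disabled, probe returns the Location header and getURL() still reports the short URL. With redirects enabled, in the standard JDK implementation the headers belong to the final response, which normally has no Location header, while getURL() reports the URL reached after the redirect. Stop the server with server.stop(0) when done.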
