Java - Quickest way to check if URL exists


Question

Hi, I am writing a program that goes through many different URLs and just checks whether they exist. I am basically checking whether the response code returned is 404. However, as I am checking over 1000 URLs, I want to be able to do this very quickly. The following is my code; I was wondering how I can modify it to work faster (if possible):

final URL url = new URL("http://www.example.com");
HttpURLConnection huc = (HttpURLConnection) url.openConnection();
int responseCode = huc.getResponseCode();

if (responseCode != 404) {
    System.out.println("GOOD");
} else {
    System.out.println("BAD");
}

Would it be quicker to use JSoup?

I am aware that some sites return code 200 and serve their own error page. However, I know the links I am checking don't do this, so handling that case is not needed.

Recommended answer

Try sending a HEAD request instead of a GET request. That should be faster, since the response body is not downloaded.

huc.setRequestMethod("HEAD");
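As a minimal sketch of how the HEAD request might fit into the question's code: the class name `HeadProbe`, the method `headStatus`, the `-1` failure value, and the 3-second timeouts are all assumptions made for illustration, not part of the original answer. It also assumes the address is an http or https URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not from the original answer.
class HeadProbe {

    /**
     * Sends a HEAD request and returns the HTTP status code,
     * or -1 if the URL is malformed or the request fails.
     * Assumes an http:// or https:// address.
     */
    static int headStatus(String address) {
        HttpURLConnection huc = null;
        try {
            huc = (HttpURLConnection) new URL(address).openConnection();
            huc.setRequestMethod("HEAD");   // ask for headers only, no body
            huc.setConnectTimeout(3000);    // assumed value: fail fast on dead hosts
            huc.setReadTimeout(3000);       // assumed value
            return huc.getResponseCode();
        } catch (IOException e) {
            // MalformedURLException is an IOException, so it lands here too.
            return -1;
        } finally {
            if (huc != null) {
                huc.disconnect();
            }
        }
    }
}
```

Calling `HeadProbe.headStatus("http://www.example.com")` would then return the status code to compare against. The timeouts are worth setting when checking many URLs, because a single unresponsive host could otherwise stall the run.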

Also, rather than checking that the response status is not 404, check that it is 200. That is, test for the positive case instead of the negative one. For the purpose of this check, 404, 403, 402 and the other 4xx statuses can all be treated much like an invalid or non-existent URL.
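The positive check can be sketched as a small predicate. The names `StatusCheck` and `exists` are hypothetical, and accepting the whole 2xx range rather than exactly 200 is an assumption; tighten it to `responseCode == 200` if only that code should count.

```java
// Hypothetical helper, not from the original answer.
class StatusCheck {

    /**
     * Treats only success codes as "exists".
     * Assumption: any 2xx status counts; 4xx, 5xx and
     * failure values such as -1 are all rejected.
     */
    static boolean exists(int responseCode) {
        return responseCode >= 200 && responseCode < 300;
    }
}
```

With this in place, the question's condition `responseCode != 404` would become `StatusCheck.exists(responseCode)`, so a 403 or a 500 is no longer reported as "GOOD".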

You can also use multi-threading to make it even faster, since the checks are independent of each other and most of the time is spent waiting on the network.
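One possible way to run the checks in parallel is a fixed-size thread pool. In the sketch below, `ParallelCheck`, `checkAll`, the `checker` parameter, and the pool size are illustrative assumptions; the `checker` stands in for whatever per-URL test is used (for example a HEAD request followed by a status comparison).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical helper, not from the original answer.
class ParallelCheck {

    /**
     * Runs checker on every URL using a pool of `threads` workers and
     * returns the results in input order. A check that throws or is
     * interrupted is recorded as false.
     */
    static Map<String, Boolean> checkAll(List<String> urls,
                                         Predicate<String> checker,
                                         int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every check first so they run concurrently.
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String u : urls) {
                Callable<Boolean> task = () -> checker.test(u);
                futures.add(pool.submit(task));
            }
            // Then collect the results in the original order.
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                boolean ok;
                try {
                    ok = futures.get(i).get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    ok = false;
                } catch (ExecutionException e) {
                    ok = false;
                }
                results.put(urls.get(i), ok);
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

A call might look like `ParallelCheck.checkAll(urls, myCheck, 20)`, where `myCheck` is the per-URL test. The right pool size depends on the network and on how many requests the target servers tolerate, so 20 is only a starting guess.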
