I get a SocketTimeoutException in Jsoup: Read timed out


Question



I get a SocketTimeoutException when I try to parse a lot of HTML documents using Jsoup.
For example, I have a list of links:

<a href="www.domain.com/url1.html">link1</a>
<a href="www.domain.com/url2.html">link2</a>
<a href="www.domain.com/url3.html">link3</a>
<a href="www.domain.com/url4.html">link4</a>

For each link, I parse the document that its URL points to (from the href attribute) to get other pieces of information from those pages.
I can imagine that this takes a lot of time, but how do I get rid of this exception?

Here is the whole stack trace:

java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at java.net.HttpURLConnection.getResponseCode(Unknown Source)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:381)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
    at app.ForumCrawler.crawl(ForumCrawler.java:50)
    at Main.main(Main.java:15)

Thank you, everyone!

Edit:
Hmm... sorry, I just found the solution:

Jsoup.connect(url).timeout(0).get();

Hope this is useful for someone else... :)

Recommended answer

I think you can do

Jsoup.connect("...").timeout(10 * 1000).get(); 

This sets the timeout to 10 seconds (the argument is in milliseconds).
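
A timeout of 0 is treated by Jsoup as "wait forever", so a single stalled server could hang the whole crawl. One possible alternative, sketched below, is to keep a finite timeout and retry a few times when a read times out. The class `RetryingFetch` and its method `withRetry` are hypothetical names made up for this illustration; they are not part of Jsoup, and the retry count is an arbitrary choice.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Jsoup): retries a fetch that timed out.
final class RetryingFetch {
    private RetryingFetch() {}

    /**
     * Runs fetch up to maxAttempts times. Only SocketTimeoutException
     * triggers a retry; any other exception propagates immediately.
     */
    static <T> T withRetry(Callable<T> fetch, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.call();
            } catch (SocketTimeoutException e) {
                last = e; // remember the timeout and try again
            }
        }
        throw last; // every attempt timed out
    }
}
```

With Jsoup on the classpath, it could be used roughly like this: `Document doc = RetryingFetch.withRetry(() -> Jsoup.connect(url).timeout(10 * 1000).get(), 3);`. Links that still fail after the last attempt can then be logged and skipped instead of aborting the crawl.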
