Reading a web page in Java: IOException Premature EOF

Question

I am frequently getting a 'Premature EOF' Exception when reading a web page.

The following is the stack trace:

java.io.IOException: Premature EOF
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:556)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:600)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:687)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2968)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at Utilities.getPage(Utilities.java:24)  // while ((line = rd.readLine()) != null) {
    at TalkPage.<init>(TalkPage.java:15)
    at Updater.run(Updater.java:65)

The following is the getPage() method:

public static String getPage(String urlString) throws Exception {
    URL url = new URL(urlString);
    URLConnection conn = url.openConnection();
    BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    StringBuffer sb = new StringBuffer();
    String line;
    while ((line = rd.readLine()) != null) {  // LINE 24
        sb.append(line);
    }
    return sb.toString();
}

What is a premature EOF exception, why is it occurring in this particular case, and how can it be avoided?

Some other information: each page being read is around 20 KB, and my program reads many such pages (around 20,000).

Recommended answer

This may be because you are reading the content line by line, and the last line of the response may be missing a line terminator to signal the end of the line. Replace your while loop with this:

final int BUFFER_SIZE = 1024; // or some other size
char[] buffer = new char[BUFFER_SIZE];
int charsRead;
// Read fixed-size chunks instead of lines until the stream is exhausted.
while ((charsRead = rd.read(buffer, 0, BUFFER_SIZE)) != -1) {
    sb.append(buffer, 0, charsRead);
}
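
For reference, here is a minimal sketch of how the whole method might look with that loop in place. The class name PageReader and the helper readAll are hypothetical names chosen for illustration, and the UTF-8 charset is an assumption (the original code relies on the platform default). Unlike the readLine() version, which drops line terminators when appending, this version keeps the page text exactly as received.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names here are not from the original post.
class PageReader {
    private static final int BUFFER_SIZE = 1024; // or some other size

    // Drains a Reader into a String in fixed-size chunks.
    // Line terminators are preserved because nothing is split into lines.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[BUFFER_SIZE];
        int charsRead;
        while ((charsRead = reader.read(buffer, 0, BUFFER_SIZE)) != -1) {
            sb.append(buffer, 0, charsRead);
        }
        return sb.toString();
    }

    // Assumption: the page is UTF-8 encoded.
    // try-with-resources closes the reader (and the underlying stream)
    // even if read() throws part-way through.
    static String getPage(String urlString) throws IOException {
        URLConnection conn = new URL(urlString).openConnection();
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            return readAll(rd);
        }
    }
}
```

Splitting out readAll makes the loop easy to exercise without a network connection, for example with a StringReader. Note that this sketch does not retry: if the server closes the connection before the response is complete, read() can still throw an IOException, which propagates to the caller.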
