In Java, how to fix HTTP error 416 Requested Range Not Satisfiable? (While downloading web content from a web page)
Question
I am trying to download the HTML content of a web page and am getting a 416 status. I found one workaround that changes the status code to 200, but the content that comes back is still not the proper page. I am very close but missing something. Please help.
Code with 416 status:
public static void main(String[] args) {
    String URL = "http://www.xyzzzzzzz.com.sg/";
    HttpClient client = new org.apache.commons.httpclient.HttpClient();
    org.apache.commons.httpclient.methods.GetMethod method = new org.apache.commons.httpclient.methods.GetMethod(URL);
    client.getHttpConnectionManager().getParams().setConnectionTimeout(AppConfig.CONNECTION_TIMEOUT);
    client.getHttpConnectionManager().getParams().setSoTimeout(AppConfig.READ_DATA_TIMEOUT);
    String html = null;
    InputStream ios = null;
    try {
        int statusCode = client.executeMethod(method);
        ios = method.getResponseBodyAsStream();
        html = IOUtils.toString(ios, "utf-8");
        System.out.println(statusCode);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (ios != null) {
            try {
                ios.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (method != null) method.releaseConnection();
    }
    System.out.println(html);
}
Code with 200 status (but the HTML content is not correct):
public static void main(String[] args) {
    String URL = "http://www.xyzzzzzzz.com.sg/";
    HttpClient client = new org.apache.commons.httpclient.HttpClient();
    org.apache.commons.httpclient.methods.GetMethod method = new org.apache.commons.httpclient.methods.GetMethod(URL);
    client.getHttpConnectionManager().getParams().setConnectionTimeout(AppConfig.CONNECTION_TIMEOUT);
    client.getHttpConnectionManager().getParams().setSoTimeout(AppConfig.READ_DATA_TIMEOUT);
    String html = null;
    InputStream ios = null;
    try {
        int statusCode = client.executeMethod(method);
        if (statusCode == HttpStatus.SC_REQUESTED_RANGE_NOT_SATISFIABLE) {
            method.setRequestHeader("User-Agent", "Mozilla/5.0");
            method.setRequestHeader("Accept-Ranges", "bytes=100-1500");
            statusCode = client.executeMethod(method);
        }
        ios = method.getResponseBodyAsStream();
        html = IOUtils.toString(ios, "utf-8");
        System.out.println(statusCode);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (ios != null) {
            try {
                ios.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (method != null) method.releaseConnection();
    }
    System.out.println(html);
}
Recommended answer
Your first sample code works for me without problems, and the second sample works if I remove the block that sets the headers:
if (statusCode == HttpStatus.SC_REQUESTED_RANGE_NOT_SATISFIABLE) {
    method.setRequestHeader("User-Agent", "Mozilla/5.0");
    method.setRequestHeader("Accept-Ranges", "bytes=100-1500");
    statusCode = client.executeMethod(method);
}
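As background on that block: under the HTTP range-request rules, `Range` is the request header that asks for part of a resource, while `Accept-Ranges` is a response header a server uses to advertise range support, so setting `Accept-Ranges` on a request is not a defined way to ask for bytes 100-1500. A server answers 416 when the requested range starts at or beyond the end of the resource. The sketch below illustrates that check for the simplest case only; the class and method names are hypothetical, it handles a single `bytes=first-last` or `bytes=first-` range, and nothing in this thread establishes that the site in question applies this logic.

```java
public class RangeCheck {

    /**
     * Returns true if a single "bytes=first-last" or "bytes=first-" range can be
     * served from a resource of the given length. Simplified on purpose: suffix
     * ranges ("bytes=-500") and multi-range lists are rejected as unsupported.
     */
    public static boolean isSatisfiable(String rangeHeader, long contentLength) {
        if (rangeHeader == null || !rangeHeader.startsWith("bytes=")) {
            throw new IllegalArgumentException("unsupported range: " + rangeHeader);
        }
        String spec = rangeHeader.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash <= 0 || spec.indexOf(',') >= 0) {
            throw new IllegalArgumentException("unsupported range: " + rangeHeader);
        }
        long first = Long.parseLong(spec.substring(0, dash).trim());
        String lastPart = spec.substring(dash + 1).trim();
        if (!lastPart.isEmpty() && Long.parseLong(lastPart) < first) {
            // A last position before the first position makes the range invalid.
            throw new IllegalArgumentException("invalid range: " + rangeHeader);
        }
        // The last position may be omitted or lie past the end of the resource;
        // only the first position decides whether any byte can be returned.
        return first < contentLength;
    }

    public static void main(String[] args) {
        // The range from the question against two assumed resource sizes.
        System.out.println(isSatisfiable("bytes=100-1500", 5000));
        System.out.println(isSatisfiable("bytes=100-1500", 80));
    }
}
```

When the check returns false, a range-aware server would typically respond with 416; a request that carries no `Range` header at all skips this check entirely, which is consistent with the plain GET working once the header block is removed.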
The 416 itself is a bit strange; it may be a LAN configuration issue (firewall, proxy, etc.). In any case, HttpClient 3.1 is quite old. Here is the same request using HttpClient 4.x from Apache HttpComponents:
import org.apache.commons.io.IOUtils;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

public class Snippet {
    public static void main(String[] args) {
        String url = "http://www.jobstreet.com.sg/";
        // DefaultHttpClient is deprecated since HttpClient 4.3 but still available in 4.x.
        HttpClient client = new DefaultHttpClient();
        HttpGet get = new HttpGet(url);
        try {
            HttpResponse res = client.execute(get);
            System.out.println(res.getStatusLine().getStatusCode());
            // Decode the body explicitly as UTF-8 instead of the platform default charset.
            System.out.println(IOUtils.toString(res.getEntity().getContent(), "UTF-8"));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Release the connections held by this client.
            client.getConnectionManager().shutdown();
        }
    }
}
This works as expected.
Try HttpClient 4. If you still get the same error, the problem is not in your code.
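As one more cross-check in the same spirit, the request can also be sent with only the JDK's `HttpURLConnection`, which takes both HttpClient versions out of the picture. This is a minimal sketch, not part of the original answer: the class name `PlainGet`, the `Result` holder, and the 10-second timeouts are assumptions for illustration. It sends a plain GET with no `Range` header and reads the body even for error statuses such as 416.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class PlainGet {

    /** Status code and body of one response. */
    public static final class Result {
        public final int status;
        public final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Sends a plain GET (no Range header) and decodes the body as UTF-8. */
    public static Result fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        try {
            int status = conn.getResponseCode();
            // For 4xx/5xx responses the body is exposed on the error stream.
            InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            return new Result(status, in == null ? "" : readAll(in));
        } finally {
            conn.disconnect();
        }
    }

    /** Reads the stream to its end, closes it, and returns the bytes as a UTF-8 string. */
    private static String readAll(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = stream.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PlainGet <url>");
            return;
        }
        Result r = fetch(args[0]);
        System.out.println(r.status);
        System.out.println(r.body);
    }
}
```

If this also returns 416 from the same machine, that points further toward the network path or the server rather than the client library.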