HttpClient: Get images from response

Question

I'm using Apache HttpClient to perform GET/POST requests, and I was wondering whether I could save the images loaded/retrieved by a response without having to download them again from their URLs.

This question was asked about a year ago, but no one answered: Can I get cached images using HttpClient?

I tried:

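// Assumes Apache HttpClient 4.x; url holds the address to fetch.
import java.io.FileOutputStream;
import java.io.InputStream;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// try-with-resources closes the client, response, and streams even on failure.
try (CloseableHttpClient httpclient = HttpClients.createDefault();
     CloseableHttpResponse response = httpclient.execute(new HttpGet(url))) {
    HttpEntity entity = response.getEntity();

    // Copy the body of this one response to img.png through a buffer.
    try (InputStream is = entity.getContent();
         FileOutputStream fos = new FileOutputStream("img.png")) {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = is.read(buffer)) != -1) {
            fos.write(buffer, 0, n);
        }
    }
}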

But apparently it downloads only text. Can I make HttpClient download the images of that particular URL? Is this doable or not?

Recommended answer

A web page is just the HTML code of the page.

When a browser accesses a web page, it downloads the HTML code and then parses it. If the page contains things like IMG tags, embedded objects (Flash, applets, etc.), frames and so on, the browser takes their URLs and, for each one, opens a new HTTP connection to download it. It does so for every image. Then, with all the parts of the page ready (in its cache), it renders the page.

This is a simplified description, of course, since browsers tend to optimize these things by keeping connections open and maintaining caches. So, to reiterate, to get the images in a page:

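  1. Download the HTML from the given URL.
  2. Parse the HTML and find the IMG tags.
  3. For each relevant IMG, download the image data from the URL in its SRC attribute. You should save each one to a file.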
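
Steps 2 and 3 start with turning the IMG tags into a list of absolute URLs. The following is a minimal sketch using only the JDK, not the answerer's code: the class name ImageLinks, the method extractImageUrls, and the example.com address are made up for illustration. The regular expression is a deliberate simplification that only handles quoted SRC attributes; a real HTML parser would be more robust.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageLinks {
    // Matches <img ... src="..."> or src='...'. A simplification, not a full HTML parser.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the absolute URL of every IMG tag found in html, in document order. */
    static List<URI> extractImageUrls(URI pageUrl, String html) {
        List<URI> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            // Relative SRC values are resolved against the URL of the page itself.
            urls.add(pageUrl.resolve(m.group(2).trim()));
        }
        return urls;
    }

    public static void main(String[] args) {
        URI page = URI.create("http://example.com/gallery/index.html");
        String html = "<p><img src=\"a.png\"><IMG alt='x' SRC='/img/b.jpg'></p>";
        for (URI u : extractImageUrls(page, html)) {
            System.out.println(u);
        }
        // Prints:
        // http://example.com/gallery/a.png
        // http://example.com/img/b.jpg
    }
}
```

Each returned URL can then be fetched with its own HttpGet, in the same way as the code in the question, and saved to its own file.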

It is important to understand that an HttpClient response represents only one object: the HTML page or a single image, depending on the URL you gave it. If you want to download an entire page and all its images, you have to make a request with HttpClient for each of those objects yourself; it doesn't do so automatically.
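
Because a response holds exactly one object, its Content-Type header shows whether the URL returned an image or an HTML page. An HTML page is a plausible cause of the "only text" result in the question. The following is a small sketch under assumptions: ResponseKind and isImage are hypothetical names, and the header value is passed in as a plain string. With Apache HttpClient 4.x, that string would typically come from entity.getContentType().getValue() when the header is present.

```java
import java.util.Locale;

class ResponseKind {
    /** True when a Content-Type header value names an image media type, e.g. "image/png". */
    static boolean isImage(String contentType) {
        if (contentType == null) {
            return false; // No header: the kind of content is unknown.
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.startsWith("image/");
    }

    public static void main(String[] args) {
        System.out.println(isImage("image/png"));                // true
        System.out.println(isImage("text/html; charset=UTF-8")); // false
    }
}
```

Checking this before writing img.png makes it clear when the URL points to a page rather than to an image, in which case the image URLs have to be extracted from the HTML and requested separately.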
