Using JSoup to save the contents of this url: http://www.aw20.co.uk/images/logo.png to a file

Problem description

I am trying to use JSoup to get the contents of this url, http://www.aw20.co.uk/images/logo.png, which is the image logo.png, and save it to a file. So far I have used JSoup to connect to http://www.aw20.co.uk and get a Document. I then found the absolute url for the image I am looking for, but now I am not sure how to use it to get the actual image. Could someone point me in the right direction? Also, is there any way I could use Jsoup.connect("http://www.aw20.co.uk/images/logo.png").get(); to get the image?

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;


public class JGet2 {

public static void main(String[] args) {

    try {
        Document doc = Jsoup.connect("http://www.aw20.co.uk").get();

        Elements img = doc.getElementsByTag("img");

        for (Element element : img) {
            String src = element.absUrl("src");

            System.out.println("Image Found!");
            System.out.println("src attribute is: " + src);
            if (src.contains("logo.png") == true) {
                System.out.println("Success");     
            }
            getImages(src);
        }
    } 

    catch (IOException e) {
        e.printStackTrace();
    }
}

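private static void getImages(String src) throws IOException {

    // Strip a trailing "/" so the last path segment is the file name
    if (src.endsWith("/")) {
        src = src.substring(0, src.length() - 1);
    }

    int indexName = src.lastIndexOf("/");

    // absUrl returns "" when the src cannot be resolved; there is no name to extract then
    if (indexName < 0) {
        return;
    }

    // Everything from the last "/" onward, e.g. "/logo.png"
    String name = src.substring(indexName);

    System.out.println(name);
}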
}


Recommended answer

You can use Jsoup to fetch any URL and get the data as bytes, if you don't want to parse it as HTML. E.g.:

byte[] bytes = Jsoup.connect(imgUrl).ignoreContentType(true).execute().bodyAsBytes();

ignoreContentType(true) is set because otherwise Jsoup will throw an exception that the content is not HTML parseable -- that's OK in this case because we're using bodyAsBytes() to get the response body, rather than parsing.
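To round this out, here is a minimal sketch of fetching the bytes and writing them to a file. The class name SaveImage, the helpers fileNameFromUrl and saveImage, and the choice of the current directory as the target are all assumptions made for illustration; only the Jsoup call chain comes from the answer above. It also suggests why Jsoup.connect(...).get() is not the right call for an image: get() parses the response into a Document, whereas execute() returns the raw response.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.jsoup.Jsoup;

class SaveImage {

    // Hypothetical helper: returns the text after the last "/" in the URL,
    // ignoring a single trailing slash. Query strings are not handled.
    static String fileNameFromUrl(String url) {
        String trimmed = url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
        return trimmed.substring(trimmed.lastIndexOf('/') + 1);
    }

    // Hypothetical helper: fetches the URL as raw bytes (no HTML parsing)
    // and writes them to a file in targetDir named after the URL's last segment.
    static Path saveImage(String imgUrl, Path targetDir) throws IOException {
        byte[] bytes = Jsoup.connect(imgUrl)
                .ignoreContentType(true) // accept non-HTML content such as image/png
                .execute()               // returns the response without parsing it
                .bodyAsBytes();
        Path target = targetDir.resolve(fileNameFromUrl(imgUrl));
        // Creates the file, or overwrites it if it already exists
        return Files.write(target, bytes);
    }

    public static void main(String[] args) throws IOException {
        // Assumes the URL from the question is still reachable
        Path saved = saveImage("http://www.aw20.co.uk/images/logo.png", Paths.get("."));
        System.out.println("Saved " + Files.size(saved) + " bytes to " + saved);
    }
}
```

In the question's loop, saveImage(src, Paths.get(".")) could be called inside the if block that checks for "logo.png", so that only the matching image is downloaded.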

Check the Jsoup Connection API for more details.
