How can I use HTML Agility Pack to retrieve all the images from a website?

Views: 95 | Published: 2016/8/26 21:50:17 | Tags: c#, parsing, html-agility-pack


Question

I just downloaded the HTMLAgilityPack and the documentation doesn't have any examples.

I'm looking for a way to get all the images from a website: the address strings, not the physical images.

<img src="blabalbalbal.jpeg" />

I need to pull the source of each img tag. I just want to get a feel for the library and what it can offer. Everyone said this was the best tool for the job.

Edit

public void GetAllImages()
{
    WebClient x = new WebClient();
    string source = x.DownloadString(@"http://www.google.com");

    HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
    document.Load(source);

    // I can't use the Descendants method. It doesn't appear.
    var ImageURLS = document.desc
        .Select(e => e.GetAttributeValue("src", null))
        .Where(s => !String.IsNullOrEmpty(s));
}

Solution

You can do this using LINQ, like this:

var document = new HtmlWeb().Load(url);
var urls = document.DocumentNode.Descendants("img")
                                .Select(e => e.GetAttributeValue("src", null))
                                .Where(s => !String.IsNullOrEmpty(s));

EDIT: This code now actually works; I had forgotten to write document.DocumentNode.
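
If the markup has already been downloaded into a string, as in the question's WebClient code, the same query can be run on it with LoadHtml. HtmlDocument.Load(string) treats its argument as a file path, so passing it the page source would not parse it. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The names ImageSources, FromHtml, and Resolve are hypothetical, and resolving relative paths against a base address is an addition that is not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageSources
{
    // Hypothetical helper: returns the src value of every <img> in an HTML string.
    // baseUri is optional; when given, relative paths are made absolute.
    public static List<string> FromHtml(string html, Uri baseUri = null)
    {
        var document = new HtmlDocument();

        // LoadHtml parses markup held in a string.
        // Load(string) would interpret the argument as a path on disk.
        document.LoadHtml(html);

        return document.DocumentNode.Descendants("img")
            .Select(e => e.GetAttributeValue("src", null))
            .Where(s => !String.IsNullOrEmpty(s))
            .Select(s => Resolve(baseUri, s))
            .ToList();
    }

    // Assumption for illustration: fall back to the raw src value
    // when no base address is supplied or the value cannot be combined.
    private static string Resolve(Uri baseUri, string src)
    {
        Uri absolute;
        if (baseUri != null && Uri.TryCreate(baseUri, src, out absolute))
        {
            return absolute.AbsoluteUri;
        }
        return src;
    }
}
```

With the question's code, this could be called as ImageSources.FromHtml(source, new Uri("http://www.google.com")). Tags without a src attribute are skipped by the Where filter, the same way as in the answer.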
