C#: find images in HTML and download them
Question
I want to download all the images referenced in an HTML page. I don't know in advance how many images there will be, and I don't want to use HTML Agility Pack.
I searched Google, but every site I found left me more confused.
I tried a regex, but it returned only one result.
Answer
People are giving you the right answer - you can't be picky and lazy, too. ;-)
If you use a half-baked solution, you'll have to deal with a lot of edge cases. Here's a working sample that gets all the links in an HTML document using HTML Agility Pack (it's included in the HTML Agility Pack download).
And here's a blog post that shows how to grab all the images in an HTML document with HTML Agility Pack and LINQ:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

// Bing image results for "cat", first page
string url = "http://www.bing.com/images/search?q=cat&go=&form=QB&qs=n";
// For speed of development, use a WebClient
string html;
using (WebClient client = new WebClient())
{
    html = client.DownloadString(url);
}
// Load the HTML into the Agility Pack
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// SelectNodes returns null when nothing matches, so fall back to an empty sequence
IEnumerable<HtmlNode> allImages = doc.DocumentNode.SelectNodes("//img") ?? Enumerable.Empty<HtmlNode>();
// Now use LINQ to keep the images whose class starts with "img_"
List<HtmlNode> imageNodes = (from node in allImages
                             where node.GetAttributeValue("class", "").StartsWith("img_")
                             select node).ToList();
foreach (HtmlNode node in imageNodes)
{
    // GetAttributeValue returns the default ("") instead of throwing when src is missing
    Console.WriteLine(node.GetAttributeValue("src", ""));
}
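The snippet above only prints each src value. As a rough sketch of the download step the question asks about, something like the following could work, assuming the HtmlAgilityPack NuGet package is installed. The names ImageDownloader and GetImageUrls and the "images" output folder are hypothetical and not part of HTML Agility Pack. The class filter is dropped here so that every img with a src is collected. WebClient is kept to match the answer, although newer .NET versions mark it obsolete in favor of HttpClient.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

public static class ImageDownloader
{
    // Hypothetical helper: returns the absolute URL of every <img src="..."> in the page.
    public static List<Uri> GetImageUrls(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when the page has no matching nodes.
        IEnumerable<HtmlNode> nodes =
            doc.DocumentNode.SelectNodes("//img[@src]") ?? Enumerable.Empty<HtmlNode>();

        var urls = new List<Uri>();
        foreach (HtmlNode node in nodes)
        {
            // Decode entities such as &amp; that may appear in the attribute text.
            string src = WebUtility.HtmlDecode(node.GetAttributeValue("src", ""));
            if (string.IsNullOrWhiteSpace(src))
            {
                continue;
            }
            // Resolve relative paths such as "/img/a.png" against the page URL,
            // and keep only http/https (this skips inline "data:" images).
            if (Uri.TryCreate(pageUri, src, out Uri absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                urls.Add(absolute);
            }
        }
        return urls.Distinct().ToList();
    }

    public static void Main()
    {
        var pageUri = new Uri("http://www.bing.com/images/search?q=cat&go=&form=QB&qs=n");
        string targetDir = "images"; // hypothetical output folder
        Directory.CreateDirectory(targetDir);

        using (var client = new WebClient())
        {
            string html = client.DownloadString(pageUri);
            int index = 0;
            foreach (Uri imageUrl in GetImageUrls(html, pageUri))
            {
                // Number the files so images that share a name do not overwrite each other.
                string extension = Path.GetExtension(imageUrl.AbsolutePath);
                string fileName = Path.Combine(targetDir, $"image{index++}{extension}");
                try
                {
                    client.DownloadFile(imageUrl, fileName);
                    Console.WriteLine($"{imageUrl} -> {fileName}");
                }
                catch (WebException ex)
                {
                    // Skip images that fail to download instead of aborting the whole run.
                    Console.WriteLine($"Skipped {imageUrl}: {ex.Message}");
                }
            }
        }
    }
}
```

Resolving each src against the page URL matters because many pages use relative paths, and those cannot be downloaded as-is. Images that a page injects with JavaScript after loading will not appear in the downloaded HTML, so this sketch will not find them.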