C#: find images in HTML and download them


Problem description

I want to download all the images referenced in an HTML page. I don't know how many images there will be, and I don't want to use HTML Agility Pack.

I searched Google, but every site I found left me more confused.

I tried a regex, but it only returned one result.

Solution

People are giving you the right answer - you can't be picky and lazy, too. ;-)

If you use a half-baked solution, you'll deal with a lot of edge cases. Here's a working sample that gets all links in an HTML document using HTML Agility Pack (it's included in the HTML Agility Pack download).

And here's a blog post that shows how to grab all images in an HTML document with HTML Agility Pack and LINQ:

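	// Requires the HtmlAgilityPack NuGet package
	using System;
	using System.Collections.Generic;
	using System.Linq;
	using System.Net;
	using HtmlAgilityPack;

	// Bing Image Result for Cat, First Page
	string url = "http://www.bing.com/images/search?q=cat&go=&form=QB&qs=n";

	// For speed of dev, I use a WebClient
	WebClient client = new WebClient();
	string html = client.DownloadString(url);

	// Load the Html into the agility pack
	HtmlDocument doc = new HtmlDocument();
	doc.LoadHtml(html);

	// SelectNodes returns null (not an empty collection) when nothing matches
	IEnumerable<HtmlNode> allImages =
	    (IEnumerable<HtmlNode>)doc.DocumentNode.SelectNodes("//img") ?? Enumerable.Empty<HtmlNode>();

	// Now, using LINQ to get all Images whose class starts with "img_"
	// (this filter matches the Bing results page in this example; drop it to get every image)
	List<HtmlNode> imageNodes = (from HtmlNode node in allImages
	              where node.Attributes["class"] != null
	              && node.Attributes["class"].Value.StartsWith("img_")
	              && node.Attributes["src"] != null
	              select node).ToList();

	foreach(HtmlNode node in imageNodes)
	{
	    Console.WriteLine(node.Attributes["src"].Value);
	}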
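
The sample above only prints each `src` value, while the question asks to download the images. A minimal sketch of that last step follows. It is not from the blog post: the `ImageDownloader` class, its two methods, and the counter-prefixed file naming are hypothetical choices made for illustration. It assumes the HtmlAgilityPack NuGet package is referenced, resolves relative `src` values against the page URL, and skips anything that is not `http` or `https` (such as inline `data:` images).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

public static class ImageDownloader
{
    // Hypothetical helper: returns the absolute URL of every <img src="..."> in the HTML.
    public static List<Uri> GetImageUrls(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var urls = new List<Uri>();

        // SelectNodes returns null when no node matches the XPath.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//img[@src]");
        if (nodes == null) return urls;

        foreach (HtmlNode node in nodes)
        {
            // Attribute values may still contain HTML entities such as &amp;, so decode them.
            string src = HtmlEntity.DeEntitize(node.GetAttributeValue("src", ""));
            if (string.IsNullOrWhiteSpace(src)) continue;

            // Resolve relative paths such as "/img/a.png" against the page URL,
            // and keep only http/https results (this drops data: URIs).
            if (Uri.TryCreate(pageUri, src, out Uri absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                urls.Add(absolute);
            }
        }
        return urls.Distinct().ToList();
    }

    // Hypothetical helper: downloads every image found on the page into targetDirectory.
    public static void DownloadAll(string pageUrl, string targetDirectory)
    {
        var pageUri = new Uri(pageUrl);
        Directory.CreateDirectory(targetDirectory);

        using (var client = new WebClient())
        {
            string html = client.DownloadString(pageUri);
            int index = 0;
            foreach (Uri imageUrl in GetImageUrls(html, pageUri))
            {
                // Prefix with a counter so images sharing a file name do not overwrite each other.
                string name = Path.GetFileName(imageUrl.LocalPath);
                if (string.IsNullOrEmpty(name)) name = "image";
                string path = Path.Combine(targetDirectory, $"{index++}_{name}");
                client.DownloadFile(imageUrl, path);
            }
        }
    }
}
```

A call such as `ImageDownloader.DownloadAll("http://www.bing.com/images/search?q=cat", "images")` would then save whatever the page references into an `images` folder. The sketch does no error handling: a failed download throws, and a file name containing characters that are invalid on the local file system would need sanitizing first.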
