Can a web scraper fetch images?

Question

Hello all,

I need to develop a web application, namely a web scraper, that fetches data from an external website. It should also fetch images.

Can a web scraper fetch images from an external website?
If not, do I have to find each image path and download it manually?

Also, does anyone know of a good ready-to-use web scraper, either free or commercial?

Thanks,
Meenxi

Recommended answer

Images are no different from any other document you get through HTTP. You need to parse the HTML page you have loaded, find the references to images, and download each one separately.
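
As a rough illustration of the "find references to images" step, here is a minimal sketch that pulls the src attribute out of img tags with a regular expression and resolves relative paths against the page address. The names ImageLinkExtractor and Extract are hypothetical, and the pattern is only an approximation: it ignores CSS backgrounds, srcset and unquoted attributes, so a real HTML parser would likely be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ImageLinkExtractor
{
    // Rough pattern for the quoted src attribute of <img> tags (not a full HTML parser).
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns an absolute URI for every image reference found in the page.
    public static List<Uri> Extract(string html, Uri pageUri)
    {
        var result = new List<Uri>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            // Resolve relative paths such as "img/logo.png" against the page address.
            Uri absolute;
            if (Uri.TryCreate(pageUri, m.Groups[1].Value, out absolute))
                result.Add(absolute);
        }
        return result;
    }
}
```

Each returned Uri can then be passed to whatever download routine you use, one request per image.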

Downloading via HTTP is fairly simple. Use the System.Net.WebRequest class; the run-time type of the request object is determined by the URI and is System.Net.HttpWebRequest in the case of HTTP:

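// Requires System.IO and System.Net; url, proxy and fname are defined by the caller.
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
webRequest.Proxy = proxy; // typically null
using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
using (Stream responseStream = webResponse.GetResponseStream())
using (FileStream fs = new FileStream(fname, FileMode.Append, FileAccess.Write))
{
    // Read from the response stream and write the data to the file stream
    responseStream.CopyTo(fs);
}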



Don't forget that your HTML file can use different URL schemes for images, for example FTP. Downloading over FTP is as simple as over HTTP; only the run-time type of the request differs: System.Net.FtpWebRequest.
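
To make the scheme point concrete, here is a small sketch of a download helper that never names the concrete request class and lets WebRequest.Create choose it from the URI. ResourceDownloader and Download are hypothetical names, and the sketch assumes the default request settings are acceptable (for FTP that means an anonymous download). On recent .NET versions WebRequest is marked obsolete in favour of HttpClient, so treat this as an illustration of the approach described here rather than a recommendation.

```csharp
using System;
using System.IO;
using System.Net;

static class ResourceDownloader
{
    // Saves the resource at the given URI to a local file.
    public static void Download(Uri uri, string fileName)
    {
        // WebRequest.Create picks the concrete class from the URI scheme:
        // HttpWebRequest for http/https, FtpWebRequest for ftp, FileWebRequest for file.
        WebRequest request = WebRequest.Create(uri);
        using (WebResponse response = request.GetResponse())
        using (Stream input = response.GetResponseStream())
        using (FileStream output = File.Create(fileName))
        {
            // Copy the response body to disk.
            input.CopyTo(output);
        }
    }
}
```

Because the scheme decides the run-time type, the same call handles an http:// and an ftp:// image reference without any branching in your own code.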

For more information, see http://msdn.microsoft.com/en-us/library/system.net.webrequest.aspx and the derived classes.

—SA

