How to copy the contents of a website using a .NET desktop application


Question

There is a website (not built by me) that contains a paged grid, so the data spans many pages. I want the contents of every page of the grid in an Excel sheet. Copying it by hand would be cumbersome and not very smart.

Is it possible to do this with a C# .NET Windows application?

Is there any freeware that would help me achieve this, such as a web crawler or web spider?

Recommended answer

The term for this is web scraping, and it is not an easy task to accomplish in code.

You can use the HttpWebRequest/HttpWebResponse classes or the WebClient class to request and download the pages themselves. Then you can use regular expressions, or a library such as HTML Agility Pack, to parse out the data you need.
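As a rough illustration of the two steps above, here is a minimal sketch that downloads pages with WebClient and pulls table cells out with regular expressions. The class name GridScraper, the method names, and the "{0}" page placeholder in the URL are all hypothetical; the sketch also assumes the grid is rendered as a plain, non-nested HTML table and that each page has its own URL. If the grid pages through form postbacks, or the markup is irregular, this would need adapting, and an HTML parser such as HTML Agility Pack is likely to be more robust than regular expressions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class GridScraper
{
    // Matches each <tr>...</tr> block, then each <td> or <th> cell inside it.
    // Assumption: the table is not nested inside another table.
    private static readonly Regex RowPattern = new Regex(
        @"<tr\b[^>]*>(.*?)</tr>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    private static readonly Regex CellPattern = new Regex(
        @"<t[dh]\b[^>]*>(.*?)</t[dh]>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");

    // Extracts the text of every table cell, row by row, from an HTML string.
    public static List<List<string>> ParseRows(string html)
    {
        var rows = new List<List<string>>();
        foreach (Match row in RowPattern.Matches(html))
        {
            var cells = CellPattern.Matches(row.Groups[1].Value)
                .Cast<Match>()
                // Strip inner tags, decode entities such as &amp;, trim whitespace.
                .Select(m => WebUtility.HtmlDecode(
                    TagPattern.Replace(m.Groups[1].Value, "")).Trim())
                .ToList();
            if (cells.Count > 0) rows.Add(cells);
        }
        return rows;
    }

    // Formats rows as CSV text: every field is quoted, embedded quotes are doubled.
    public static string ToCsv(IEnumerable<List<string>> rows)
    {
        var sb = new StringBuilder();
        foreach (var row in rows)
        {
            sb.AppendLine(string.Join(",",
                row.Select(c => "\"" + c.Replace("\"", "\"\"") + "\"")));
        }
        return sb.ToString();
    }

    // Downloads pages 1..pageCount and collects their rows.
    // Assumption: urlTemplate contains "{0}" where the page number goes.
    public static List<List<string>> ScrapePages(string urlTemplate, int pageCount)
    {
        var all = new List<List<string>>();
        using (var client = new WebClient())
        {
            for (int page = 1; page <= pageCount; page++)
            {
                string html = client.DownloadString(string.Format(urlTemplate, page));
                all.AddRange(ParseRows(html));
            }
        }
        return all;
    }
}
```

The CSV text returned by ToCsv can be saved with File.WriteAllText under a .csv file name, which Excel can open directly. On recent .NET versions WebClient is marked obsolete and produces a compiler warning; HttpClient is the suggested replacement there, but the overall approach stays the same.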

As for third-party tools, many questions on SO already cover this; here is one you could take a look at: What's a good Web Crawler tool
