Get list of all pages available in a web site


Question

Is there any way in C# that I can get a list of all of the pages that are on a web site? For example, if I choose to use 'www.microsoft.com' then it should return:
'www.microsoft.com/shop'
'www.microsoft.com/products'
'www.microsoft.com/downloads'
'www.microsoft.com/support'
...and all the other pages for it...

In the end what I want to do is get the links to every image resource that is used on the site so I thought if I can get the links to the pages then I could download the images from each one. Pseudo code:

// Pseudocode: WebSite, WebPage and WebPageImage are hypothetical types.
foreach (WebPage wp in WebSite.GetWebPages("http://www.microsoft.com"))
{
    Console.WriteLine(wp.GetWebUrl.ToString());
    foreach (WebPageImage wpi in WebPage.GetImages(wp))
    {
        WebPageImage.DownloadImage(wpi.GetWebUrl.ToString());
        Console.WriteLine("Image Downloaded: " + wpi.GetWebUrl.ToString());
    }
}
// or something like that



I hope there is a way to do this and how I would do it. I hope you understand what I want to do. Thank you.

Recommended answer

"All I really want is a list of URLs to all of the pages belonging to a website/server/domain"

As far as I know there is no such thing, and it is very unlikely to exist if you think about it. Many sites serve up specific information from a DB by processing 404 errors (file or directory not found) and using the page or folder name details to access the DB and retrieve the info. These aren't "genuine" pages, but they are valid URLs which will result in a page of data going back to the user.

AFAIK, IIS looks for pages on a request-by-request basis: it doesn't "cache" URLs and serve up only pages which exist.
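
Since the server keeps no such list, one workaround is to discover pages by following links from a start page and collecting image URLs along the way. This only finds pages that are linked from somewhere, so it cannot be complete for the reasons given above. What follows is a minimal sketch, not a definitive implementation: SiteCrawler, ExtractLinks, ExtractImages, CrawlAsync and the fetch delegate are hypothetical names made up for illustration, and the regex matching is deliberately naive (a real HTML parser would be more robust).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library: finds pages by following links.
public static class SiteCrawler
{
    // Naive attribute matchers (assumption: quoted href/src values).
    private static readonly Regex HrefPattern =
        new Regex("<a\\s[^>]*?href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);
    private static readonly Regex ImgPattern =
        new Regex("<img\\s[^>]*?src\\s*=\\s*[\"']([^\"']+)", RegexOptions.IgnoreCase);

    // Resolves each captured value against the page URL; keeps http/https only.
    private static IEnumerable<Uri> Extract(Regex pattern, string html, Uri pageUrl)
    {
        foreach (Match m in pattern.Matches(html))
        {
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out var resolved)
                && (resolved.Scheme == Uri.UriSchemeHttp
                    || resolved.Scheme == Uri.UriSchemeHttps))
            {
                yield return resolved;
            }
        }
    }

    public static IEnumerable<Uri> ExtractLinks(string html, Uri pageUrl)
        => Extract(HrefPattern, html, pageUrl);

    public static IEnumerable<Uri> ExtractImages(string html, Uri pageUrl)
        => Extract(ImgPattern, html, pageUrl);

    // Breadth-first crawl restricted to the start URL's host.
    // fetch returns the HTML for a URL, or null if it could not be retrieved.
    // maxPages bounds the number of URLs visited.
    public static async Task<(List<Uri> Pages, HashSet<Uri> Images)> CrawlAsync(
        Uri start, Func<Uri, Task<string>> fetch, int maxPages = 50)
    {
        var visited = new HashSet<Uri>();
        var pages = new List<Uri>();
        var images = new HashSet<Uri>();
        var queue = new Queue<Uri>();
        queue.Enqueue(start);

        while (queue.Count > 0 && visited.Count < maxPages)
        {
            Uri current = queue.Dequeue();
            if (!visited.Add(current)) continue;   // already seen

            string html = await fetch(current);
            if (html == null) continue;            // fetch failed, not a page

            pages.Add(current);
            images.UnionWith(ExtractImages(html, current));

            foreach (Uri link in ExtractLinks(html, current))
            {
                if (link.Host == start.Host && !visited.Contains(link))
                    queue.Enqueue(link);
            }
        }
        return (pages, images);
    }
}
```

To run it against a real site, fetch could wrap HttpClient.GetStringAsync and return null when the request throws. The maxPages limit keeps the crawl bounded, since sites that generate pages from a DB can expose a practically unlimited number of URLs.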


Take a DataGridView.

Whenever you write a Response.Redirect to another page, just add that page's name to the DataGridView.

It's easy.

