C# HttpWebRequest command to get directory listing

Question

I need a short code snippet to get a directory listing from an HTTP server.

Thanks.

Recommended answer

A few important considerations before the code:

  1. The HTTP server must be configured to allow directory listing for the directories you want to read.
  2. Directory listings are ordinary HTML pages, so no standard defines their format.
  3. Because of point 2, you have to write parsing code specific to each server.

My choice is to use regular expressions. They allow quick parsing and easy customization. You can keep a specific regular expression pattern for each site, which gives you a very modular approach. If you plan to add support for new sites without changing the source code, use an external source to map URLs to regular expression patterns.
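
The external mapping mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the class name `ListingPatterns`, its two methods, and the tab-separated line format (`URL prefix`, a tab, then the pattern) are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

public static class ListingPatterns
{
    // Parses lines of the assumed form "<url-prefix><TAB><regex pattern>".
    // Blank lines, lines starting with '#', and lines without a tab are skipped.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == '#') continue;
            int tab = line.IndexOf('\t');
            if (tab <= 0) continue; // malformed line: no separator or empty prefix
            map[line.Substring(0, tab)] = line.Substring(tab + 1);
        }
        return map;
    }

    // Returns the pattern registered for the longest prefix that matches the URL.
    public static string GetPatternForUrl(IDictionary<string, string> map, string url)
    {
        string bestPrefix = null;
        foreach (string prefix in map.Keys)
        {
            if (url.StartsWith(prefix, StringComparison.Ordinal)
                && (bestPrefix == null || prefix.Length > bestPrefix.Length))
            {
                bestPrefix = prefix;
            }
        }
        if (bestPrefix == null)
        {
            throw new NotSupportedException("No listing pattern registered for " + url);
        }
        return map[bestPrefix];
    }
}
```

The lines could come from a configuration file, for example `ListingPatterns.Parse(File.ReadLines("listing-patterns.tsv"))`, where the file name is made up for this sketch. Supporting a new site then means adding one line to that file rather than recompiling.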

Example that prints the directory listing from http://www.ibiblio.org/pub/:

namespace Example
{
    using System;
    using System.Net;
    using System.IO;
    using System.Text.RegularExpressions;

    public class MyExample
    {
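        // Returns the regular expression used to parse the listing served at the given URL.
        // Each supported server needs its own pattern because listing markup is not standardized.
        public static string GetDirectoryListingRegexForUrl(string url)
        {
            if (url.Equals("http://www.ibiblio.org/pub/"))
            {
                // Quotes inside the string literal must be escaped.
                // Lazy quantifiers (.*?) keep each match confined to a single link.
                return "<a href=\".*?\">(?<name>.*?)</a>";
            }
            throw new NotSupportedException();
        }
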
        public static void Main(String[] args)
        {
            string url = "http://www.ibiblio.org/pub/";
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                // Read the whole listing page, then print the text of every matched link.
                string html = reader.ReadToEnd();
                Regex regex = new Regex(GetDirectoryListingRegexForUrl(url));
                foreach (Match match in regex.Matches(html))
                {
                    Console.WriteLine(match.Groups["name"].Value);
                }
            }

            // Keep the console window open until Enter is pressed.
            Console.ReadLine();
        }
    }
}
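
Since the pattern depends entirely on each server's markup, it may help to try it on a saved snippet before pointing it at a live server. The sketch below does that offline. The `ListingParser` class, the `ExtractNames` helper, and the sample HTML are all invented for illustration, and the pattern is a lazy-quantifier form of the link expression, which is an assumption about what suits a given server.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ListingParser
{
    // Returns the text captured by the pattern's "name" group for every match.
    public static List<string> ExtractNames(string html, string pattern)
    {
        var names = new List<string>();
        foreach (Match match in Regex.Matches(html, pattern))
        {
            names.Add(match.Groups["name"].Value);
        }
        return names;
    }

    public static void Main()
    {
        // Made-up sample markup, not copied from any real server.
        string html =
            "<a href=\"docs/\">docs/</a>\n" +
            "<a href=\"linux/\">linux/</a>\n" +
            "<a href=\"README\">README</a>\n";

        // Assumed pattern: lazy quantifiers so each match covers one link only.
        string pattern = "<a href=\".*?\">(?<name>.*?)</a>";

        foreach (string name in ExtractNames(html, pattern))
        {
            Console.WriteLine(name);
        }
    }
}
```

Run against the sample above, this prints `docs/`, `linux/` and `README`, one per line. Real listings often include extra links such as column sort headers or a parent directory entry, so the pattern or a post-filter usually needs adjusting per server.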
