C# Folder Search by Using Regular Expression


Question


What is the most efficient way to get a list of folders from a top level directory that match a certain regular expression? I am currently just recursively iterating over the subfolders to see if they match the regular expression, then if they do, I am grabbing the file name with the directory path.


This search currently takes approximately 50 minutes with this method because of the number of folders in this directory.

private void ProcessFiles(string path, string searchPattern)
{
    string pattern = @"^(\\\\server\\folder1\\subfolder\\(MENS|WOMENS|MENS\sDROPBOX|WOMENS\sDROPBOX)\\((((COLOR\sCHIPS)|(ALL\sMENS\sCOLORS)))|((\d{4})\\(\w+)\\(FINAL\sART|FINAL\sARTWORK)\\(\d{3}))))$";
    // Build the regex once per call instead of once per subdirectory.
    var dirMatch = new Regex(pattern, RegexOptions.IgnoreCase);
    DirectoryInfo di = new DirectoryInfo(path);
    try
    {
        Debug.WriteLine("I'm in " + di.FullName);
        if (di.Exists)
        {
            // GetDirectories returns only after the whole array has been built.
            DirectoryInfo[] dirs = di.GetDirectories("*", SearchOption.TopDirectoryOnly);
            foreach (DirectoryInfo d in dirs)
            {
                string[] splitPath = d.FullName.Split('\\');

                if (dirMatch.IsMatch(d.FullName))
                {
                    Debug.WriteLine("---Processing Directory: " + d.FullName + " ---");
                    FileInfo[] files = d.GetFiles(searchPattern, SearchOption.TopDirectoryOnly);
                    AddColor(files, splitPath);
                }
                // Recurse manually into every subdirectory.
                ProcessFiles(d.FullName, searchPattern);
            }
        }
    }
    catch (Exception e)
    {
        // Log the failure instead of swallowing it silently.
        Debug.WriteLine("Skipping " + path + ": " + e.Message);
    }
}


Recommended answer


I would use something like the following. There is no need for manual recursion; let the BCL do that for you:

// Verbatim string (@"...") so that \s, \d and \\ reach the regex engine unescaped.
Regex re = new Regex(@"(MENS|WOMENS|MENS\sDROPBOX|WOMENS\sDROPBOX)\\((((COLOR\sCHIPS)|(ALL\sMENS\sCOLORS)))|((\d{4})\\(\w+)\\(FINAL\sART|FINAL\sARTWORK)\\(\d{3})))");
// "dv_*" only returns directories whose own name starts with "dv_"; use "*" to test every directory.
var dirs = from dir in
           Directory.EnumerateDirectories(dirPath, "dv_*",
           SearchOption.AllDirectories)
           where re.IsMatch(dir)
           select dir;
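
For reference, here is a minimal self-contained sketch of the same idea. The class and method names (`FolderSearch`, `FindMatchingDirectories`) are hypothetical and not part of the original answer, and the search pattern is assumed to be "*" so that every directory name is tested against the regex.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class FolderSearch
{
    // Hypothetical helper: returns every directory below root whose
    // full path matches the supplied regular expression.
    public static List<string> FindMatchingDirectories(string root, Regex pattern)
    {
        // SearchOption.AllDirectories makes the BCL walk the tree,
        // so no hand-written recursion is needed.
        return Directory
            .EnumerateDirectories(root, "*", SearchOption.AllDirectories)
            .Where(dir => pattern.IsMatch(dir))
            .ToList();
    }
}
```

The regex is passed in so that it is constructed only once by the caller, for example with `RegexOptions.IgnoreCase | RegexOptions.Compiled`, rather than once per directory.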


If it still takes 50 minutes, the bottleneck is a slow drive, a network share, or something similar.


Edit: you edited your question, and it clearly shows that you're running your code against a UNC path. That is extremely slow; if you need speed, run the code on the server itself.


Note: there's a big difference between the behavior of GetDirectories (which you use) and EnumerateDirectories. Microsoft's documentation says this about it:


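The EnumerateDirectories and GetDirectories methods differ as follows: when you use EnumerateDirectories, you can start enumerating the collection of names before the whole collection is returned; when you use GetDirectories, you must wait for the whole array of names to be returned before you can access the array. Therefore, when you are working with many files and directories, EnumerateDirectories can be more efficient.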
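
To illustrate the difference the documentation describes, here is a small sketch that stops at the first match. The names `LazyFolderSearch` and `FirstMatch` are hypothetical. Because EnumerateDirectories yields paths lazily, returning from the loop ends the directory walk early, whereas GetDirectories would have built the complete array before the loop even started.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class LazyFolderSearch
{
    // Hypothetical helper: returns the first directory below root whose
    // full path matches the pattern, or null if none matches.
    public static string FirstMatch(string root, Regex pattern)
    {
        foreach (string dir in Directory.EnumerateDirectories(
                     root, "*", SearchOption.AllDirectories))
        {
            if (pattern.IsMatch(dir))
            {
                // Returning here abandons the enumerator, so the rest
                // of the tree is never listed.
                return dir;
            }
        }
        return null;
    }
}
```

The same laziness lets matching folders be processed as they are discovered instead of after the entire tree has been listed.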


Regarding your question: it will go through all directories it has access to. Don't let it start on a directory you don't have access to, because that will raise an exception.
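
If access problems are a concern, one possible approach is the `EnumerationOptions` overload, sketched below. This is an assumption beyond the original answer: it requires .NET Core 2.1 or later (it is not available on .NET Framework), and the names `SafeFolderSearch` and `EnumerateAccessible` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class SafeFolderSearch
{
    // Hypothetical helper: lazily yields matching directories and skips
    // subdirectories that cannot be opened instead of throwing.
    public static IEnumerable<string> EnumerateAccessible(string root, Regex pattern)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true, // walk the whole tree
            IgnoreInaccessible = true,    // skip access-denied entries
            AttributesToSkip = 0          // the default skips hidden and system entries
        };

        return Directory
            .EnumerateDirectories(root, "*", options)
            .Where(dir => pattern.IsMatch(dir));
    }
}
```

The root directory itself must still exist and be readable; only entries encountered during the walk are skipped.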
