Is there a faster way than this to find all the files in a directory and all subdirectories?


Question

I'm writing a program that needs to search a directory and all its subdirectories for files that have a certain extension. It will be used on both local and network drives, so performance is a bit of an issue.

Here's the recursive method I'm using now:

private void GetFileList(string fileSearchPattern, string rootFolderPath, List<FileInfo> files)
{
    DirectoryInfo di = new DirectoryInfo(rootFolderPath);

    FileInfo[] fiArr = di.GetFiles(fileSearchPattern, SearchOption.TopDirectoryOnly);
    files.AddRange(fiArr);

    DirectoryInfo[] diArr = di.GetDirectories();

    foreach (DirectoryInfo info in diArr)
    {
        GetFileList(fileSearchPattern, info.FullName, files);
    }
}

I could set the SearchOption to AllDirectories and drop the recursion, but in the future I'll want to insert some code that tells the user which folder is currently being scanned.

Although I'm building a list of FileInfo objects right now, all I really care about is the file paths. I'll have an existing list of files that I want to compare against the new list to see which files were added or deleted. Is there a faster way to generate this list of file paths? Is there anything I can do to optimize the search when the files are on a shared network drive?
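
To make the added/deleted comparison concrete, here is a minimal sketch that assumes both lists hold full paths as strings. The class and method names (FileListDiff, PathsOnlyIn) are made up for illustration, and the case-insensitive comparison is an assumption that suits Windows paths:

```csharp
using System;
using System.Collections.Generic;

public static class FileListDiff
{
    // Returns the paths that appear in 'current' but not in 'previous'.
    // Call it with the arguments swapped to get the deleted paths.
    public static List<string> PathsOnlyIn(IEnumerable<string> current, IEnumerable<string> previous)
    {
        // Assumption: paths differing only in case refer to the same file.
        var known = new HashSet<string>(previous, StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();
        foreach (string path in current)
        {
            if (!known.Contains(path))
            {
                result.Add(path);
            }
        }
        return result;
    }
}
```

With this, PathsOnlyIn(newList, oldList) gives the added files and PathsOnlyIn(oldList, newList) gives the deleted ones. The HashSet keeps each lookup close to constant time, so the comparison should not add much on top of the scan itself.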


Update 1

I tried creating a non-recursive method that does the same thing by first finding all the subdirectories and then iteratively scanning each directory for files. Here's the method:

public static List<FileInfo> GetFileList(string fileSearchPattern, string rootFolderPath)
{
    DirectoryInfo rootDir = new DirectoryInfo(rootFolderPath);

    List<DirectoryInfo> dirList = new List<DirectoryInfo>(rootDir.GetDirectories("*", SearchOption.AllDirectories));
    dirList.Add(rootDir);

    List<FileInfo> fileList = new List<FileInfo>();

    foreach (DirectoryInfo dir in dirList)
    {
        fileList.AddRange(dir.GetFiles(fileSearchPattern, SearchOption.TopDirectoryOnly));
    }

    return fileList;
}


Update 2

All right, I've run some tests on a local folder and a remote folder, both of which contain a lot of files (~1200). Here are the methods I tested; the results are below.

  • GetFileListA(): Non-recursive solution from the update above. I think it's equivalent to Jay's solution.
  • GetFileListB(): Recursive method from the original question.
  • GetFileListC(): Gets all the directories with the static Directory.GetDirectories() method, then gets all the file paths with the static Directory.GetFiles() method. Populates and returns a List.
  • GetFileListD(): Marc Gravell's solution, which uses a queue and returns an IEnumerable. I populated a List from the resulting IEnumerable.
  • DirectoryInfo.GetFiles: No additional method created. Instantiated a DirectoryInfo from the root folder path and called GetFiles with SearchOption.AllDirectories.
  • Directory.GetFiles: No additional method created. Called the static GetFiles method of Directory with SearchOption.AllDirectories.
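
The code for GetFileListC() was not posted, so the following is only a guess at what it might look like from the description in the list: static Directory calls returning path strings rather than FileInfo objects. The wrapper class name is hypothetical:

```csharp
using System.Collections.Generic;
using System.IO;

public static class StaticDirectoryScan
{
    // Guessed reconstruction of GetFileListC(): collect every directory first,
    // then ask each one for its matching files.
    public static List<string> GetFileListC(string fileSearchPattern, string rootFolderPath)
    {
        var dirList = new List<string>(
            Directory.GetDirectories(rootFolderPath, "*", SearchOption.AllDirectories));
        dirList.Add(rootFolderPath);

        var fileList = new List<string>();
        foreach (string dir in dirList)
        {
            fileList.AddRange(Directory.GetFiles(dir, fileSearchPattern));
        }
        return fileList;
    }
}
```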

Method                       Local Folder       Remote Folder
GetFileListA()               00:00.0781235      05:22.9000502
GetFileListB()               00:00.0624988      03:43.5425829
GetFileListC()               00:00.0624988      05:19.7282361
GetFileListD()               00:00.0468741      03:38.1208120
DirectoryInfo.GetFiles       00:00.0468741      03:45.4644210
Directory.GetFiles           00:00.0312494      03:48.0737459

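So it looks like Marc's solution (GetFileListD) is the fastest on the remote folder, although the four fastest methods there are within about 10 seconds of each other. On the local folder the differences are only a few hundredths of a second.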
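
For anyone repeating the measurements, a timing helper along these lines would do. This is a sketch, not the harness used for the table above; ScanTimer is a made-up name, and Stopwatch is the standard System.Diagnostics class:

```csharp
using System;
using System.Diagnostics;

public static class ScanTimer
{
    // Runs 'scan' once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Action scan)
    {
        Stopwatch watch = Stopwatch.StartNew();
        scan();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

A call might look like ScanTimer.Time(() => new List<string>(GetFileList("*.txt", path))). Copying into a List matters for the iterator version, because an iterator block does no work until it is enumerated. It is also worth running each method more than once and in a different order, since file-system or network caching may favour whichever method runs later.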

Solution

Try this iterator block version that avoids recursion and the Info objects:

public static IEnumerable<string> GetFileList(string fileSearchPattern, string rootFolderPath)
{
    // Folders still waiting to be scanned (breadth-first order).
    Queue<string> pending = new Queue<string>();
    pending.Enqueue(rootFolderPath);
    string[] tmp;
    while (pending.Count > 0)
    {
        // Reuse the parameter as "the folder currently being scanned".
        rootFolderPath = pending.Dequeue();
        // Yield the matching files of this folder as plain path strings.
        tmp = Directory.GetFiles(rootFolderPath, fileSearchPattern);
        for (int i = 0; i < tmp.Length; i++)
        {
            yield return tmp[i];
        }
        // Queue the immediate subfolders for a later pass.
        tmp = Directory.GetDirectories(rootFolderPath);
        for (int i = 0; i < tmp.Length; i++)
        {
            pending.Enqueue(tmp[i]);
        }
    }
}

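Note also that .NET 4.0 has built-in iterator-block versions (EnumerateFiles, http://msdn.microsoft.com/en-us/library/dd383571%28VS.100%29.aspx, and EnumerateFileSystemEntries, http://msdn.microsoft.com/en-us/library/dd383459%28VS.100%29.aspx) that may be faster (more direct access to the file system; fewer arrays).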
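
As a rough illustration of how the 4.0 methods could slot into the queue approach, here is a variant that also reports each folder as it is scanned, which is what the question wanted to add later. It is a sketch under assumptions: the FileScanner class, the EnumerateFileList name, and the onFolder callback are invented for this example, while Directory.EnumerateFiles and Directory.EnumerateDirectories are the real .NET 4.0 APIs:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileScanner
{
    // Lazily yields matching file paths; 'onFolder' (may be null) is called
    // with each folder path just before that folder is scanned.
    public static IEnumerable<string> EnumerateFileList(
        string fileSearchPattern, string rootFolderPath, Action<string> onFolder)
    {
        var pending = new Queue<string>();
        pending.Enqueue(rootFolderPath);
        while (pending.Count > 0)
        {
            string folder = pending.Dequeue();
            if (onFolder != null)
            {
                onFolder(folder);
            }
            // Streams entries instead of building a string[] per folder.
            foreach (string file in Directory.EnumerateFiles(folder, fileSearchPattern))
            {
                yield return file;
            }
            foreach (string sub in Directory.EnumerateDirectories(folder))
            {
                pending.Enqueue(sub);
            }
        }
    }
}
```

Usage could be as simple as foreach (string path in FileScanner.EnumerateFileList("*.txt", root, Console.WriteLine)) { ... }. The sketch does not handle folders that cannot be read; an UnauthorizedAccessException from one folder would end the enumeration. Whether this is actually faster over a network share would need measuring.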
