Quickest way in C# to find a file in a directory with over 20,000 files


Question

I have a job that runs every night to pull xml files from a directory that has over 20,000 subfolders under the root. Here is what the structure looks like:

rootFolder/someFolder/someSubFolder/xml/myFile.xml
rootFolder/someFolder/someSubFolder1/xml/myFile1.xml
rootFolder/someFolder/someSubFolderN/xml/myFile2.xml
rootFolder/someFolder1
rootFolder/someFolderN

So looking at the above, the structure is always the same: a root folder, then two levels of subfolders, then an xml directory, and then the xml file. Only the names of the root folder and the xml directory are known to me.

The code below traverses through all the directories and is extremely slow. Any recommendations on how I can optimize the search especially if the directory structure is known?

string[] files = Directory.GetFiles(@"\\somenetworkpath\rootFolder", "*.xml", SearchOption.AllDirectories);

Recommended answer

Rather than doing a brute-force search with GetFiles, you could most likely use GetDirectories: first get the list of first-level subfolders, loop through them, repeat the process for each of their subfolders, then look for the xml folder in each one, and finally search that folder for .xml files.
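The level-by-level walk described above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name XmlFinder, the method name FindXmlFiles, and the xmlDirName parameter are hypothetical, and it assumes the layout from the question (root, two folder levels, then a directory with a known name). It uses Directory.EnumerateDirectories and Directory.EnumerateFiles (available from .NET Framework 4) in place of GetDirectories and GetFiles so that results are streamed instead of being collected into arrays first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, assuming the layout root/level1/level2/xml/*.xml.
public static class XmlFinder
{
    // Walks exactly two folder levels below root, then checks for a child
    // directory named xmlDirName and yields the *.xml files directly inside it.
    public static IEnumerable<string> FindXmlFiles(string root, string xmlDirName = "xml")
    {
        foreach (string level1 in Directory.EnumerateDirectories(root))
        {
            foreach (string level2 in Directory.EnumerateDirectories(level1))
            {
                // The xml directory name is known, so build the path directly
                // instead of enumerating the third level.
                string xmlDir = Path.Combine(level2, xmlDirName);
                if (!Directory.Exists(xmlDir))
                    continue;

                foreach (string file in Directory.EnumerateFiles(xmlDir, "*.xml"))
                    yield return file;
            }
        }
    }
}
```

A caller would pass the root path from the question, for example XmlFinder.FindXmlFiles(@"\\somenetworkpath\rootFolder"), and iterate over the result. Whether this beats a recursive search on a given network share would have to be measured.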

As for performance, the speed of this will vary, but searching for directories first and only THEN getting to the files should help a lot!

Update

OK, I did a quick bit of testing, and you can actually optimize it much further than I thought.

The following code snippet will search a directory structure and find ALL "xml" folders inside the entire directory tree.

// Requires: using System; using System.IO;
string startPath = @"C:\Testing\Testing\bin\Debug";
// Find every directory named "xml" anywhere under startPath.
string[] oDirectories = Directory.GetDirectories(startPath, "xml", SearchOption.AllDirectories);
Console.WriteLine(oDirectories.Length);
foreach (string oCurrent in oDirectories)
    Console.WriteLine(oCurrent);
Console.ReadLine(); // keep the console window open

If you drop that into a test console app you will see it output the results.

Now, once you have this, just look in each of the found directories for your .xml files.
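That last step might look something like the sketch below. The class name XmlFolderScan and the method name CollectXmlFiles are hypothetical. It assumes, as in the question, that the .xml files sit directly inside each "xml" directory, so only the top level of each one is searched.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: finds every "xml" directory, then lists its *.xml files.
public static class XmlFolderScan
{
    public static List<string> CollectXmlFiles(string startPath)
    {
        var files = new List<string>();

        // Same directory search as in the snippet from the answer.
        string[] xmlDirs = Directory.GetDirectories(startPath, "xml", SearchOption.AllDirectories);

        foreach (string xmlDir in xmlDirs)
        {
            // Assumes the files are not nested any deeper than the xml directory itself.
            files.AddRange(Directory.GetFiles(xmlDir, "*.xml", SearchOption.TopDirectoryOnly));
        }

        return files;
    }
}
```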
