Quickest way in C# to find a file in a directory with over 20,000 files


Question

I have a job that runs every night to pull xml files from a directory that has over 20,000 subfolders under the root. Here is what the structure looks like:

rootFolder/someFolder/someSubFolder/xml/myFile.xml
rootFolder/someFolder/someSubFolder1/xml/myFile1.xml
rootFolder/someFolder/someSubFolderN/xml/myFile2.xml
rootFolder/someFolder1
rootFolder/someFolderN

As shown above, the structure is always the same: a root folder, then two levels of subfolders, then an xml directory, and then the xml file. Only the names of the rootFolder and the xml directory are known to me.

The code below traverses through all the directories and is extremely slow. Any recommendations on how I can optimize the search especially if the directory structure is known?

string[] files = Directory.GetFiles(@"\\somenetworkpath\rootFolder", "*.xml", SearchOption.AllDirectories);

Solution

Rather than calling GetFiles and doing a brute-force search, you could most likely use GetDirectories: first get the list of first-level subfolders, loop through those directories, repeat the process for their subfolders, then look for the xml folder in each one, and finally search that folder for .xml files.

How much faster this is will vary, but finding the directories first and only THEN getting the files should help a lot!
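
The level-by-level approach described above could look roughly like the following. This is only a sketch, not tested against a real network share: the class name KnownDepthSearch and the method FindXmlFiles are made up for illustration, and it assumes the xml folder is literally named "xml" and always sits exactly two levels below the root, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (name invented for this sketch).
public static class KnownDepthSearch
{
    // Walks exactly two folder levels below root, then looks only inside
    // the known xml subfolder of each second-level folder.
    // Assumption: the xml folder name is known up front ("xml" by default).
    public static List<string> FindXmlFiles(string root, string xmlFolderName = "xml")
    {
        var results = new List<string>();

        // Level 1: rootFolder/someFolder
        foreach (string first in Directory.EnumerateDirectories(root))
        {
            // Level 2: rootFolder/someFolder/someSubFolder
            foreach (string second in Directory.EnumerateDirectories(first))
            {
                // The xml folder name is known, so build the path directly
                // instead of enumerating a third level.
                string xmlDir = Path.Combine(second, xmlFolderName);
                if (!Directory.Exists(xmlDir))
                    continue;

                // Only files directly inside the xml folder are collected.
                results.AddRange(
                    Directory.EnumerateFiles(xmlDir, "*.xml", SearchOption.TopDirectoryOnly));
            }
        }
        return results;
    }
}
```

It would be called as KnownDepthSearch.FindXmlFiles(@"\\somenetworkpath\rootFolder"). Building the xml path with Path.Combine skips one directory listing per subfolder, which may matter over a network path, though the actual gain would need to be measured.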

Update

Ok, I did a quick bit of testing and you can actually optimize it much further than I thought.

The following code snippet will search a directory structure and find ALL "xml" folders inside the entire directory tree.

using System;
using System.IO;

// Find every folder named "xml" anywhere under startPath.
string startPath = @"C:\Testing\Testing\bin\Debug";
string[] oDirectories = Directory.GetDirectories(startPath, "xml", SearchOption.AllDirectories);
Console.WriteLine(oDirectories.Length); // number of xml folders found
foreach (string oCurrent in oDirectories)
    Console.WriteLine(oCurrent);
Console.ReadLine(); // keep the console window open

If you drop that into a test console app you will see it output the results.

Now, once you have this, just look in each of the found directories for your .xml files.
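
That last step might be sketched as follows. The class name XmlFolderSearch and the method FindXmlFiles are hypothetical names chosen for this example; the sketch assumes the .xml files sit directly inside each xml folder, as in the structure from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (name invented for this sketch).
public static class XmlFolderSearch
{
    // First finds every folder named "xml" under startPath, then lists
    // the .xml files directly inside each of those folders.
    public static List<string> FindXmlFiles(string startPath)
    {
        var files = new List<string>();
        string[] xmlDirs = Directory.GetDirectories(startPath, "xml", SearchOption.AllDirectories);

        foreach (string xmlDir in xmlDirs)
        {
            // TopDirectoryOnly: do not descend below the xml folder.
            files.AddRange(Directory.GetFiles(xmlDir, "*.xml", SearchOption.TopDirectoryOnly));
        }
        return files;
    }
}
```

For the job in the question, the call would be XmlFolderSearch.FindXmlFiles(@"\\somenetworkpath\rootFolder").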
