Fastest way to create files in C#
Question
I'm running a program to benchmark how fast I can find and iterate over all the files in a folder containing a large number of files. The slowest part of the process is creating the 1 million plus files. I'm using a pretty naive method to create the files at the moment:
Console.Write("Creating {0:N0} file(s) of size {1:N0} bytes... ",
    options.FileCount, options.FileSize);
var createTimer = Stopwatch.StartNew();
var fileNames = new List<string>();
for (long i = 0; i < options.FileCount; i++)
{
    var filename = Path.Combine(options.Directory.FullName,
        CreateFilename(i, options.FileCount));
    using (var file = new FileStream(filename, FileMode.CreateNew,
        FileAccess.Write, FileShare.None, 4096,
        FileOptions.WriteThrough))
    {
        // I have an option to write some data to files, but it's not being used.
        // That's why there's a using here.
    }
    fileNames.Add(filename);
}
createTimer.Stop();
Console.WriteLine("Done.");
// Other code appears here.....
Console.WriteLine("Time to CreateFiles: {0:N3}sec ({1:N2} files/sec, 1 in {2:N4}ms)"
    , createTimer.Elapsed.TotalSeconds
    , (double)total / createTimer.Elapsed.TotalSeconds
    , createTimer.Elapsed.TotalMilliseconds / (double)options.FileCount);
Output:
Creating 1,000,000 file(s) of size 0 bytes... Done.
Time to CreateFiles: 9,182.283sec (1,089.05 files/sec, 1 in 9.1823ms)
Is there anything obviously better than this? I'm looking to test several orders of magnitude more than 1 million files, and it takes a day to create the files!
I haven't tried any sort of parallelism, optimising any file system options, or changing the order of file creation.
For completeness, here are the contents of CreateFilename():
public static string CreateFilename(long i, long totalFiles)
{
    if (totalFiles < 0)
        throw new ArgumentOutOfRangeException("totalFiles",
            totalFiles, "totalFiles must be positive");
    // This tries to keep filenames to the 8.3 format as much as possible.
    if (totalFiles < 99999999)
        // No extension.
        return String.Format("{0:00000000}", i);
    else if (totalFiles >= 100000000 && totalFiles < 9999999999)
    {
        // Extend numbers into extension.
        long rem = 0;
        long div = Math.DivRem(i, 1000, out rem);
        return String.Format("{0:00000000}", div) + "." +
            String.Format("{0:000}", rem);
    }
    else
        // Doesn't fit in 8.3, so just tostring the long.
        return i.ToString();
}
Update
Tried to parallelise using Parallel.For(), as per StriplingWarrior's suggestion. Results: about 30 threads thrashing my disk and a net slowdown!
var fileNames = new ConcurrentBag<string>();
var opts = new ParallelOptions();
opts.MaxDegreeOfParallelism = 1; // 1 thread turns out to be fastest.
Parallel.For(0L, options.FileCount, opts,
    () => new { Files = new List<string>() },
    (i, parState, state) =>
    {
        var filename = Path.Combine(options.Directory.FullName,
            CreateFilename(i, options.FileCount));
        using (var file = new FileStream(filename, FileMode.CreateNew
            , FileAccess.Write, FileShare.None
            , 4096, FileOptions.WriteThrough))
        {
        }
        fileNames.Add(filename);
        return state;
    },
    state =>
    {
        foreach (var f in state.Files)
        {
            fileNames.Add(f);
        }
    });
createTimer.Stop();
Console.WriteLine("Done.");
Found that changing the FileOptions in the FileStream improved perf by ~50%. It seems I was turning off any write cache.
new FileStream(filename, FileMode.CreateNew,
    FileAccess.Write, FileShare.None,
    4096, FileOptions.None)
Results:
Creating 10,000 file(s) of size 0 bytes... Done.
Time to CreateFiles: 12.390sec (8,071.05 files/sec, 1 in 1.2390ms)
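To check the effect of FileOptions on your own hardware, a minimal sketch along the following lines may help. The class name FileOptionsBenchmark, the method CreateEmptyFiles, the temp-directory layout and the sample size of 1,000 files are all assumptions made for illustration and are not part of the original program. Absolute numbers will vary with the disk, the file system and any antivirus scanning.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper for illustration; not part of the original benchmark.
public static class FileOptionsBenchmark
{
    // Creates `count` empty files in `directory` using the given FileOptions
    // and returns the elapsed wall-clock time.
    public static TimeSpan CreateEmptyFiles(string directory, int count, FileOptions fileOptions)
    {
        Directory.CreateDirectory(directory);
        var timer = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            string filename = Path.Combine(directory, i.ToString("00000000"));
            // The empty using block disposes the stream at once, releasing the handle.
            using (new FileStream(filename, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, fileOptions))
            {
            }
        }
        timer.Stop();
        return timer.Elapsed;
    }

    public static void Main()
    {
        const int count = 1000; // assumed sample size; raise it for more stable numbers
        string root = Path.Combine(Path.GetTempPath(),
            "create-bench-" + Guid.NewGuid().ToString("N"));
        try
        {
            foreach (FileOptions fileOptions in new[] { FileOptions.WriteThrough, FileOptions.None })
            {
                // One subdirectory per option so the two runs do not collide.
                string dir = Path.Combine(root, fileOptions.ToString());
                TimeSpan elapsed = CreateEmptyFiles(dir, count, fileOptions);
                Console.WriteLine("{0,-12} {1:N3}sec ({2:N2} files/sec)",
                    fileOptions, elapsed.TotalSeconds, count / elapsed.TotalSeconds);
            }
        }
        finally
        {
            // Clean up the files created by the benchmark.
            if (Directory.Exists(root))
                Directory.Delete(root, true);
        }
    }
}
```

Running each option in its own empty directory keeps the comparison fair, since creation cost can grow with the number of entries already in a folder.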
Other ideas are still welcome.
Recommended answer
The fastest way I found is a simple loop around File.Create():
IEnumerable<string> filenames = GetFilenames();
foreach (var filename in filenames)
{
    // File.Create returns an open FileStream; dispose it to release the handle.
    File.Create(filename).Dispose();
}
Which is roughly equivalent to (what I'm actually using in code):
IEnumerable<string> filenames = GetFilenames();
foreach (var filename in filenames)
{
    // Dispose the stream straight away so the file handle is released.
    new FileStream(filename, FileMode.CreateNew,
        FileAccess.Write, FileShare.None,
        4096, FileOptions.None).Dispose();
}
And if you actually want to write something to the file:
IEnumerable<string> filenames = GetFilenames();
foreach (var filename in filenames)
{
    using (var fs = new FileStream(filename, FileMode.CreateNew,
        FileAccess.Write, FileShare.None,
        4096, FileOptions.None))
    {
        // Write something to your file.
    }
}
Things that don't seem to help:
- Parallelism in the form of Parallel.ForEach() or Parallel.For(). This produces a net slowdown, which gets worse as the number of threads increases.
- According to StriplingWarrior, an SSD. I haven't tested this myself (yet), but I speculate this may be because there are so many small writes.
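
The parallelism observation is easy to re-check on a given machine. The sketch below is one possible way to do it, not the code used for the measurements above: the class name ParallelCreateBenchmark, the method CreateEmptyFiles, the degrees 1, 2, 4 and 8, and the sample size of 1,000 files are all assumptions chosen for illustration. Results depend heavily on the disk and file system, so a slowdown is not guaranteed on every setup.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the original benchmark.
public static class ParallelCreateBenchmark
{
    // Creates `count` empty files in `directory` with at most
    // `maxDegreeOfParallelism` concurrent workers and returns the elapsed time.
    public static TimeSpan CreateEmptyFiles(string directory, int count, int maxDegreeOfParallelism)
    {
        Directory.CreateDirectory(directory);
        var parallelOptions = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };
        var timer = Stopwatch.StartNew();
        Parallel.For(0, count, parallelOptions, i =>
        {
            string filename = Path.Combine(directory, i.ToString("00000000"));
            // The empty using block disposes the stream at once, releasing the handle.
            using (new FileStream(filename, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, FileOptions.None))
            {
            }
        });
        timer.Stop();
        return timer.Elapsed;
    }

    public static void Main()
    {
        const int count = 1000; // assumed sample size; raise it for more stable numbers
        string root = Path.Combine(Path.GetTempPath(),
            "parallel-bench-" + Guid.NewGuid().ToString("N"));
        try
        {
            foreach (int degree in new[] { 1, 2, 4, 8 })
            {
                // One subdirectory per run so file names never collide.
                string dir = Path.Combine(root, "dop-" + degree);
                TimeSpan elapsed = CreateEmptyFiles(dir, count, degree);
                Console.WriteLine("{0} thread(s): {1:N3}sec ({2:N2} files/sec)",
                    degree, elapsed.TotalSeconds, count / elapsed.TotalSeconds);
            }
        }
        finally
        {
            // Clean up the files created by the benchmark.
            if (Directory.Exists(root))
                Directory.Delete(root, true);
        }
    }
}
```

Capping MaxDegreeOfParallelism explicitly avoids the situation described in the update, where the default scheduler grew to about 30 threads.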