How to efficiently write a large text file in C#?

Question

I am creating a method in C# which generates a text file for a Google Product Feed. The feed will contain upwards of 30,000 records and the text file currently weighs in at ~7 MB.

Here's the code I am currently using (some lines removed for brevity's sake).

public static void GenerateTextFile(string filePath) {

  var sb = new StringBuilder(1000);
  sb.Append("availability").Append("\t");
  sb.Append("condition").Append("\t");
  sb.Append("description").Append("\t");
  // repetitive code hidden for brevity ...
  sb.Append(Environment.NewLine);

  var items = inventoryRepo.GetItemsForSale();

  foreach (var p in items) {
    sb.Append("in stock").Append("\t");
    sb.Append("used").Append("\t");
    sb.Append(p.Description).Append("\t");
    // repetitive code hidden for brevity ...
    sb.AppendLine();
  }

  using (StreamWriter outfile = new StreamWriter(filePath)) {
    result.Append("Writing text file to disk.").AppendLine();
    outfile.Write(sb.ToString());
  }
}

I am wondering if StringBuilder is the right tool for the job. Would there be performance gains if I used a TextWriter instead?

I don't know a ton about IO performance so any help or general improvements would be appreciated. Thanks.

Recommended answer

File I/O operations are generally well optimized in modern operating systems. You shouldn't try to assemble the entire string for the file in memory ... just write it out piece by piece. The FileStream will take care of buffering and other performance considerations.

You can make this change easily by moving:

using (StreamWriter outfile = new StreamWriter(filePath)) {

to the top of the function, and getting rid of the StringBuilder, writing directly to the file instead.
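
A minimal sketch of that change might look like the following. It assumes a hypothetical `Item` type with a `Description` property and takes the items as a parameter instead of calling `inventoryRepo`; neither is in the original code, and the columns elided in the question stay elided here.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for whatever inventoryRepo.GetItemsForSale() returns.
public class Item
{
    public string Description { get; set; }
}

public static class FeedWriter
{
    public static void GenerateTextFile(string filePath, IEnumerable<Item> items)
    {
        // Open the writer first; it buffers internally and flushes when disposed.
        using (var outfile = new StreamWriter(filePath))
        {
            // Header row, written straight to the file.
            outfile.Write("availability\t");
            outfile.Write("condition\t");
            outfile.Write("description");
            // remaining columns omitted, as in the question ...
            outfile.WriteLine();

            // One row per item; nothing larger than a single field is held in memory.
            foreach (var p in items)
            {
                outfile.Write("in stock\t");
                outfile.Write("used\t");
                outfile.Write(p.Description);
                outfile.WriteLine();
            }
        }
    }
}
```

With this shape, memory use depends on the writer's buffer and not on the number of records.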

There are several reasons why you should avoid building up large strings in memory:

  1. It can actually perform worse, because the StringBuilder has to increase its capacity as you write to it, resulting in reallocation and copying of memory.
  2. It may require more memory than is physically available, which may result in the use of virtual memory (the swap file), which is much slower than RAM.
  3. For truly large files (> 2 GB) you will run out of address space (on 32-bit platforms) and the operation will never complete.
  4. To write the StringBuilder contents to a file you have to use ToString() which effectively doubles the memory consumption of the process since both copies must be in memory for a period of time. This operation may also fail if your address space is sufficiently fragmented, such that a single contiguous block of memory cannot be allocated.
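
If a StringBuilder is still convenient for assembling each row, one way to sidestep the points above is to keep it small: build a single row, write it out, and clear the builder before the next row. The sketch below works under that assumption; the `RowWriter.WriteRows` helper and its string-array rows are hypothetical and not part of the question's code.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RowWriter
{
    // Hypothetical helper: each row is an array of already-formatted field values.
    public static void WriteRows(TextWriter writer, IEnumerable<string[]> rows)
    {
        var sb = new StringBuilder(256); // sized for one row, not the whole file

        foreach (var fields in rows)
        {
            for (int i = 0; i < fields.Length; i++)
            {
                if (i > 0) sb.Append('\t'); // tab between fields, none trailing
                sb.Append(fields[i]);
            }

            writer.WriteLine(sb.ToString()); // ToString() copies one row only
            sb.Clear();                      // reset the builder for the next row
        }
    }
}
```

Passing a `StreamWriter` opened on the target path as `writer` sends each row to the file as it is produced, so the ToString() copy never grows beyond a single row.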
