System.IO.Compression.ZipArchive memory management


Question

In .NET 4.5, the System.IO.Compression.ZipArchive class received some updates.

According to this article (http://msdn.microsoft.com/en-us/magazine/jj133817.aspx), with the updated class "typical operations don't require reading the entire archive into memory".

To test this, I tried compressing 10 files of 200 MB each.

This works well when creating a new zip archive with the following code (memory usage stays low throughout the process):

for (int directoryGroupIndex = 0; directoryGroupIndex < directoryGroups.Count; directoryGroupIndex++)
{
  String directoryGroupKey = directoryGroups.Keys.ElementAt(directoryGroupIndex);
  FileInfo[] directoryGroup = directoryGroups[directoryGroupKey];

  String archiveFileName = String.Format("Readed Logfiles{0}", archiveFileExtension);
  String archiveFileFullName = Path.Combine(directoryGroupKey, archiveFileName);
  FileInfo archiveFile = new FileInfo(archiveFileFullName);


  using (FileStream archiveFileStream = new FileStream(archiveFile.FullName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Read))
  using (ZipArchive archive = new ZipArchive(archiveFileStream, ZipArchiveMode.Create, false))
  {
    for (int directoryGroupFileIndex = 0; directoryGroupFileIndex < directoryGroup.Length; directoryGroupFileIndex++)
    {
      FileInfo file = directoryGroup[directoryGroupFileIndex];
      String archiveEntryName = file.Name;
      String archiveEntryPath = DateTime.Now.ToString("yyyy-MM-dd");
      String archiveEntryFullName = Path.Combine(archiveEntryPath, archiveEntryName);

      ZipArchiveEntry archiveEntry = archive.CreateEntryFromFile(file.FullName, archiveEntryFullName, CompressionLevel.Optimal);
    }
  }              
}

Now I want to add new entries to this archive. I left my code as it is and ran it again (with new files in the root directory). The documentation for ZipArchiveMode.Create says "Only creating new archive entries is permitted", which is all I want, so my code should be fine.

The result is:

  1. The file table inside the archive is overwritten (only the new files are listed).

  2. The archive file has grown in size (as if the old entries were still in there).

  3. The archive is corrupted. You can open it, but you can't decompress the content.

If I change the mode to ZipArchiveMode.Update, it works as expected, but only with small files. Files as large as mine throw an out-of-memory exception, because the complete archive is loaded into memory.

My question is: am I doing something wrong, is this a bug, or is it a design flaw?

Recommended answer

The code you've written causes the ZipArchive class to write a whole new archive into the file that already holds your previous one, which of course corrupts the file.

The way to do what you want is to copy the entries of the original archive into a new file as you create it, add the new entries, and then replace the original with the new file. For example:

// Build the new archive in a temporary file so the original stays intact until the end.
string tempFile = Path.GetTempFileName();

// archiveFile is the FileInfo of the existing archive, as in the question's code.
// File.Open takes a path, so pass archiveFile.FullName rather than a stream.
using (ZipArchive original =
    new ZipArchive(File.Open(archiveFile.FullName, FileMode.Open), ZipArchiveMode.Read))
using (ZipArchive newArchive =
    new ZipArchive(File.Open(tempFile, FileMode.Create), ZipArchiveMode.Create))
{
    // Copy the existing entries one at a time, streaming each through a small buffer.
    foreach (ZipArchiveEntry entry in original.Entries)
    {
        ZipArchiveEntry newEntry = newArchive.CreateEntry(entry.FullName);

        using (Stream source = entry.Open())
        using (Stream destination = newEntry.Open())
        {
            source.CopyTo(destination);
        }
    }

    // Add the new files, exactly as in the original loop.
    for (int directoryGroupFileIndex = 0;
            directoryGroupFileIndex < directoryGroup.Length;
            directoryGroupFileIndex++)
    {
        FileInfo file = directoryGroup[directoryGroupFileIndex];
        String archiveEntryName = file.Name;
        String archiveEntryPath = DateTime.Now.ToString("yyyy-MM-dd");
        String archiveEntryFullName = Path.Combine(archiveEntryPath, archiveEntryName);

        ZipArchiveEntry archiveEntry = newArchive.CreateEntryFromFile(
            file.FullName, archiveEntryFullName, CompressionLevel.Optimal);
    }
}

// Both archives are closed here, so the original can be swapped for the new file.
File.Delete(archiveFile.FullName);
File.Move(tempFile, archiveFile.FullName);

Note that this isn't actually going to be slower than ZipArchiveMode.Update. When you use update mode, the ZipArchive class reads the entire archive into memory (as you noted), and then, when you close it, it recompresses everything and writes it back out.

The code above does essentially the same computation, but uses the disk as intermediate storage instead of memory.
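
For reuse, the copy-then-replace approach can be wrapped in a small helper. The following is only a sketch under stated assumptions: the class name ZipAppend, the method AppendFiles, its parameters, and the ".tmp" suffix are hypothetical names introduced here, not part of the original answer or of the framework. It writes the temporary file next to the archive (rather than in the system temp directory) so the final move stays on one volume, and it also carries over each entry's LastWriteTime.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZipAppend
{
    // Hypothetical helper: rebuilds the archive at archivePath so that it
    // contains all of its existing entries plus the files in filesToAdd,
    // which are stored under entryFolder inside the archive.
    public static void AppendFiles(
        string archivePath, IEnumerable<string> filesToAdd, string entryFolder)
    {
        // Assumption: a sibling ".tmp" file is acceptable as scratch space.
        string tempPath = archivePath + ".tmp";

        using (ZipArchive original =
            new ZipArchive(File.OpenRead(archivePath), ZipArchiveMode.Read))
        using (ZipArchive rebuilt =
            new ZipArchive(File.Open(tempPath, FileMode.Create), ZipArchiveMode.Create))
        {
            // Stream each existing entry across; only one entry is open at a time.
            foreach (ZipArchiveEntry entry in original.Entries)
            {
                ZipArchiveEntry copy = rebuilt.CreateEntry(entry.FullName);
                // Must be set before the entry stream is opened.
                copy.LastWriteTime = entry.LastWriteTime;

                using (Stream source = entry.Open())
                using (Stream destination = copy.Open())
                {
                    source.CopyTo(destination);
                }
            }

            // Add the new files from disk.
            foreach (string file in filesToAdd)
            {
                string entryName = entryFolder + "/" + Path.GetFileName(file);
                rebuilt.CreateEntryFromFile(file, entryName, CompressionLevel.Optimal);
            }
        }

        // Both archives are closed at this point; swap the files.
        File.Delete(archivePath);
        File.Move(tempPath, archivePath);
    }
}
```

A call might look like ZipAppend.AppendFiles(archiveFile.FullName, newLogPaths, DateTime.Now.ToString("yyyy-MM-dd")), where newLogPaths is a placeholder for the paths of the files to add. If the process is interrupted before the final swap, the original archive is left untouched and only the ".tmp" file needs cleaning up.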
