System.IO.Compression.ZipArchive memory management


Question

In .NET 4.5, the System.IO.Compression.ZipArchive class received some updates.

According to this article (http://msdn.microsoft.com/en-us/magazine/jj133817.aspx), "typical operations don't require reading the entire archive into memory".

To test this, I tried compressing 10 files of 200 MB each.

This works well when creating a new zip archive with the following code (memory usage stays low throughout the process):

for (int directoryGroupIndex = 0; directoryGroupIndex < directoryGroups.Count; directoryGroupIndex++)
{
  String directoryGroupKey = directoryGroups.Keys.ElementAt(directoryGroupIndex);
  FileInfo[] directoryGroup = directoryGroups[directoryGroupKey];

  String archiveFileName = String.Format("Readed Logfiles{0}", archiveFileExtension);
  String archiveFileFullName = Path.Combine(directoryGroupKey, archiveFileName);
  FileInfo archiveFile = new FileInfo(archiveFileFullName);


  using (FileStream archiveFileStream = new FileStream(archiveFile.FullName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Read))
  using (ZipArchive archive = new ZipArchive(archiveFileStream, ZipArchiveMode.Create, false))
  {
    for (int directoryGroupFileIndex = 0; directoryGroupFileIndex < directoryGroup.Length; directoryGroupFileIndex++)
    {
      FileInfo file = directoryGroup[directoryGroupFileIndex];
      String archiveEntryName = file.Name;
      String archiveEntryPath = DateTime.Now.ToString("yyyy-MM-dd");
      String archiveEntryFullName = Path.Combine(archiveEntryPath, archiveEntryName);

      ZipArchiveEntry archiveEntry = archive.CreateEntryFromFile(file.FullName, archiveEntryFullName, CompressionLevel.Optimal);
    }
  }
}

Now I want to add new entries to this archive. I left my code unchanged and ran it again (with new files in the root directory). The documentation says "Only creating new archive entries is permitted", which is all I want, so my code should be fine.

The result is:

  1. The file table inside the archive is overwritten (only the new files are listed).

  2. The archive file size has grown (as if the old files were still in there).

  3. The archive is corrupted. You can open it, but you can't decompress the content.

If I change the mode to ZipArchiveMode.Update, it works as expected, but only with small files. With files as large as mine, an out-of-memory exception is thrown because the complete archive is loaded into memory.

My question now is: am I doing it wrong, is this a bug, or is it a design flaw?

Recommended answer

The code you've written causes the ZipArchive class to write a whole new archive at the end of your previous one, which of course corrupts the file.
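One detail that may help explain the symptoms in the question: FileMode.OpenOrCreate opens an existing file without truncating it, so bytes from the previous archive stay in the file while the new archive is being written. The sketch below is not part of the original answer. It uses a hypothetical scratch file and plain bytes rather than a zip archive, simply to show how FileMode.OpenOrCreate and FileMode.Create differ on a file that already exists.

```csharp
using System;
using System.IO;

public static class FileModeDemo
{
    // Writes `first` to a fresh scratch file, reopens it with `mode`,
    // writes `second` from position 0, and returns the final file length.
    public static long OverwriteAndMeasure(FileMode mode, byte[] first, byte[] second)
    {
        // Hypothetical scratch file in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".bin");
        try
        {
            File.WriteAllBytes(path, first);

            using (FileStream stream = new FileStream(path, mode, FileAccess.Write))
            {
                stream.Write(second, 0, second.Length);
            }

            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        byte[] oldContent = new byte[10];
        byte[] newContent = new byte[3];

        // OpenOrCreate keeps the existing bytes, so the file is still 10 bytes long.
        Console.WriteLine(OverwriteAndMeasure(FileMode.OpenOrCreate, oldContent, newContent));

        // Create truncates the file first, so it ends up 3 bytes long.
        Console.WriteLine(OverwriteAndMeasure(FileMode.Create, oldContent, newContent));
    }
}
```

Opening the stream with FileMode.Create would give a clean archive, but it would also discard the old entries, so it does not help when the goal is to add to an existing archive.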

The way to do what you want is to copy the original archive's entries into a new file as you create it, and then replace the original with the new file. For example:

// Path of the existing archive (archiveFile is the FileInfo from the question's code).
string archivePath = archiveFile.FullName;
string tempFile = Path.GetTempFileName();

using (ZipArchive original =
    new ZipArchive(File.Open(archivePath, FileMode.Open), ZipArchiveMode.Read))
using (ZipArchive newArchive =
    new ZipArchive(File.Open(tempFile, FileMode.Create), ZipArchiveMode.Create))
{
    // Copy every existing entry into the new archive, streaming one entry at a time.
    foreach (ZipArchiveEntry entry in original.Entries)
    {
        ZipArchiveEntry newEntry = newArchive.CreateEntry(entry.FullName);

        using (Stream source = entry.Open())
        using (Stream destination = newEntry.Open())
        {
            source.CopyTo(destination);
        }
    }

    // Add the new files, exactly as in the question's loop.
    for (int directoryGroupFileIndex = 0;
            directoryGroupFileIndex < directoryGroup.Length;
            directoryGroupFileIndex++)
    {
        FileInfo file = directoryGroup[directoryGroupFileIndex];
        String archiveEntryName = file.Name;
        String archiveEntryPath = DateTime.Now.ToString("yyyy-MM-dd");
        String archiveEntryFullName = Path.Combine(archiveEntryPath, archiveEntryName);

        ZipArchiveEntry archiveEntry = newArchive.CreateEntryFromFile(
            file.FullName, archiveEntryFullName, CompressionLevel.Optimal);
    }
}

// Replace the original archive with the rebuilt one.
File.Delete(archivePath);
File.Move(tempFile, archivePath);
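For readers who want to try the approach in isolation, here is a self-contained sketch of the same copy-and-replace idea wrapped in a helper. The class name ZipAppender, the method AppendFiles, and the entryFolder parameter are hypothetical names introduced for illustration, and placing the temporary file next to the archive is an assumption of this sketch rather than something the answer prescribes. On .NET Framework it needs references to both System.IO.Compression and System.IO.Compression.FileSystem.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZipAppender
{
    // Rebuilds the archive at archivePath so that it contains its existing
    // entries plus newFiles. entryFolder (hypothetical) is the folder name
    // the new files get inside the archive.
    public static void AppendFiles(string archivePath, IEnumerable<string> newFiles, string entryFolder)
    {
        // Assumption: keep the temp file next to the archive so the final
        // move stays on the same volume.
        string tempFile = archivePath + "." + Guid.NewGuid().ToString("N") + ".tmp";

        using (ZipArchive original = ZipFile.OpenRead(archivePath))
        using (ZipArchive rebuilt = ZipFile.Open(tempFile, ZipArchiveMode.Create))
        {
            // Stream each existing entry across; only one entry is in flight at a time.
            foreach (ZipArchiveEntry entry in original.Entries)
            {
                ZipArchiveEntry copy = rebuilt.CreateEntry(entry.FullName, CompressionLevel.Optimal);
                using (Stream source = entry.Open())
                using (Stream destination = copy.Open())
                {
                    source.CopyTo(destination);
                }
            }

            foreach (string file in newFiles)
            {
                // The zip format expects forward slashes in entry names.
                string entryName = entryFolder + "/" + Path.GetFileName(file);
                rebuilt.CreateEntryFromFile(file, entryName, CompressionLevel.Optimal);
            }
        }

        // Both archives are closed here; swap the rebuilt file in.
        File.Delete(archivePath);
        File.Move(tempFile, archivePath);
    }

    public static void Main()
    {
        // Throwaway demo data in a hypothetical scratch directory.
        string dir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        string first = Path.Combine(dir, "first.log");
        string second = Path.Combine(dir, "second.log");
        File.WriteAllText(first, "first run");
        File.WriteAllText(second, "second run");

        string archivePath = Path.Combine(dir, "logs.zip");
        using (ZipArchive archive = ZipFile.Open(archivePath, ZipArchiveMode.Create))
        {
            archive.CreateEntryFromFile(first, "2013-01-01/first.log");
        }

        AppendFiles(archivePath, new[] { second }, DateTime.Now.ToString("yyyy-MM-dd"));

        using (ZipArchive archive = ZipFile.OpenRead(archivePath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                Console.WriteLine(entry.FullName);
            }
        }

        Directory.Delete(dir, true);
    }
}
```

The sketch does no cleanup if something fails partway through, so a failed run can leave the .tmp file behind; production code would probably want to handle that.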

Note that this isn't actually going to be slower than ZipArchiveMode.Update. In update mode, the ZipArchive class reads the entire archive into memory (as you noted), and when you close it, it recompresses everything and writes it back out.

The above does basically the same computations, but uses the disk as intermediate storage instead of memory.
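For comparison, when an archive is small enough to fit comfortably in memory, update mode remains the shorter route, as the question already observed. The following is a minimal sketch under that assumption; SmallArchiveUpdate and AddFile are hypothetical names, and ZipFile.Open is used here for brevity instead of constructing the FileStream by hand.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class SmallArchiveUpdate
{
    // Adds one file to the archive at archivePath using update mode.
    // Only suitable for archives that fit in memory.
    public static void AddFile(string archivePath, string file, string entryName)
    {
        // In update mode, ZipFile.Open opens the archive if it exists
        // and creates it otherwise.
        using (ZipArchive archive = ZipFile.Open(archivePath, ZipArchiveMode.Update))
        {
            archive.CreateEntryFromFile(file, entryName, CompressionLevel.Optimal);
        }
    }

    public static void Main()
    {
        // Throwaway demo data in a hypothetical scratch directory.
        string dir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        string first = Path.Combine(dir, "first.log");
        string second = Path.Combine(dir, "second.log");
        File.WriteAllText(first, "first run");
        File.WriteAllText(second, "second run");

        string archivePath = Path.Combine(dir, "logs.zip");
        AddFile(archivePath, first, "logs/first.log");
        AddFile(archivePath, second, "logs/second.log");

        using (ZipArchive archive = ZipFile.OpenRead(archivePath))
        {
            Console.WriteLine(archive.Entries.Count);
        }

        Directory.Delete(dir, true);
    }
}
```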
