Efficient compression of a file system directory tree with many identical files


Question

We have multiple .NET web applications all sharing quite a few common libraries. None of them are in the GAC.

The deployment constraint is that all of these web applications have dedicated directories, which results in a large number of duplicated DLLs across the total directory structure.

This directory structure is extracted from a single zip archive.

As a result, the zip archive contains many identical files in different directories.

This is huge redundancy, which I want to eliminate in the zip archive; I do not care much if redundant files are created on disk. I see two ways to optimize the zip:

  1. Use Windows symbolic links and junctions to reduce the number of physically identical files.
  2. Use smart compression that does not compress the same file data twice.
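Option 2 can also be approximated by hand if no archiver cooperates. Below is a minimal sketch (the function name `zip_dedup` and the `MANIFEST.txt` entry are illustrative, not from any existing tool): each unique file's bytes are written into the zip exactly once, keyed by content hash, and every path is recorded in a manifest so a small restore script can copy the extracted blobs to all of their locations.

```python
import hashlib
import os
import zipfile

def zip_dedup(root, archive_path):
    """Write each unique file's content once; list all paths in a manifest."""
    seen = {}          # content hash -> archive name of the stored copy
    manifest = []      # lines of "<relative path>\t<archive name of its bytes>"
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, root)
                with open(full, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                if digest not in seen:
                    seen[digest] = rel
                    zf.write(full, rel)     # first occurrence: store the bytes
                manifest.append(f"{rel}\t{seen[digest]}")
        zf.writestr("MANIFEST.txt", "\n".join(manifest))
```

With three copies of the same DLL in three application directories, the archive ends up with one data member plus the manifest, instead of three compressed copies.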

Method 1

I used zip and 7z to test compressing directory structures. I used junctions and file symbolic links as the means to reduce space on disk.

Unfortunately, both zip and 7z compress junctions as if they were full-blown directories. 7z compresses a symbolic link as a zero-length file, and its nature as a symbolic link is lost upon decompression. zip traverses the symbolic link and compresses the target data instead, which results in duplicate file content in the archive.

In short, I failed to eliminate the duplicate file data using the first method.
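The zip half of this failure is easy to reproduce: Python's `zipfile` module follows symlinks when writing, just like the zip CLI. A quick Unix sketch (file names are illustrative; on Windows, creating the symlink additionally requires the right privilege):

```python
import os
import tempfile
import zipfile

tmp = tempfile.mkdtemp()
target = os.path.join(tmp, "common.dll")
link = os.path.join(tmp, "link.dll")
with open(target, "wb") as f:
    f.write(b"library bytes")
os.symlink(target, link)     # link.dll points at common.dll on disk

archive = os.path.join(tmp, "apps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.write(target, "common.dll")
    zf.write(link, "link.dll")   # the link is followed: the target bytes are stored again

with zipfile.ZipFile(archive) as zf:
    sizes = [zf.getinfo(n).file_size for n in ("common.dll", "link.dll")]
    print(sizes)   # both entries carry the full 13 bytes of content
```

The link-ness is gone from the archive; both members hold a complete copy of the data.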

Method 2

What I want is exactly described by http://sourceforge.net/p/sevenzip/feature-requests/794/. However, it is nothing more than a feature request.

A comment on the feature request mentions lrzip as an efficient huge-file compressor. I still have to check it, but it does not seem to eliminate duplicate file data the way I would like.

Any help is appreciated.

Answer

mark, how did you try lrzip? It can't detect duplicates inside a compressed archive (a default zip); it should be used on a non-compressing archive (in the Unix world, a tar), or on a zip file created without compression (you will get an archive whose size is almost equal to the sum of the input sizes).
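The point about uncompressed input can be demonstrated with the standard library alone. In this sketch, zlib's 32 KB match window stands in for lrzip's far larger match distance: packing the members with `ZIP_STORED` and then compressing the whole archive as one stream lets the compressor find the duplicate, while per-member deflate cannot.

```python
import io
import os
import zipfile
import zlib

payload = os.urandom(10_000)   # incompressible stand-in for a DLL
members = {"app1/common.dll": payload, "app2/common.dll": payload}

def build(compression):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()

deflated = build(zipfile.ZIP_DEFLATED)   # each member compressed on its own
stored = build(zipfile.ZIP_STORED)       # duplicates remain visible as plain bytes
outer = zlib.compress(stored, 9)         # one stream over the whole archive sees the repeat

print(len(deflated), len(outer))         # the outer pass is roughly half the size
```

The same principle is why lrzip wants a tar or a stored zip as input: its long-range matcher can deduplicate identical files even gigabytes apart, but only if they reach it uncompressed.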

You can also try any multi-file compressor capable of solid mode (rar, 7z), but this may not work if your archive is huge and there is a big distance between duplicates. lrzip supports much greater distances.

Tar (and PAX) on Unix supports hard and soft links: http://www.gnu.org/software/tar/manual/html_section/tar_71.html#SEC140
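The hard-link handling is easy to verify (a quick sketch; file names are illustrative): GNU tar stores the second hard link as a link entry rather than a second copy of the data.

```shell
mkdir -p demo
printf 'shared dll bytes' > demo/a.dll
ln demo/a.dll demo/b.dll              # hard link: one inode, two names
tar -cf demo.tar -C demo a.dll b.dll
tar -tvf demo.tar                     # the b.dll entry reads "b.dll link to a.dll"
```

So on Unix, the method-1 approach from the question actually works end to end: link the duplicates, tar the tree, and the archive carries each file's data only once.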

