Does git lfs reduce the size of files pushed to GitHub?

Question

GitHub does not allow pushing files larger than 100 MB. With git lfs, it is possible to push large files to GitHub. I am curious about the idea behind the process: to me it seems that git lfs is just an additional switch that enables pushing large files (via HTTPS only) to GitHub. But I can't imagine that's all there is to it?

Atlassian states:

Git LFS (Large File Storage) is a Git extension developed by Atlassian, GitHub, and a few other open source contributors, that reduces the impact of large files in your repository by downloading the relevant versions of them lazily. Specifically, large files are downloaded during the checkout process rather than during cloning or fetching. Git LFS does this by replacing large files in your repository with tiny pointer files. During normal usage, you'll never see these pointer files as they are handled automatically by Git LFS.
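To make the "tiny pointer file" idea concrete, here is a minimal Python sketch of what such a pointer contains: a spec version line, a SHA-256 object ID, and the size in bytes. The helper names `make_pointer` and `parse_pointer` are hypothetical and exist only for this illustration; the real Git LFS client creates and reads pointers for you.

```python
import hashlib

# Version line used by Git LFS pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Build pointer text (version, oid, size) for the given file content."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def parse_pointer(text: str) -> dict:
    """Split the pointer's 'key value' lines into a dictionary."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    return {
        "version": fields["version"],
        "oid": fields["oid"].split(":", 1)[1],  # drop the "sha256:" prefix
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    big_file = b"x" * 5_000_000  # stand-in for a 5 MB PDF
    pointer = make_pointer(big_file)
    print(pointer)
    print(f"pointer is {len(pointer)} bytes for a {len(big_file)} byte file")
```

Only the pointer text is committed to the Git repository; the file content itself is stored and transferred separately.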


Some details: I have a small project which I cannot push to GitHub because of, say, one large file. I can then migrate and push as follows:

git lfs migrate import --everything --include="*.pdf"
git reflog expire --expire-unreachable=now --all
git gc --prune=now
git push origin master
git lfs checkout (? only needed if your local files are just 1 kB each? This happened to me some days later...)

and everything is pushed to GitHub, even the large files. So why does GitHub reject large files if they are allowed when using git lfs (which can be installed quickly and works easily)?

Recommended answer

The problem isn't with large files per se, but the way that Git stores them. Git stores files and sends files over the network using deltification and compression. Deltification stores a file with less data by making reference to another file and storing only the differences.
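As a rough illustration of deltification, the sketch below stores a new version of a file as "copy these ranges from the base" plus the literal lines that changed. It uses Python's `difflib` on lists of lines; this is a toy stand-in, not Git's actual binary delta format, and the function names are made up for the example.

```python
import difflib


def make_delta(base: list, target: list) -> list:
    """Describe target as copy-from-base ranges plus literal inserted lines."""
    ops = []
    matcher = difflib.SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))  # reuse base[i1:i2] unchanged
        elif j2 > j1:  # "replace" or "insert": new lines must be stored
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no op: those base lines are simply never copied
    return ops


def apply_delta(base: list, ops: list) -> list:
    """Rebuild the target from the base and the delta operations."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def literal_lines(ops: list) -> int:
    """Count the lines the delta has to store itself."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


if __name__ == "__main__":
    base = [f"line {i}\n" for i in range(1000)]
    target = list(base)
    target[500] = "changed\n"
    ops = make_delta(base, target)
    print(f"{literal_lines(ops)} of {len(target)} lines stored literally")
```

This works well for text with small edits. For large binary files, two versions often share few byte ranges, so the search for a good base costs a lot and saves little.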

When the server side repacks the stored data, Git will also verify that the data is still intact by running git fsck. This means that every file must be decompressed, de-deltified, and processed into memory at least partially. For large files, this uses a huge amount of CPU and memory, which impacts other repositories stored on the server. Files may also be re-deltified, which means that the file and its candidate bases must be read entirely into memory, compared against each other at some cost, and then rewritten and re-compressed. The alternative is to store those files without deltification and only compress them, but this leads to out-of-control disk usage, especially for files which don't compress well.

On the client side, a user must download the entire repository on a clone. This uses a large amount of bandwidth to clone large files, which are often incompressible, and means that a user must store all of this content locally, even if they're only interested in a few revisions.
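The point about incompressible files can be checked with a small experiment. The sketch below uses `zlib` (the same family of compression Git relies on) and compares repetitive text with random bytes, which here stand in for already-compressed formats such as media or archives. The exact ratios will vary between runs and inputs.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    text = b"the same log line repeated\n" * 10_000
    random_like = os.urandom(1_000_000)  # stand-in for already-compressed data
    print(f"repetitive text: {compression_ratio(text):.4f}")
    print(f"random bytes:    {compression_ratio(random_like):.4f}")
```

Repetitive text shrinks to a small fraction of its size, while the random data stays at roughly its original size, so every stored revision of such a file costs close to its full size.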

Git LFS moves the storage of large files out of the Git repository by using a separate HTTP-based protocol and allowing the objects to be uploaded to a separate location that isn't part of the main Git repository. This avoids the costs Git imposes for compression and deltification, and users download only the files they need for their current checkout. As a result, server load and bandwidth are both greatly reduced, as are client storage needs.
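To illustrate the difference in transferred data, here is a toy model in Python. `ToyLfsStore` is a hypothetical in-memory stand-in for the separate LFS storage, and the byte counts deliberately ignore Git's compression and deltas as well as the small pointer files, so treat the numbers as an upper-bound comparison rather than a measurement.

```python
import hashlib


class ToyLfsStore:
    """In-memory stand-in for the separate LFS storage: maps oid -> content."""

    def __init__(self):
        self.objects = {}

    def upload(self, content: bytes) -> str:
        oid = hashlib.sha256(content).hexdigest()
        self.objects[oid] = content
        return oid

    def download(self, oid: str) -> bytes:
        return self.objects[oid]


def bytes_for_plain_clone(versions: list) -> int:
    """A plain clone transfers every historical version of the file."""
    return sum(len(v) for v in versions)


def bytes_for_lfs_checkout(store: ToyLfsStore, oids: list, wanted: int) -> int:
    """With LFS, only the version needed for the checkout is downloaded."""
    return len(store.download(oids[wanted]))


if __name__ == "__main__":
    # Three revisions of a 1 MB file that share no content.
    versions = [bytes([i]) * 1_000_000 for i in range(3)]
    store = ToyLfsStore()
    oids = [store.upload(v) for v in versions]
    print("plain clone:", bytes_for_plain_clone(versions), "bytes")
    print("LFS checkout of latest:", bytes_for_lfs_checkout(store, oids, -1), "bytes")
```

So the files themselves are not made smaller by LFS. What shrinks is the Git repository and the amount of history each client has to download.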
