Is it possible to store only a checksum of a large file in git?

Problem description

I'm a bioinformatician currently extracting normal-sized sequences from genomic files. Some genomic files are large enough that I don't want to put them into the main git repository, whereas I'm putting the extracted sequences into git.

Is it possible to tell git, "Here's a large file: don't store the whole file, just take its checksum, and let me know if that file is missing or modified"?

If that's not possible, I guess I'll have to either git-ignore the large files, or, as suggested in this question, store them in a submodule.

Solution

I wrote a script that does this sort of thing. You put file patterns in the .gitattributes file for large media that you don't want going in your git repo and it can store them on S3 instead. It's just a starting point, but I think it's usable if you're interested.

http://github.com/schacon/git-media
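
As a rough illustration of the general idea (this is not git-media's actual code), a git clean filter can swap a large file's content for its checksum before the file is staged, while the real bytes go to a store outside the repository. The sketch below assumes a plain local directory as that store; the function names `clean` and `smudge` and the store location are made up for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def clean(data: bytes, store: Path) -> bytes:
    """Save the real content in `store` and return only its SHA-256 hex digest."""
    digest = hashlib.sha256(data).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    (store / digest).write_bytes(data)
    return (digest + "\n").encode("ascii")


def smudge(pointer: bytes, store: Path) -> bytes:
    """Turn a stored digest back into the real content, if the store has it."""
    digest = pointer.decode("ascii").strip()
    blob = store / digest
    # If the content is not available locally, leave the pointer in place.
    return blob.read_bytes() if blob.exists() else pointer


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "media-store"  # hypothetical store location
        genome = b"ACGT" * 1000
        pointer = clean(genome, store)
        print(len(pointer))                       # 64 hex characters plus a newline
        print(smudge(pointer, store) == genome)   # round trip restores the content
```

In a real setup the two functions would read stdin and write stdout, and git would be pointed at them through the `filter.<name>.clean` and `filter.<name>.smudge` config keys plus a `filter=<name>` entry in `.gitattributes` for the matching file pattern.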

Maybe that will help you, or at least show you how it could be done and you can customize it for your specific needs.
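
If all you need is to be told when a large file is missing or modified, a simpler option is to commit a small checksum manifest and ignore the large files themselves. The following is a minimal sketch under that assumption; the helper names (`sha256_of`, `write_manifest`, `check_manifest`) and the manifest layout are hypothetical choices for this example, not a git feature.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large genome file never sits fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(files, manifest: Path) -> None:
    """Record one '<digest>  <path>' line per large file."""
    lines = [f"{sha256_of(p)}  {p}" for p in files]
    manifest.write_text("\n".join(lines) + "\n")


def check_manifest(manifest: Path) -> dict:
    """Return {path: 'ok' | 'missing' | 'modified'} for each recorded file."""
    status = {}
    for line in manifest.read_text().splitlines():
        digest, name = line.split("  ", 1)
        p = Path(name)
        if not p.exists():
            status[name] = "missing"
        elif sha256_of(p) != digest:
            status[name] = "modified"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        genome = Path(tmp) / "genome.fa"        # stand-in for a large file
        manifest = Path(tmp) / "large-files.sha256"
        genome.write_bytes(b">chr1\nACGTACGT\n")
        write_manifest([genome], manifest)
        print(check_manifest(manifest))         # file reported as 'ok'
        genome.write_bytes(b">chr1\nACGTACGA\n")
        print(check_manifest(manifest))         # file reported as 'modified'
        genome.unlink()
        print(check_manifest(manifest))         # file reported as 'missing'
```

The manifest is tiny, so it can live in the main repository next to the extracted sequences, while the large files are listed in `.gitignore`.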
