Git with large files


Problem description

I have two servers, Production and Development. On the Production server there are two applications and six MySQL databases that I need to distribute to developers for testing. All source code is stored in GitLab on the Development server; the developers work only with this server and have no access to the Production server. When we release an application, the master logs into Production and pulls the new version from Git. The databases are large (over 500 MB each and growing), and I need to distribute them to the developers for testing as easily as possible.

  • After a backup script dumps each database to a single file, run a script that pushes each database to its own branch. A developer who wants to update his local copy pulls one of these branches.

This approach turned out not to work.

A cron job on the production server saves the binary logs every day and pushes them into that database's branch. So, in the branch, there are files with the daily changes, and a developer pulls only the files he does not yet have. The current SQL dump is sent to the developer another way. When the repository grows too large, we will send a full dump to the developers, flush all the data in the repository, and start from the beginning.
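For illustration only, a daily cron job along these lines could implement that workflow; the repository path, binary-log location, and branch name below are hypothetical placeholders, not part of the original setup:

#!/bin/sh
# Hypothetical daily job: rotate the MySQL binary logs and push the finished
# ones into the per-database branch. All paths and the branch name are placeholders.
REPO=/srv/db-repo            # local clone checked out on the database branch
BINLOG_DIR=/var/lib/mysql    # where MySQL writes mysql-bin.NNNNNN files

mysqladmin flush-logs                         # close the current binary log (assumes credentials in ~/.my.cnf)
cp "$BINLOG_DIR"/mysql-bin.[0-9]* "$REPO"/    # copy the finished logs into the repo

cd "$REPO" || exit 1
git add .
git commit -m "binlogs $(date +%F)" || exit 0  # nothing new to commit
git push origin db1-branch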

  • Is the solution possible?
  • When git pushes to or pulls from a repository, does it upload/download whole files, or just the changes in them (i.e. new or edited lines)?
  • Can Git manage such large files? No.
  • How do I set how many revisions are preserved in a repository? Doesn't matter with the new solution.
  • Is there any better solution? I don't want to force the developers to download such large files over FTP or anything similar.

Answer

rsync could be a good option for efficiently updating the developers' copies of the databases.

It uses a delta algorithm to incrementally update the files, so it only transfers the blocks of a file that have changed or are new. The developers will of course still need to download each full file once, but later updates will be quicker.
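If you want to see for yourself that only the changed blocks are sent, rsync can report this; a quick check, reusing the answer's placeholder host and paths, might look like:

# "Literal data" in the summary is what actually crossed the wire;
# "Matched data" was reconstructed from the existing local copy.
rsync -avz --stats DATABASE_HOST:/path/to/database(s) path/where/developer/wants/it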

Essentially you get an incremental update similar to a git fetch, without the ever-expanding initial copy that a git clone would give. The trade-off is that you lose the history, but it sounds like you don't need that.

rsync is a standard part of most Linux distributions; if you need it on Windows, there is a packaged port available: http://itefix.no/cwrsync/

To push the databases to a developer, you could use a command similar to:

rsync -avz path/to/database(s) HOST:/folder
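Here -a (archive mode) recurses and preserves permissions, ownership and timestamps, -v lists the transferred files, and -z compresses the data during transfer; adding -P also shows progress and lets interrupted transfers resume.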

Or the developers could pull the database(s) they need with:

rsync -avz DATABASE_HOST:/path/to/database(s) path/where/developer/wants/it
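If the developers would rather have this happen automatically, a crontab entry along these lines would pull the latest dumps every night; the schedule and paths are illustrative only:

# Nightly pull at 02:00 (placeholder host and paths)
0 2 * * * rsync -avz DATABASE_HOST:/path/to/database(s) /home/dev/databases/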
