Track files locally, but never allow them to be pushed to the remote repository


Problem description


I'm working on a project that involves the use of very sensitive data, and I've been instructed to only transmit this data online via a custom file transfer system. The project itself is under git source control and includes a sqlite file containing the sensitive data.

Up to this point, I've simply been ignoring the sqlite file via the .gitignore file, which prevents it from ever being pushed to the remote repository. However, I've now reached a point in the project where we have a live version as well as a development version, and the fact that the data is not tracked locally makes using branches very difficult.

So my question is: is there a way for me to keep track of the sqlite file locally, so I can have different data versions on different branches, but never have it pushed to the remote repository?

After reading this question, I considered having local-only development branches that use different .gitignore files, but a git merge into the remotely shared branches would also merge the changes to the .gitignore file, which would quickly become cumbersome.

Solution

Ok, so I actually came up with a better solution to this problem. My previous solution, which involved a second git repository, quickly became problematic due to the size of the sqlite files I was working with; git does not handle large binary files well. I investigated various ways to improve git's ability to handle the files (e.g. git-bigfiles, git-annex) but nothing seemed to handle my situation elegantly.

The answer: symlinks.

N.B. This solution is pretty Unix specific, but you will probably be able to rework it for non-Unix systems.

Problem #1: Ensure that the data is never sent to the remote repository.

This one was easy. Similar to my previous solution, I store the data outside of the repository.

Root-Directory/
    My-Project/
        .git/
        Source-Code-and-Stuff/
    My-Project-Data/
        A-Big-Sqlite-File.sqlite

Because the data files aren't in the repository, there's no need to worry about them being indexed by git.
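
As a quick sanity check of this layout, the sketch below rebuilds the directory structure in a throwaway temporary directory and asks git what it can see. The placeholder file contents and the use of mktemp are assumptions made for the demonstration; only the directory names come from the layout above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the layout in a throwaway directory (contents are placeholders).
root="$(mktemp -d)"
mkdir -p "$root/My-Project/Source-Code-and-Stuff" "$root/My-Project-Data"
echo "placeholder for sensitive data" > "$root/My-Project-Data/A-Big-Sqlite-File.sqlite"

cd "$root/My-Project"
git init -q

# The data file lives outside the work tree, so git never lists it,
# not even as an untracked file.
untracked="$(git status --porcelain --untracked-files=all)"
echo "git status output: '${untracked}'"
```

An empty status here means there is nothing git could stage, commit, or push from the data directory.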

Problem #2: Different branches should reference different versions of the data.

This is where symlinks come into play. A symlink is effectively a shortcut to a file, so the idea is to put a symlink to the data file inside the repository. Symlinks are indexed by git (and they're very small), so different branches can have different symlinks.

To explain this, let's take an example project, which has a currently live version (1.1) on the master branch; and a new version (1.2) on the version-1.2 branch. For simplicity's sake, this project only has one data file: Data.sqlite.

The data file is stored inside the My-Project-Data directory mentioned above, and versioned on the filesystem like so:

My-Project-Data/
    v1.1/
        Data.sqlite
    v1.2/
        Data.sqlite

The data file is added to the repository by using a symlink:

My-Project/
    .git/
    Source-Code-and-Stuff/
        Data-Symlink.sqlite

On the master branch, Data-Symlink.sqlite points to

../../My-Project-Data/v1.1/Data.sqlite

and on the version-1.2 branch it points to

../../My-Project-Data/v1.2/Data.sqlite
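
To see what git actually records for such a link, the following sketch stages a symlink in a temporary repository and inspects the index entry. It assumes a filesystem that supports symlinks; the temporary directory and placeholder data are illustrative only.

```shell
#!/usr/bin/env bash
set -euo pipefail

root="$(mktemp -d)"
mkdir -p "$root/My-Project/Source-Code-and-Stuff" "$root/My-Project-Data/v1.1"
echo "v1.1 data" > "$root/My-Project-Data/v1.1/Data.sqlite"

cd "$root/My-Project"
git init -q
ln -s ../../My-Project-Data/v1.1/Data.sqlite Source-Code-and-Stuff/Data-Symlink.sqlite
git add Source-Code-and-Stuff/Data-Symlink.sqlite

# Mode 120000 marks a symlink entry in the index.
mode="$(git ls-files -s Source-Code-and-Stuff/Data-Symlink.sqlite | cut -d' ' -f1)"
# The blob git stores is the link's target path, not the data file's contents.
stored="$(git cat-file -p :Source-Code-and-Stuff/Data-Symlink.sqlite)"
echo "mode=$mode stored=$stored"
```

Because only the short target path is stored, pushing the branch sends the path string and never the sqlite data itself.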

So when development on version 1.3 begins, the following bash script will set everything up:

# Get to the root directory
cd path/to/Root-Directory
# Enter the data directory
cd My-Project-Data
# Make a directory for the new version and enter it
mkdir v1.3
cd v1.3
# Copy the new sqlite file into it
cp ~/path/to/data/file.sqlite Data.sqlite
# Move to the project directory
cd ../../My-Project
# Create a new branch
git checkout -b version-1.3
# Move to the source code directory and delete the current symlink
cd Source-Code-and-Stuff
rm Data-Symlink.sqlite
# Create a symlink to the new data file
ln -s ../../My-Project-Data/v1.3/Data.sqlite Data-Symlink.sqlite
# Commit the change
cd ../
git add Source-Code-and-Stuff/Data-Symlink.sqlite
git commit -m "Update the symlink"
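
The effect of this setup on branch switching can be tried end to end in a scratch repository. The sketch below is a demonstration under assumptions, not the project's real setup: the placeholder data, the example committer identity, and the use of `ln -sf` to replace the link in one step are all illustrative choices.

```shell
#!/usr/bin/env bash
set -euo pipefail

root="$(mktemp -d)"
mkdir -p "$root/My-Project/Source-Code-and-Stuff"
for v in v1.1 v1.2; do
    mkdir -p "$root/My-Project-Data/$v"
    echo "data for $v" > "$root/My-Project-Data/$v/Data.sqlite"
done

cd "$root/My-Project"
git init -q
git symbolic-ref HEAD refs/heads/master        # make the first branch "master"
git config user.name "Example"                 # placeholder identity for the demo
git config user.email "example@example.invalid"

link=Source-Code-and-Stuff/Data-Symlink.sqlite

ln -s ../../My-Project-Data/v1.1/Data.sqlite "$link"
git add "$link"
git commit -q -m "Point the symlink at v1.1"

git checkout -q -b version-1.2
ln -sf ../../My-Project-Data/v1.2/Data.sqlite "$link"   # replace the link in place
git add "$link"
git commit -q -m "Point the symlink at v1.2"

# Reading through the link follows whichever target the current branch recorded.
on_dev="$(cat "$link")"
git checkout -q master
on_master="$(cat "$link")"
echo "version-1.2: $on_dev"
echo "master:      $on_master"
```

Checking out a branch rewrites the symlink, so the application opens the matching data version without any extra step.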

Conclusion

Obviously this isn't a perfect solution. If you're working with a team, everyone on the team will need the same relative directory layout: the symlinks store relative paths, so the absolute path to Root-Directory can differ from machine to machine, but My-Project and My-Project-Data must both exist directly inside it. In my opinion the benefits outweigh this minor caveat. In the actual project where I use this technique I have an 800MB sqlite data file, and being able to switch between live and development branches and have the project automatically pick up the right data file is priceless.
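
One way to soften the layout caveat is a small check that reports when the link does not resolve on a given machine. The function name check_data_link is hypothetical, and this is only a sketch of the idea; it could, for instance, be called from a git post-checkout hook.

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_data_link LINK: succeed only if LINK is a symlink whose target exists.
# (Hypothetical helper, not part of git.)
check_data_link() {
    local link="$1"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    # -e follows the link, so it fails when the target is missing.
    if [ ! -e "$link" ]; then
        echo "dangling symlink: $link -> $(readlink "$link")" >&2
        return 1
    fi
    echo "ok: $link -> $(readlink "$link")"
}
```

Run from the My-Project directory, `check_data_link Source-Code-and-Stuff/Data-Symlink.sqlite` would fail with a message if a teammate's My-Project-Data directory is missing or lacks the version the current branch expects.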
