Convert git repository file encoding
Question
I have a large CVS repository containing files encoded in ISO-8859-1 and want to convert it to git.
Sure, I can configure git to use ISO-8859-1 as the encoding, but I would like to have the files in UTF-8.
With tools such as iconv or recode I can convert the encoding of the files in my working tree. I could then commit the result with a message like "converted encoding".
My question is whether it is possible to convert the complete history, either while converting from CVS to git or afterwards. My idea would be to write a script that reads each commit in the git repository, converts it to UTF-8, and commits it to a new git repository.
Is this possible? I am unsure about the hash codes and about how to walk through the commits, branches and tags. Or is there a tool that can handle something like this?
Recommended answer
You can do this with git filter-branch. The idea is to change the encoding of the files in every commit, rewriting each commit as you go. Because the file contents change, every rewritten commit gets a new hash.
First, write a script that changes the encoding of every file in the repository. It could look like this:
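#!/bin/sh
# Re-encode every regular file below the current directory
# from ISO-8859-1 to UTF-8.
find . -type f -print | while IFS= read -r f; do
    # Move the original aside, convert it back into place, then clean up.
    # (No -i on mv: a prompt would read its answer from find's output.)
    mv "$f" "$f.recode.$$"
    iconv -f iso-8859-1 -t utf-8 < "$f.recode.$$" > "$f"
    rm -f "$f.recode.$$"
done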
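The script above re-encodes every file unconditionally, so a file that is already UTF-8 would be encoded a second time, and the move-and-recreate step resets the file mode. The following is a more defensive sketch, not part of the original answer. The function names recode_file and recode_tree are hypothetical, and it assumes that iconv and mktemp are available. It still treats binary files as text, so it only suits repositories that contain text files.

```shell
#!/bin/sh
# Hypothetical helper: re-encode one file from ISO-8859-1 to UTF-8 in place.
# $1 is the file, $2 is a scratch file located outside the tree.
recode_file() {
    src=$1
    scratch=$2
    # Skip files that are already valid UTF-8 (plain ASCII included),
    # so that running the script twice does not double-encode anything.
    if iconv -f utf-8 -t utf-8 < "$src" > /dev/null 2>&1; then
        return 0
    fi
    # Convert into the scratch file first; overwrite only on success.
    if iconv -f iso-8859-1 -t utf-8 < "$src" > "$scratch"; then
        # Writing through cat keeps the existing file and its mode.
        cat "$scratch" > "$src"
    fi
}

# Hypothetical helper: apply recode_file to every regular file below $1.
# File names that contain newlines are not handled.
recode_tree() {
    dir=$1
    scratch=$(mktemp) || return 1
    find "$dir" -type f -print | while IFS= read -r f; do
        recode_file "$f" "$scratch"
    done
    rm -f "$scratch"
}
```

To use this as the tree filter, end the script with a call to recode_tree . so that it processes the tree that git filter-branch has checked out.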
Then use git filter-branch to run this script repeatedly, once per commit:
git filter-branch --tree-filter /tmp/recode-all-files HEAD
where /tmp/recode-all-files is the script above.
Right after the repository has been converted from CVS, you probably have just one branch in git with a linear history back to the beginning. If you have several branches, you will need to extend the git filter-branch command so that it rewrites the commits on all of them, not only those reachable from HEAD.
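For a repository with several branches and tags, one common way to extend the command is to pass -- --all as the revision range and to add --tag-name-filter cat so that tags are moved to the rewritten commits. The sketch below wraps this in a function. The name rewrite_all_refs is hypothetical, and the function assumes that it is run from the top level of the repository.

```shell
#!/bin/sh
# Hypothetical wrapper: rewrite every branch and tag, not just HEAD.
# $1 is the path to the re-encoding script, e.g. /tmp/recode-all-files.
rewrite_all_refs() {
    filter=$1
    # --tag-name-filter cat keeps each tag name and points the tag
    # at the rewritten commit; "-- --all" selects all refs.
    git filter-branch --tree-filter "$filter" --tag-name-filter cat -- --all
}
```

git filter-branch keeps the previous refs under refs/original/, so the old history remains available for comparison until those refs are deleted.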