Convert git repository file encoding
Problem description
I have a large CVS repository containing files in ISO-8859-1 and want to convert it to git.
Sure, I can configure git to use ISO-8859-1 for encoding, but I would like to have the files in UTF-8.
Now, with tools such as iconv or recode I can convert the encoding of the files in my working tree. I could commit this with a message like "converted encoding".
My question now is: is there a way to convert the complete history, either while converting from CVS to git or afterwards? My idea would be to write a script that reads each commit in the git repository, converts it to UTF-8, and commits it to a new git repository.
Is this possible? (I am unsure about the hash codes and how to walk through the commits, branches and tags.) Or is there a tool that can handle something like this?
You can do this with git filter-branch. The idea is that you have to change the encoding of the files in every commit, rewriting each commit as you go.
First, write a script that changes the encoding of every file in the repository. It could look like this:
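#!/bin/sh
# Re-encode every regular file below the current directory from ISO-8859-1 to UTF-8.
# Note: this touches every file, including binary files and files already in UTF-8.
find . -type f -print | while IFS= read -r f; do
    # Copy rather than move, so "$f" is overwritten in place and keeps its mode
    # (for example the executable bit).
    cp "$f" "$f.recode.$$"
    iconv -f iso-8859-1 -t utf-8 < "$f.recode.$$" > "$f"
    rm -f "$f.recode.$$"
done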
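The script above re-encodes every file unconditionally, so a file that is already UTF-8 would be encoded a second time. A more cautious variant could first check whether the file already decodes as UTF-8 and skip it if so. This is only a sketch: the function names recode_if_needed and recode_tree are made up for illustration, and binary files that happen not to be valid UTF-8 would still be rewritten, so a tree containing binaries needs an additional filter (for example by file extension).

```shell
#!/bin/sh
# Hypothetical helper: recode one file from ISO-8859-1 to UTF-8,
# unless it already decodes cleanly as UTF-8 (which includes pure ASCII).
recode_if_needed() {
    f=$1
    # iconv exits non-zero when its input is not valid UTF-8.
    if iconv -f utf-8 -t utf-8 < "$f" > /dev/null 2>&1; then
        return 0    # already UTF-8: leave the file untouched
    fi
    tmp="$f.recode.$$"
    # Every byte sequence is valid ISO-8859-1, so this conversion cannot
    # fail on invalid input.
    cp "$f" "$tmp" &&
        iconv -f iso-8859-1 -t utf-8 < "$tmp" > "$f"
    rm -f "$tmp"
}

# Hypothetical wrapper: apply the helper to every regular file
# below the current directory.
recode_tree() {
    find . -type f -print | while IFS= read -r f; do
        recode_if_needed "$f"
    done
}
```

Ending the script with a call to recode_tree would apply the check to every file of each commit being rewritten.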
Then use git filter-branch to run this script over and over again, once per commit:
git filter-branch --tree-filter /tmp/recode-all-files HEAD
where /tmp/recode-all-files is the above script.
Right after the repository has been converted from CVS, you probably have just one branch in git, with a linear history back to the beginning. If you have several branches, you need to extend the git filter-branch command so that it rewrites all the commits, for example by passing -- --all in place of HEAD.
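For a repository with several branches and tags, the invocation could look roughly like the sketch below. The wrapper function rewrite_all_refs and its script-path argument are made up for illustration. The --tag-name-filter cat option asks filter-branch to move existing tags onto the rewritten commits under their old names, and -- --all selects every ref rather than only the current branch.

```shell
#!/bin/sh
# Hypothetical wrapper: rewrite every branch and tag with the given
# tree-filter script (pass an absolute path, e.g. /tmp/recode-all-files).
rewrite_all_refs() {
    script=$1
    git filter-branch \
        --tree-filter "$script" \
        --tag-name-filter cat \
        -- --all
}
```

It would be run from the top level of the repository as rewrite_all_refs /tmp/recode-all-files. Since every commit hash changes, it is safest to try this on a fresh clone first; filter-branch also keeps the original refs under refs/original/ as a backup.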