How to stop git from breaking encoding on checkout
I recently added a .gitattributes file to a C# repository with the following settings:
* text=auto
*.cs text diff=csharp
I renormalized the repository following these instructions from GitHub and it seemed to work OK.
The problem I have is that when I check out some files (not all of them) I see lots of weird characters mixed in with the actual code. It seems to happen when git runs the files through the lf->crlf conversion specified by the .gitattributes file above.
According to Notepad++, the files that get messed up use UCS-2 Little Endian or UCS-2 Big Endian encoding. The files that seem to work OK are either ANSI or UTF-8 encoded.
For reference, my git version is 1.8.0.msysgit.0 and my OS is Windows 8.
Any ideas how I can fix this? Would changing the encoding of the files be enough?
This happens when a file uses an encoding in which every character takes two bytes, such as UTF-16 (which Notepad++ reports as UCS-2).
In UTF-16 Big Endian, CRLF is encoded as \0\r\0\n.
Git treats the file as a single-byte encoding, sees a \n byte that is not preceded by a \r byte, and turns the sequence into \0\r\0\r\n.
The extra byte shifts the next line by one byte, causing every other line to be full of Chinese characters (because the \0 becomes the low-order byte rather than the high-order byte).
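The byte shift described above can be reproduced in a few lines. The following is a minimal sketch in Python (chosen only for illustration, not the C# used elsewhere in this answer). `naive_lf_to_crlf` and `corrupt_utf16_be` are hypothetical names; the first is a stand-in for a byte-level LF to CRLF conversion and is not Git's actual code. The sketch assumes the file is UTF-16 Big Endian.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Insert a CR byte before every LF byte that is not already preceded by CR.

    Hypothetical stand-in for a converter that treats the file as
    single-byte text; it knows nothing about UTF-16.
    """
    out = bytearray()
    previous = None
    for byte in data:
        if byte == 0x0A and previous != 0x0D:
            out.append(0x0D)
        out.append(byte)
        previous = byte
    return bytes(out)


def corrupt_utf16_be(text: str) -> str:
    """Encode as UTF-16 BE, run the byte-level conversion, and decode again."""
    converted = naive_lf_to_crlf(text.encode("utf-16-be"))
    # errors="replace" keeps the demo from raising if the shift leaves an odd byte count.
    return converted.decode("utf-16-be", errors="replace")


if __name__ == "__main__":
    # ascii() escapes non-ASCII characters so the result prints on any console.
    print(ascii(corrupt_utf16_be("ab\r\ncd\r\nef")))
```

With the three-line input used here, the middle line `cd` decodes as the CJK ideographs U+6300 and U+6400, while the last line `ef` is readable again because the second inserted byte restores the alignment. That matches the "every other line" symptom.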
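You can convert files to UTF-8 using this LINQPad script:
// Root folder to scan (the path is elided in the original answer).
const string path = @"C:\...";
foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
{
    // Skip every file whose extension is not in this list; adjust the list as needed.
    if (!new [] { ".html", ".js" }.Contains(Path.GetExtension(file)))
        continue;
    // Rewrite the file with CRLF line endings as UTF-8 with a byte order mark.
    File.WriteAllText(file, String.Join("\r\n", File.ReadAllLines(file)), new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));
    file.Dump();  // LINQPad extension method: print the path that was processed
}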
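Before running a conversion like the one above, it can help to know which files are actually UTF-16. The following is a minimal Python sketch, assuming the files start with a byte order mark (files without one are not detected). `detect_utf16_bom` and `find_utf16_files` are hypothetical helper names, and the `.cs` default is only an assumption based on the question.

```python
import codecs
from pathlib import Path


def detect_utf16_bom(head: bytes):
    """Return 'utf-16-le', 'utf-16-be', or None based on a leading BOM."""
    # The UTF-32 LE BOM begins with the UTF-16 LE BOM, so rule UTF-32 out first.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return None
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


def find_utf16_files(root, suffixes=(".cs",)):
    """List (path, encoding) pairs for files under root that carry a UTF-16 BOM."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            with path.open("rb") as handle:
                encoding = detect_utf16_bom(handle.read(4))
            if encoding:
                hits.append((path, encoding))
    return hits
```

Calling `find_utf16_files(".")` from the repository root would list the candidates for conversion.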
This will not fix files that are already broken; you can repair those by replacing \r\n with \n in a hex editor. I don't have a LINQPad script for that (since there's no simple Replace() method for byte[]s).
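The byte-level replacement can also be scripted. The following is a minimal sketch in Python rather than LINQPad, since Python's `bytes` type has a `replace` method. `undo_inserted_cr` and `repair_file` are hypothetical names. The sketch assumes every adjacent CR LF byte pair in the file was introduced by the checkout, so back up the files first.

```python
from pathlib import Path


def undo_inserted_cr(data: bytes) -> bytes:
    """Replace every CR LF byte pair with a lone LF byte."""
    return data.replace(b"\r\n", b"\n")


def repair_file(path: str) -> bool:
    """Rewrite the file at `path` in place; return True if anything changed."""
    target = Path(path)
    original = target.read_bytes()
    repaired = undo_inserted_cr(original)
    if repaired == original:
        return False
    target.write_bytes(repaired)
    return True
```

A UTF-16 file can legitimately contain the bytes 0x0D 0x0A next to each other (for example the single character U+0D0A in Big Endian), and this sketch would alter those too. For ordinary source code that should be rare, but it is worth checking the result.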