How to determine if Git handles a file as binary or as text?
Question
I know that Git somehow automatically detects whether a file is binary or text, and that gitattributes can be used to set this manually if needed. But is there also a way to ask Git how it treats a file?
So let's say I have a Git repository with two files in it: an ascii.dat file containing plain text and a binary.dat file containing random binary data. Git handles the first file as text and the second as binary. Now I want to write a Git web frontend which has a viewer for text files and a special viewer for binary files (displaying a hex dump, for example). Sure, I could implement my own text/binary check, but it would be more useful if the viewer relied on how Git itself handles these files.
So how can I ask Git whether it treats a file as text or binary?
Answer
builtin_diff() [1] calls diff_filespec_is_binary(), which calls buffer_is_binary(), which checks for any occurrence of a zero byte (NUL "character") in the first 8000 bytes (or the entire length if shorter).
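The NUL-byte check described above can be approximated outside of Git. The following is only a sketch of that heuristic, not Git's own code: the function name looks_binary is hypothetical, it assumes a head that supports -c, and it ignores gitattributes, which Git consults before falling back to the content check.

```shell
# Hypothetical helper (not part of Git): approximate the NUL-byte heuristic.
# Returns 0 (true) if a NUL byte occurs in the first 8000 bytes of "$1".
looks_binary() {
    # Size of the inspected prefix (at most 8000 bytes).
    total=$(( $(head -c 8000 <"$1" | wc -c) ))
    # Size of the same prefix after deleting every NUL byte.
    without_nul=$(( $(head -c 8000 <"$1" | tr -d '\000' | wc -c) ))
    # If deleting NULs shortened the prefix, at least one NUL was present.
    [ "$total" -ne "$without_nul" ]
}
```

It could be used as: looks_binary file-to-test && echo binary || echo not binary. A NUL byte that first appears after byte 8000 is not detected, which matches the description above.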
I do not see that this "is it binary?" test is explicitly exposed in any command though.
git merge-file directly uses buffer_is_binary(), so you may be able to make use of it:
git merge-file /dev/null /dev/null file-to-test
It seems to produce an error message like "error: Cannot merge binary files: file-to-test" and yields an exit status of 255 when given a binary file. I am not sure I would want to rely on this behavior though.
Maybe git diff --numstat would be more reliable:
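isBinary() {
    # Prefix that --numstat prints for a binary file: "-", TAB, "-", TAB
    p=$(printf '%s\t-\t' -)
    # Diff the file against /dev/null; "--" keeps a leading "-" in "$1" from being read as an option
    t=$(git diff --no-index --numstat -- /dev/null "$1")
    # Succeed only if the output starts with the binary prefix
    case "$t" in "$p"*) return 0 ;; esac
    return 1
}
isBinary file-to-test && echo binary || echo not binary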
For binary files, the --numstat output should start with - TAB - TAB, so we just test for that.
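For a web frontend that needs to classify many files at once, the same "- TAB - TAB" test could be applied to every line of --numstat output rather than to one file at a time. This is a minimal sketch under assumptions: the function name classify_numstat is hypothetical, and it does not handle the special forms Git may print for renamed paths or for paths it quotes.

```shell
# Hypothetical helper (not part of Git): read `git diff --numstat` lines on
# stdin (added TAB deleted TAB path) and print "binary" or "text" per path.
classify_numstat() {
    tab=$(printf '\t')
    # Split each line on TABs only; the path receives the rest of the line.
    while IFS="$tab" read -r added deleted path; do
        if [ "$added" = "-" ] && [ "$deleted" = "-" ]; then
            # Both counts are "-": Git treated the file as binary.
            printf 'binary\t%s\n' "$path"
        else
            printf 'text\t%s\n' "$path"
        fi
    done
}
# Possible usage (the two commits are placeholders):
#   git diff --numstat HEAD~1 HEAD | classify_numstat
```

Reading the counts as separate fields avoids building the prefix string by hand, but the result still depends on the --numstat output format staying as described above.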
[1] builtin_diff() has strings like "Binary files %s and %s differ" that should be familiar.