How to determine if Git handles a file as binary or as text?


Question

I know that Git somehow automatically detects whether a file is binary or text, and that gitattributes can be used to set this manually if needed. But is there also a way to ask Git how it treats a file?

So let's say I have a Git repository with two files in it: an ascii.dat file containing plain text and a binary.dat file containing random binary data. Git handles the first file as text and the second as binary. Now I want to write a Git web frontend that has a viewer for text files and a special viewer for binary files (displaying a hex dump, for example). Sure, I could implement my own text/binary check, but it would be more useful if the viewer relied on how Git itself handles these files.

So how can I ask Git whether it treats a file as text or binary?

Solution

builtin_diff() [1] calls diff_filespec_is_binary(), which calls buffer_is_binary(), which checks for any occurrence of a zero byte (NUL "character") in the first 8000 bytes (or the entire length if shorter).

I do not see that this "is it binary?" test is explicitly exposed by any command, though.
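Since no command seems to expose the test directly, the NUL-byte heuristic described above can be approximated outside of Git. The following is only a rough sketch: looks_binary is a hypothetical helper name, it assumes a head that supports the -c option, and it ignores any gitattributes settings, so it can disagree with what Git actually does for a given path.

```shell
# Hypothetical helper, not part of Git: returns 0 if a NUL byte occurs
# within the first 8000 bytes of the file named by "$1", 1 otherwise.
looks_binary() {
    # Number of bytes examined (at most 8000).
    total=$(head -c 8000 < "$1" | wc -c)
    # Same span with every NUL byte removed.
    nonnul=$(head -c 8000 < "$1" | tr -d '\000' | wc -c)
    # If the counts differ, at least one NUL byte was present.
    [ "$((total - nonnul))" -gt 0 ]
}
```

It could be used as looks_binary file-to-test && echo binary || echo "not binary". Because it re-implements the check instead of asking Git, it is best treated as a fallback.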

git merge-file directly uses buffer_is_binary(), so you may be able to make use of it:

git merge-file /dev/null /dev/null file-to-test

It seems to produce an error message like "error: Cannot merge binary files: file-to-test" and yields an exit status of 255 when given a binary file. I am not sure I would want to rely on this behavior, though.

Maybe git diff --numstat would be more reliable:

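# Succeeds (returns 0) if Git's diff machinery treats "$1" as binary.
isBinary() {
    # Binary entries in --numstat output begin with "-<TAB>-<TAB>".
    p=$(printf '%s\t-\t' -)
    # Diff against /dev/null so the file does not need to be tracked.
    t=$(git diff --no-index --numstat /dev/null "$1")
    case "$t" in "$p"*) return 0 ;; esac
    return 1
}
isBinary file-to-test && echo binary || echo not binary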

For binary files, the --numstat output should start with - TAB - TAB, so we just test for that.
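For a web frontend that needs to classify several files at once, the same prefix test could be applied to each line of --numstat output. This is a minimal sketch, not something taken from Git itself: classify_numstat is a hypothetical helper that reads numstat lines on standard input and assumes the default tab-separated format (not -z).

```shell
# Hypothetical helper: reads `git diff --numstat` lines on stdin and
# prints "binary" or "text", a TAB, and the path for each entry.
classify_numstat() {
    tab=$(printf '\t')
    # Fields are: added lines, deleted lines, path (rest of the line).
    while IFS="$tab" read -r added deleted path; do
        if [ "$added" = "-" ] && [ "$deleted" = "-" ]; then
            # Git prints "-" for both counts when it treats a file as binary.
            printf 'binary\t%s\n' "$path"
        else
            printf 'text\t%s\n' "$path"
        fi
    done
}
```

For a single file it might be fed as git diff --no-index --numstat /dev/null file-to-test | classify_numstat. Note that an empty file produces no numstat line at all, so nothing is printed for it.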


[1] builtin_diff() contains strings like "Binary files %s and %s differ", which should look familiar.
