Could Git correctly display ISO Latin 1 accents in a UTF-8 terminal?
Question
In my MinTTY (Cygwin on Windows), git grep displays weird characters instead of accents:
On checking, the file type turns out to be:
ISO-8859 text, with very long lines, with CRLF line terminators
Meanwhile, my MinTTY is set up for UTF-8:
# Text
Font=Powerline Consolas
FontHeight=9
BoldAsFont=yes
BoldAsColour=yes
AllowBlinking=yes
Locale=C
Charset=UTF-8
# Terminal
Term=xterm-256color
Of course, when grepping through files from different repos, we never know which encoding a given file uses.
Is there a way to make git grep behave better?
PS (side question): what is the color spec for those accents (displayed here in yellow on blue)?
Recommended answer
git grep, much like grep, displays the contents of the file as it is in the working tree, without any transformation. Unlike grep, though, it pipes its output through less. less honors your environment's locale settings (e.g., the LC_* variables) and renders data accordingly.
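To see which locale setting a program such as less would most likely pick up, the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) can be sketched in Python. This is a simplified illustration, not what less literally does: real programs call setlocale(), and less also consults its own LESSCHARSET variable. The helper names effective_ctype and looks_utf8 are made up for this example.

```python
import os

def effective_ctype(env):
    """Return (variable, value) for the first non-empty locale variable.

    POSIX precedence for the character-type category is
    LC_ALL, then LC_CTYPE, then LANG; empty values count as unset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return var, value
    # Assumption: with nothing set, the default is typically the "C" locale.
    return None, "C"

def looks_utf8(locale_value):
    """Rough check: does the locale name mention UTF-8?"""
    lowered = locale_value.lower()
    return "utf-8" in lowered or "utf8" in lowered

if __name__ == "__main__":
    var, value = effective_ctype(os.environ)
    print(f"effective setting: {var}={value}, UTF-8: {looks_utf8(value)}")
```

Running this in the same shell where you run git grep shows whether the pager is being told to expect UTF-8.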
If your environment reports UTF-8 and your data is not valid UTF-8, less will escape the offending bytes as you're seeing here, since the alternative is usually either a replacement character or nothing at all, neither of which is very useful when viewing binary files.
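The three choices a UTF-8 viewer has for an invalid byte (escape it, substitute a replacement character, or drop it) can be illustrated with Python's built-in decoding error handlers. This is only an analogy for the trade-off; it does not reproduce the exact output format of less, and render_options is a name invented for this sketch.

```python
def render_options(raw: bytes) -> dict:
    """Decode bytes as UTF-8 using three different policies for invalid input."""
    return {
        # Keep the byte visible as an escape sequence such as \xe9.
        "escaped": raw.decode("utf-8", errors="backslashreplace"),
        # Substitute U+FFFD, the Unicode replacement character.
        "replaced": raw.decode("utf-8", errors="replace"),
        # Silently drop whatever cannot be decoded.
        "dropped": raw.decode("utf-8", errors="ignore"),
    }

if __name__ == "__main__":
    # "café" stored as ISO Latin 1: the é is the single byte 0xE9,
    # which is not a valid UTF-8 sequence on its own.
    sample = "café".encode("latin-1")
    for name, text in render_options(sample).items():
        print(f"{name:9}{text!r}")
```

Only the escaped form tells you which byte was actually in the file, which is why it is the more useful default for unknown data.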
Since less has no clue which encoding is being used, and different encodings map the same byte to different Unicode characters and hence to different UTF-8 sequences, there is no way for it to convert the data automatically. less doesn't even know whether the file is text or binary. file makes a guess about what kind of text is in the file, but it doesn't know for certain, and in the general case distinguishing between single-byte encodings requires extensive linguistic knowledge.
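A small Python sketch makes the ambiguity concrete: the single byte 0xE9 decodes without error under several single-byte encodings, each time to a different character with a different UTF-8 representation. The list of candidate encodings is an arbitrary selection for illustration, and decode_under is a hypothetical helper name.

```python
CANDIDATES = ["iso-8859-1", "iso-8859-5", "iso-8859-7", "cp437", "koi8-r"]

def decode_under(raw: bytes, encodings):
    """Map each encoding name to (decoded text, that text re-encoded as UTF-8)."""
    table = {}
    for enc in encodings:
        text = raw.decode(enc)
        table[enc] = (text, text.encode("utf-8"))
    return table

if __name__ == "__main__":
    ambiguous = b"\xe9"
    for enc, (text, utf8) in decode_under(ambiguous, CANDIDATES).items():
        print(f"{enc:12} U+{ord(text):04X} {text}  utf-8={utf8!r}")

    # The same byte on its own is not valid UTF-8 at all.
    try:
        ambiguous.decode("utf-8")
    except UnicodeDecodeError as exc:
        print("utf-8:", exc.reason)
```

Every one of these decodings succeeds, so nothing in the bytes themselves says which is right; only knowledge of the language or of where the file came from can settle it.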
So the answer is no: in the general case, this is not possible.