Could Git correctly display ISO Latin 1 accents in a UTF-8 terminal?


Question

In my MinTTY (Cygwin on Windows), git grep displays weird characters in place of accented letters.

Upon verification, the file type appears to be:

ISO-8859 text, with very long lines, with CRLF line terminators

Meanwhile, my MinTTY is set up for UTF-8:

# Text
Font=Powerline Consolas
FontHeight=9
BoldAsFont=yes
BoldAsColour=yes
AllowBlinking=yes
Locale=C
Charset=UTF-8

# Terminal
Term=xterm-256color

Of course, when grepping through files from different repos, we never know which encoding each file uses.

Is there a way to make git grep behave better?

PS (side question): what is the color spec for those accents (here displayed in yellow on blue)?

Recommended answer

git grep, much like grep, displays the contents of the file as they are in the working tree, without any transformation. Unlike grep, though, it pipes its output through a pager, which is less by default. less takes its character set from your locale settings (e.g., the LC_* variables) and renders the data accordingly.

If your environment reports UTF-8 and the data is not valid UTF-8, less shows the offending bytes in an escaped form, as you're seeing here. The usual alternatives are a replacement character or nothing at all, neither of which is very useful when viewing binary files.
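To make the point concrete, here is a minimal Python sketch of why a Latin 1 accent cannot be shown as-is in a UTF-8 context. The function name show_invalid_bytes and the <XX> escape format are illustrative assumptions for this example; they imitate the general idea of escaping undecodable bytes and are not a reproduction of what less does internally.

```python
def show_invalid_bytes(data: bytes) -> str:
    """Decode data as UTF-8, rendering each undecodable byte as <XX>.

    Hypothetical helper for illustration only.
    """
    # surrogateescape maps an undecodable byte 0xNN to the lone surrogate U+DCNN
    text = data.decode("utf-8", errors="surrogateescape")
    parts = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            parts.append(f"<{code - 0xDC00:02X}>")
        else:
            parts.append(ch)
    return "".join(parts)


word = "caf\u00e9"                    # "cafe" with an acute accent on the e
latin1_bytes = word.encode("iso-8859-1")   # the accent is the single byte 0xE9
utf8_bytes = word.encode("utf-8")          # the accent is the two bytes 0xC3 0xA9

print(show_invalid_bytes(latin1_bytes))        # caf<E9>
print(show_invalid_bytes(utf8_bytes) == word)  # True
```

The single byte 0xE9 is a complete character in ISO Latin 1, but in UTF-8 it is only the start of a multi-byte sequence, so a UTF-8 decoder has nothing valid to display for it.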

Since less has no clue which encoding is being used, and different encodings map the same byte to different Unicode characters and hence to different UTF-8 sequences, there's no way for it to convert the data automatically. less doesn't even know whether the file is text or binary. file makes a guess about what kind of text is in the file, but it doesn't know for certain, and in the general case distinguishing between single-byte encodings requires extensive linguistic knowledge.
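The ambiguity is easy to see with a short Python sketch. The three encodings below are an arbitrary selection chosen for illustration; the same byte, 0xE9, decodes without error in all of them but yields a different character each time.

```python
import unicodedata

raw = b"\xe9"  # one byte, as it might appear in a file of unknown encoding

# Arbitrary sample of single-byte encodings (an assumption for this demo)
candidates = ["iso-8859-1", "iso-8859-5", "iso-8859-7"]

decoded = {name: raw.decode(name) for name in candidates}

for name, ch in decoded.items():
    # Print only ASCII: code point, UTF-8 bytes in hex, and the Unicode name
    print(f"{name}: U+{ord(ch):04X} utf-8={ch.encode('utf-8').hex()} "
          f"{unicodedata.name(ch)}")
```

Every decoding succeeds, so nothing in the byte itself says which interpretation is the intended one. Only knowledge of the language of the surrounding text can settle it.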

So the answer is no: in the general case, this is not possible.
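The general case is out of reach, but a narrower one is workable: if you are willing to assume that anything which is not valid UTF-8 is in one known legacy encoding, a fallback decoder can be sketched as below. The function name to_text and the iso-8859-1 default are assumptions made for this example, not something Git or less provides.

```python
def to_text(line: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode a line as UTF-8 if it is valid, else with an assumed fallback.

    Hypothetical helper: the fallback encoding is a guess supplied by the
    caller, not something detected from the data.
    """
    try:
        return line.decode("utf-8")
    except UnicodeDecodeError:
        return line.decode(fallback)


same_word = "caf\u00e9"
print(to_text(b"caf\xc3\xa9") == same_word)  # UTF-8 input
print(to_text(b"caf\xe9") == same_word)      # Latin 1 input, via the fallback
```

This works because UTF-8 is strict enough that text in a legacy single-byte encoding rarely happens to be valid UTF-8. It will still produce the wrong characters whenever the assumed fallback encoding is not the real one, which is exactly the limitation described above.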
