Git messes up non-ASCII characters on Linux container

Problem description

I have a .Net Core (C#) project with the following line in one of the classes:

var input = "£";

But when I do a git clone in a Docker container (microsoft/dotnet:2.2-sdk), the character gets messed up and is displayed as � (in bash, using cat).

And when I run it, its UTF-8 bytes are [239, 191, 189] = [EF, BF, BD], which seems to be the so-called Unicode replacement character.

The Windows editor that I use is VS 2017, but the character is displayed properly on other Windows machines and is parsed properly by the dotnet run/test commands, so I don't think this is a problem of failing to save the character correctly.

Any ideas why I am seeing such a mess and how to solve it?

Some details

  • I get the bytes using Encoding.UTF8.GetBytes("£");
  • It works perfectly well on a Windows 10 machine
  • Linux version is Debian GNU/Linux 9 (stretch), from cat /etc/os-release
  • locale -a returns C C.UTF-8 POSIX
  • In Notepad++ on Windows, the file, when opened, claims to be ANSI and is displayed correctly.

Running fgrep 'var input' file.cs | od -tx1 -c gives:

0000100  76  61  72  20  69  6e  70  75  74  20  3d  20  22  a3  22  3b
          v   a   r       i   n   p   u   t       =       " 243   "   ;


Recommended answer

Your file contains a single byte a3, which is the Windows-1252 encoding of the character £. Your Linux system displays � because a lone a3 byte is not a valid UTF-8 encoding.
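
To make the byte-level reasoning concrete, here is a small sketch in Python (used purely for illustration; the project itself is C#). It assumes a lenient decoder that substitutes U+FFFD for invalid input, which is likely what happens somewhere between the file and the UTF-8 bytes reported in the question. The variable names are illustrative only.

```python
# The single byte that od found between the quotes in the source file.
raw = b"\xa3"

# Windows-1252 maps 0xA3 to the pound sign (U+00A3).
as_cp1252 = raw.decode("cp1252")

# A lone 0xA3 is a UTF-8 continuation byte with no lead byte, so it is
# invalid UTF-8; a lenient decoder substitutes U+FFFD for it.
as_utf8 = raw.decode("utf-8", errors="replace")

# Encoding U+FFFD as UTF-8 gives the three bytes seen in the question.
replacement_bytes = list(as_utf8.encode("utf-8"))

# What the file would contain if it had been saved as UTF-8.
correct_utf8 = "\u00a3".encode("utf-8")

print(hex(ord(as_cp1252)), hex(ord(as_utf8)), replacement_bytes, correct_utf8.hex())
```

So Git most likely delivers the bytes unchanged; the difference is in how each platform interprets a file that was never UTF-8 to begin with.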

You should configure Visual Studio to use UTF-8 instead of Windows-1252.
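
Files that were already saved as Windows-1252 would still need to be re-encoded once. The following is a minimal sketch of that conversion in Python, not part of the original answer. The helper name to_utf8 and the fallback to cp1252 are assumptions made for illustration; it only works if the non-UTF-8 files really are Windows-1252.

```python
def to_utf8(data: bytes, fallback_encoding: str = "cp1252") -> bytes:
    """Return data re-encoded as UTF-8, leaving valid UTF-8 untouched.

    Hypothetical helper: assumes anything that is not valid UTF-8
    was written in fallback_encoding (Windows-1252 by default).
    """
    try:
        data.decode("utf-8")  # strict decode: fails on a lone 0xA3
        return data
    except UnicodeDecodeError:
        return data.decode(fallback_encoding).encode("utf-8")


# The line from the question, as raw bytes taken from the od dump.
line = b'var input = "\xa3";'
print(to_utf8(line))
```

For a real file, the same function could be applied to the bytes read with open(path, "rb") and the result written back with open(path, "wb"), after which the change would be committed so the container clones a UTF-8 file.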
