Character encoding errors with .NET Core on Linux


Question

This has been driving me batty for days, and I've finally narrowed it down to a simple, reproducible issue.

I have an NUnit test project that targets .NET Core 2.1. It references a library (let's call it "Core") that targets .NET Standard 2.0.

In my test project:

[TestCase(true, false)]
[TestCase(false, false)]
[TestCase(false, true)]
public void ShouldStartWith(bool useInternal, bool passStartsWith)
{
    var result = useInternal ? StartsWithQ("¿Que?") : StringUtilities.StartsWithQ("¿Que?", passStartsWith ? "¿" : null);
    result.ShouldBeTrue();
}

public static bool StartsWithQ(string s)
{
    return _q.Any(q => s.StartsWith(q, StringComparison.InvariantCultureIgnoreCase));
}

And in the Core project, in the StringUtilities class:

public static bool StartsWithQ(string s, string startsWith = null)
{
    return startsWith == null
        ? _q.Any(q => s.StartsWith(q, StringComparison.InvariantCultureIgnoreCase))
        : s.StartsWith(startsWith, StringComparison.InvariantCultureIgnoreCase);
}

Both classes define the same list of special characters:

private static readonly List<string> _q = new List<string>
{
    "¡",
    "¿"
};

In a Windows environment, all test cases pass. But when the same tests run in a Linux environment, the test case ShouldStartWith(False,False) fails!

That means that when everything runs inside the test project, the string comparison works correctly, and it also works when the test passes the special character to the StringUtilities method. But when the comparison is made against a string that was compiled into the Core project, the special characters are no longer treated as equal!

Does anyone know why this happens? Is this a .NET bug? How can I work around it?

Recommended answer

The encodings of your source files most likely don't match each other, and/or don't match the compiler settings.

Example:

The source file containing public void ShouldStartWith(bool useInternal, bool passStartsWith) may be encoded as UTF-8, while the source file containing the list is encoded as Latin-1 (or something similar).

Working this through:

  • The UTF-8 representation of ¿ is 0xC2 0xBF.
  • The Latin-1 representation of ¿ is 0xBF.

So if the compiler interprets a source file as Latin-1 when the file was actually saved as UTF-8, it reads two bytes for ¿ and, under Latin-1, turns them into two characters (Â¿). The resulting string literal no longer matches a ¿ that was decoded correctly, and the comparison fails.
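The byte-level mismatch can be reproduced at runtime. The following is a minimal sketch, not the asker's code: the class and method names (EncodingMismatchDemo, ToHex, ReadUtf8AsLatin1) are hypothetical, and it only simulates what a decoder does when it reads UTF-8 bytes as Latin-1. It uses an ordinal comparison to keep the result independent of culture data, whereas the question uses InvariantCultureIgnoreCase.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Returns the bytes of `text` in the given encoding as hex, e.g. "C2 BF".
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    // Simulates a reader that decodes UTF-8 bytes as if they were Latin-1.
    public static string ReadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    // Prints the two representations and the effect of the misread.
    public static void Demo()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // "\u00BF" is the inverted question mark, written as an escape so this
        // file stays pure ASCII.
        Console.WriteLine(ToHex("\u00BF", Encoding.UTF8)); // C2 BF
        Console.WriteLine(ToHex("\u00BF", latin1));        // BF

        string misread = ReadUtf8AsLatin1("\u00BFQue?");
        Console.WriteLine(misread.Length);                                         // 6
        Console.WriteLine(misread.StartsWith("\u00BF", StringComparison.Ordinal)); // False
    }
}
```

The misread string has six characters instead of five because the single ¿ became U+00C2 followed by U+00BF, so it no longer starts with ¿.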

As already stated in the comments: the best way to overcome this is to save the source files in the encoding the compiler expects.
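If re-saving the files is not practical, one alternative (an assumption on my part, not something the answer proposes) is to write the special characters as Unicode escapes, so the source file contains only ASCII bytes and decodes the same way under UTF-8 and Latin-1. The class name EscapedPrefixes is hypothetical; the list and the method mirror the ones in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EscapedPrefixes
{
    // Same two characters as the original list, written as Unicode escapes
    // so the literals do not depend on the source file's encoding.
    private static readonly List<string> _q = new List<string>
    {
        "\u00A1", // inverted exclamation mark
        "\u00BF"  // inverted question mark
    };

    // True if `s` starts with any of the special characters.
    public static bool StartsWithQ(string s)
    {
        return _q.Any(q => s.StartsWith(q, StringComparison.InvariantCultureIgnoreCase));
    }
}
```

This trades readability for robustness. Saving every source file consistently, for example as UTF-8 with a byte order mark, keeps the literals readable and is closer to what the answer recommends.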

Another way to rule out the operating system as the source of the error: copy the compiled project (the DLLs; don't recompile the source on the other operating system) from one operating system to the other and run the code there. With the same compiler output, you should see the same behaviour on both operating systems.
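When comparing builds this way, it can help to print what the compiled string literals actually contain. The helper below is a hypothetical diagnostic (CodePointDump and Describe are names I made up), a small sketch that lists the UTF-16 code units of a string:

```csharp
using System;
using System.Linq;

public static class CodePointDump
{
    // Formats each UTF-16 code unit of `s` as U+XXXX, separated by spaces.
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

A correctly compiled ¿ literal gives U+00BF from Describe, while one that was read as Latin-1 from a UTF-8 file gives U+00C2 U+00BF. Logging Describe for each entry of the list from both builds shows which compilation produced the wrong literal.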
