Character encoding errors with .NET Core on Linux


Question

This has been driving me batty for days, and I've finally got it down to a simple, reproducible issue.

I have an NUnit test project targeting .NET Core 2.1. It references a library (let's call it "Core") that targets .NET Standard 2.0.

In my test project:

[TestCase(true, false)]
[TestCase(false, false)]
[TestCase(false, true)]
public void ShouldStartWith(bool useInternal, bool passStartsWith)
{
    var result = useInternal ? StartsWithQ("¿Que?") : StringUtilities.StartsWithQ("¿Que?", passStartsWith ? "¿" : null);
    result.ShouldBeTrue();
}

public static bool StartsWithQ(string s)
{
    return _q.Any(q => s.StartsWith(q, StringComparison.InvariantCultureIgnoreCase));
}

In the StringUtilities class of the Core project:

public static bool StartsWithQ(string s, string startsWith = null)
{
    return startsWith == null
        ? _q.Any(q => s.StartsWith(q, StringComparison.InvariantCultureIgnoreCase))
        : s.StartsWith(startsWith, StringComparison.InvariantCultureIgnoreCase);
}

Both classes have defined a list of special characters:

private static readonly List<string> _q = new List<string>
{
    "¡",
    "¿"
};

In a Windows environment, all test cases pass. But when the same tests run in the Linux environment, the test case ShouldStartWith(False,False) fails!

That means that when everything is running in the test project, the string comparison works correctly, and even if you pass the special characters to the StringUtilities method, the comparison works. But when you compare to a string that was compiled in the Core project, the special characters are no longer equivalent!

Does anyone know why this happens? Is this a .NET bug? How can I work around it?

Recommended answer

The encodings of your source files most likely don't match each other, don't match the compiler settings, or both.

Example:

The source file containing public void ShouldStartWith(bool useInternal, bool passStartsWith) may be encoded as UTF-8, while the source file containing the list is encoded as Latin-1 (or something similar).

Playing this through:

  • The UTF-8 representation of ¿ is: 0xC2 0xBF.
  • The Latin-1 representation of ¿ is: 0xBF.

Thus, if the compiler interprets your source files as Latin-1, it reads two bytes for a ¿ that was saved as UTF-8, and under Latin-1 those two bytes are also two characters (Â followed by ¿). The resulting string therefore fails to match a ¿ that was read correctly.
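The byte-level mismatch can be reproduced outside the compiler. The following is only a sketch of the effect, not of what the compiler actually does: the class and method names are hypothetical, and the special character is written as the escape \u00BF so that the demo itself does not depend on how its own source file is saved. It uses an ordinal comparison rather than the question's InvariantCultureIgnoreCase to keep the result independent of culture data.

```csharp
using System;
using System.Text;

// Hypothetical demo class; none of these names come from the original post.
public static class EncodingMismatchDemo
{
    // Encodes text as UTF-8, then decodes those bytes as Latin-1,
    // mimicking a reader that assumes the wrong source encoding.
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    public static void Main()
    {
        string mark = "\u00BF"; // the inverted question mark

        // Show the two byte representations of the same character.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(mark)));
        Console.WriteLine(BitConverter.ToString(
            Encoding.GetEncoding("iso-8859-1").GetBytes(mark)));

        // A UTF-8 literal misread as Latin-1 becomes two characters.
        string misread = MisreadUtf8AsLatin1(mark + "Que?");
        Console.WriteLine(misread.Length);

        // The misread string no longer starts with the intended character.
        Console.WriteLine(misread.StartsWith(mark, StringComparison.Ordinal));
    }
}
```

Writing such characters as \u escapes in string literals is also one way to make the literals independent of the source file encoding, at the cost of readability.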

As already stated in the comments: the best way to overcome this is to save the source files in the encoding the compiler expects.
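To find out which source files are affected, one option is to inspect their raw bytes. The sketch below assumes UTF-8 is the encoding you settle on (the answer does not say which encoding the compiler expects in your setup); the class and method names are hypothetical. It reports whether each file passed on the command line starts with a UTF-8 byte order mark and whether its bytes decode as strict UTF-8.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; not part of the original post.
public static class SourceEncodingCheck
{
    // True if the bytes start with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    // True if the bytes decode as UTF-8 without any invalid sequences.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes makes the decoder throw instead of
        // silently substituting replacement characters.
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main(string[] args)
    {
        // Each argument is treated as the path of a source file to check.
        foreach (string path in args)
        {
            byte[] bytes = File.ReadAllBytes(path);
            Console.WriteLine(
                $"{path}: BOM={HasUtf8Bom(bytes)}, valid UTF-8={IsValidUtf8(bytes)}");
        }
    }
}
```

A file that contains ¿ as the single byte 0xBF is reported as invalid UTF-8, which marks it as a candidate for re-saving. A file made up only of ASCII characters passes the check whatever encoding it was nominally saved in, so this is a heuristic rather than proof.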

Another way to rule out the operating system as the source of the error: copy the compiled project (the DLLs; don't recompile the source on the other operating system) from one operating system to the other and execute the code. With the same compiled binaries, you should see the same behaviour on both operating systems.
