StringComparer and Equals/== producing different results for encoded strings

Published: 2019/6/19 3:10:27, netfxbcl


Problem description


I created the following code snippet (for a programming course, so please ignore that it's not particularly useful):

using System;
using System.Diagnostics;
using System.Text;

string input = "Hello World";
byte[] data = Encoding.UTF32.GetBytes(input);
string garbage = Encoding.UTF8.GetString(data);
// garbage now contains 11*4 = 44 characters, of which 33 are \0's

// this test passes
Debug.Assert(input != garbage);

// we expect comparisons to produce the same result, that is, a non-zero result

// cultureCompare is 0!
int cultureCompare = StringComparer.CurrentCulture.Compare(input, garbage);

// invariantCompare is 0!
int invariantCompare = StringComparer.InvariantCulture.Compare(input, garbage);

// but ordinalCompare is 101
int ordinalCompare = StringComparer.Ordinal.Compare(input, garbage);

Is this a framework bug? If it isn't, it's at least inconsistent and also undocumented behavior.

Solution

Morten Mertner,

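I can reproduce this issue on my test machine in a simple C# console application. The CurrentCulture and InvariantCulture properties use the word comparison rules of the current/invariant culture; the Ordinal property, however, does not use the word comparison rules, because it performs a non-linguistic string comparison. Culture-sensitive comparisons ignore embedded null characters, which is why the 33 \0 characters in garbage do not affect those results, while the ordinal comparison compares every char value and reports a difference. The following two articles can help you understand this issue: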

1. From MSDN: New Recommendations for Using Strings in Microsoft .NET 2.0

DO: Use StringComparison.Ordinal or OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
DO: Use StringComparison.Ordinal and OrdinalIgnoreCase comparisons for increased speed.
DO: Use StringComparison.CurrentCulture-based string operations when displaying the output to the user.
DO: Switch current use of string operations based on the invariant culture to use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase when the comparison is linguistically irrelevant (symbolic, for example).
DO: Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison.
DON'T: Use overloads for string operations that don't explicitly or implicitly specify the string comparison mechanism.
DON'T: Use StringComparison.InvariantCulture-based string operations in most cases; one of the few exceptions would be persisting linguistically meaningful but culturally-agnostic data.
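
As a rough illustration of the recommendations above, the following sketch states the comparison mechanism explicitly on each call. The class and method names (ComparisonGuidelines, IsSameKey, IsSameHeader, IsHttpUrl, NormalizeKey) are hypothetical and chosen only for this example; the String and StringComparison members they call are standard .NET APIs.

```csharp
using System;

static class ComparisonGuidelines
{
    // Non-linguistic identifier: compare char values exactly.
    public static bool IsSameKey(string a, string b) =>
        string.Equals(a, b, StringComparison.Ordinal);

    // Identifier where case is irrelevant (an HTTP header name, for example).
    public static bool IsSameHeader(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    // Prefix test with the comparison spelled out instead of relying on a default.
    public static bool IsHttpUrl(string url) =>
        url.StartsWith("http://", StringComparison.OrdinalIgnoreCase);

    // Normalizing for comparison: upper-case with the invariant culture.
    public static string NormalizeKey(string s) => s.ToUpperInvariant();

    static void Main()
    {
        Console.WriteLine(IsSameKey("id", "ID"));                       // False
        Console.WriteLine(IsSameHeader("Content-Type", "content-type")); // True
        Console.WriteLine(IsHttpUrl("HTTP://example.com"));              // True
        Console.WriteLine(NormalizeKey("abc"));                          // ABC
    }
}
```

Every call names its StringComparison value, so a reader does not need to remember which overloads default to culture-sensitive behavior.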

2. From BCL Blog: String.Compare() != String.Equals() [Josh Free]

| Data meaning | Data behavior | Corresponding StringComparison value |
| --- | --- | --- |
| Case-sensitive internal identifiers; case-sensitive identifiers in standards like XML and HTTP; case-sensitive security-related settings | A non-linguistic identifier, where bytes match exactly. | Ordinal |
| Case-insensitive internal identifiers; case-insensitive identifiers in standards like XML and HTTP; file paths; registry keys/values; environment variables; resource identifiers (handle names, for example); case-insensitive security-related settings | A non-linguistic identifier, where case is irrelevant, especially a piece of data stored in most Microsoft Windows system services. | OrdinalIgnoreCase |
| Some persisted linguistically-relevant data; display of linguistic data requiring a fixed sort order | Culturally-agnostic data, which still is linguistically relevant. | InvariantCulture or InvariantCultureIgnoreCase |
| Data displayed to the user; most user input | Data that requires local linguistic customs. | CurrentCulture or CurrentCultureIgnoreCase |
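
The mapping above could be applied roughly as follows. This is only a sketch: ComparerChoice, CreateSettings and SortForDisplay are hypothetical names introduced for illustration, while StringComparer, Dictionary and Array.Sort are standard .NET types and methods.

```csharp
using System;
using System.Collections.Generic;

static class ComparerChoice
{
    // Keys that behave like environment variable names: non-linguistic, case irrelevant.
    public static Dictionary<string, string> CreateSettings() =>
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Data shown to the user: sort a copy with the current culture's rules.
    public static string[] SortForDisplay(string[] names)
    {
        var copy = (string[])names.Clone();
        Array.Sort(copy, StringComparer.CurrentCulture);
        return copy;
    }

    static void Main()
    {
        var settings = CreateSettings();
        settings["Path"] = @"C:\Tools";
        Console.WriteLine(settings.ContainsKey("PATH")); // True

        // The resulting order depends on the current culture, so none is asserted here.
        foreach (string name in SortForDisplay(new[] { "pear", "Apple", "banana" }))
        {
            Console.WriteLine(name);
        }
    }
}
```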

Hope that helps.
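
To make the difference concrete, the sketch below rebuilds the string from the question and shows that it differs from the input only by embedded \0 characters. NullCharDemo and MakeGarbage are hypothetical names used for this example. The linguistic result is printed without an expected value, since it comes from the platform's culture data; the question reports 0 for it.

```csharp
using System;
using System.Linq;
using System.Text;

static class NullCharDemo
{
    // Same round trip as in the question: UTF-32 bytes decoded as UTF-8.
    public static string MakeGarbage(string input) =>
        Encoding.UTF8.GetString(Encoding.UTF32.GetBytes(input));

    static void Main()
    {
        string input = "Hello World";
        string garbage = MakeGarbage(input);

        Console.WriteLine(garbage.Length);                 // 44
        Console.WriteLine(garbage.Count(c => c == '\0'));  // 33

        // Ordinal comparison looks at every char value, including '\0'.
        Console.WriteLine(string.Equals(input, garbage, StringComparison.Ordinal)); // False

        // With the null characters removed (Replace is an ordinal operation),
        // the two strings are identical.
        string stripped = garbage.Replace("\0", string.Empty);
        Console.WriteLine(string.Equals(input, stripped, StringComparison.Ordinal)); // True

        // Linguistic comparison of the original pair, printed for inspection.
        Console.WriteLine(StringComparer.InvariantCulture.Compare(input, garbage));
    }
}
```

If a byte-exact answer is what the code needs, the ordinal comparison is the one to rely on; the culture-sensitive comparers answer a different, linguistic question.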


