Unexpected behavior when sorting strings with letters and dashes
Problem description
If I have a list of strings that contain only digits and dashes, they sort in ascending order like so:
s = s.OrderBy(t => t).ToList();
66-0616280-000
66-0616280-100
66-06162801000
66-06162801040
This is as expected.
However, if the strings contain letters, the sort order is somewhat unexpected. For example, here is the same list of strings with a trailing A replacing the final 0, and yes, it is sorted:
66-0616280-00A
66-0616280100A
66-0616280104A
66-0616280-10A
I would have expected them to sort like so:
66-0616280-00A
66-0616280-10A
66-0616280100A
66-0616280104A
Why does the sort behave differently when the strings contain letters than when they contain only digits?
Thanks in advance.
Recommended answer
It's because the default StringComparer is culture-sensitive. As far as I can tell, Comparer<string>.Default delegates to string.CompareTo(string), which uses the current culture:
This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture. For more information about word, string, and ordinal sorts, see System.Globalization.CompareOptions.
The documentation for CompareOptions then includes:
The .NET Framework uses three distinct ways of sorting: word sort, string sort, and ordinal sort. Word sort performs a culture-sensitive comparison of strings. Certain nonalphanumeric characters might have special weights assigned to them. For example, the hyphen ("-") might have a very small weight assigned to it so that "coop" and "co-op" appear next to each other in a sorted list. String sort is similar to word sort, except that there are no special cases. Therefore, all nonalphanumeric symbols come before all alphanumeric characters. Ordinal sort compares strings based on the Unicode values of each element of the string.
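To make the three sort kinds concrete, here is a minimal sketch that compares the same pair of strings under each one. The class name SortKindsDemo and its Compare helper are made up for illustration, and the invariant culture is an assumption chosen to keep the run independent of machine settings. Only the ordinal result is fixed by the character codes; the word-sort and string-sort results come from the culture's collation data and may vary by runtime and platform, so they are printed rather than asserted.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class; not part of the original question or answer.
public static class SortKindsDemo
{
    // Compares a and b with the invariant culture under the given options
    // and normalizes the result to -1, 0, or 1.
    public static int Compare(string a, string b, CompareOptions options) =>
        Math.Sign(CultureInfo.InvariantCulture.CompareInfo.Compare(a, b, options));

    public static void Main()
    {
        const string a = "66-0616280104A";
        const string b = "66-0616280-10A";

        // Word sort: the default culture-sensitive comparison (hyphen may get a small weight).
        Console.WriteLine($"word sort:    {Compare(a, b, CompareOptions.None)}");

        // String sort: culture-sensitive, but without special-casing the hyphen.
        Console.WriteLine($"string sort:  {Compare(a, b, CompareOptions.StringSort)}");

        // Ordinal sort: compares code units; '1' (U+0031) > '-' (U+002D), so this is positive.
        Console.WriteLine($"ordinal sort: {Compare(a, b, CompareOptions.Ordinal)}");
    }
}
```

A negative word-sort result here would correspond to the surprising order in the question, where 66-0616280104A lands before 66-0616280-10A.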
("Small weight" isn't quite the same as "ignored" as quoted in Andrei's answer, but the effects are similar here.)
If you specify StringComparer.Ordinal, you get results of:
66-0616280-00A
66-0616280-10A
66-0616280100A
66-0616280104A
Specify it as the second argument to OrderBy:
s = s.OrderBy(t => t, StringComparer.Ordinal).ToList();
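As a self-contained version of that one-liner, the following sketch sorts the four strings from the question with the ordinal comparer. The class name OrdinalSortDemo and the helper SortOrdinal are hypothetical names introduced for this example, and the input order is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the original question or answer.
public static class OrdinalSortDemo
{
    // Sorts by raw character values, independent of the current culture.
    public static List<string> SortOrdinal(IEnumerable<string> items) =>
        items.OrderBy(t => t, StringComparer.Ordinal).ToList();

    public static void Main()
    {
        var s = new List<string>
        {
            "66-0616280104A",
            "66-0616280-10A",
            "66-0616280100A",
            "66-0616280-00A",
        };

        // '-' (U+002D) sorts before every digit, so the dashed entries come first.
        foreach (var item in SortOrdinal(s))
        {
            Console.WriteLine(item);
        }
    }
}
```

Because ordinal comparison looks only at character codes, this order does not depend on the machine's culture settings.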
You can see the difference here:
Console.WriteLine(Comparer<string>.Default.Compare
("66-0616280104A", "66-0616280-10A"));
Console.WriteLine(StringComparer.Ordinal.Compare
("66-0616280104A", "66-0616280-10A"));