PowerShell Sort of Strings with Underscores

Question

The following list does not sort properly (IMHO):

$a = @( 'ABCZ', 'ABC_', 'ABCA' )
$a | sort
ABC_
ABCA
ABCZ

My handy ASCII chart and the Unicode C0 Controls and Basic Latin chart give the underscore (low line) an ordinal of 95 (U+005F). That is a higher number than any of the capital letters A-Z, so Sort should have put the string ending with an underscore last.

Get-Culture is en-US.
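The expectation above can be checked directly. The sketch below is an illustration, not part of the original post: it prints the code points involved and then compares two of the sample strings once by code point and once linguistically. The invariant culture is used here as an assumed stand-in for en-US.

```powershell
# Code points of the last characters in the three sample strings.
$codes = 'A', 'Z', '_' | ForEach-Object { [int][char]$_ }
$codes   # 65, 90, 95

# Ordinal comparison looks only at code points, so 'ABC_' ranks after 'ABCA'.
$ordinal = [string]::CompareOrdinal('ABC_', 'ABCA')      # positive

# Linguistic comparison ranks punctuation before letters, so 'ABC_' ranks first.
# Assumption: the invariant culture behaves like en-US for these characters.
$linguistic = [string]::Compare('ABC_', 'ABCA', [System.StringComparison]::InvariantCulture)   # negative
```

A positive result means the first string sorts after the second, a negative result means it sorts before.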

The next set of commands does what I expect:

$a = @( 'ABCZ', 'ABC_', 'ABCA' )
[System.Collections.ArrayList] $al = $a
$al.Sort( [System.StringComparer]::Ordinal )
$al
ABCA
ABCZ
ABC_

Now I create an ANSI-encoded file containing those same 3 strings:

Get-Content -Encoding Byte data.txt
65 66 67 90 13 10  65 66 67 95 13 10  65 66 67 65 13 10
$a = Get-Content data.txt
[System.Collections.ArrayList] $al = $a
$al.Sort( [System.StringComparer]::Ordinal )
$al
ABC_
ABCA
ABCZ

Once more, the string containing the underscore (low line) is not sorted correctly. What am I missing?

Let's refer to example 4:

'A' -lt '_'
False
[char] 'A' -lt [char] '_'
True

It seems like both statements should be False or both should be True. I'm comparing strings in the first statement and Char values in the second. A string is merely a collection of Char values, so I think the two comparison operations should be equivalent.

Now for example #5:

Get-Content -Encoding Byte data.txt
65 66 67 90 13 10  65 66 67 95 13 10  65 66 67 65 13 10
$a = Get-Content data.txt
$b = @( 'ABCZ', 'ABC_', 'ABCA' )
$a[0] -eq $b[0]; $a[1] -eq $b[1]; $a[2] -eq $b[2];
True
True
True
[System.Collections.ArrayList] $al = $a
[System.Collections.ArrayList] $bl = $b
$al[0] -eq $bl[0]; $al[1] -eq $bl[1]; $al[2] -eq $bl[2];
True
True
True
$al.Sort( [System.StringComparer]::Ordinal )
$bl.Sort( [System.StringComparer]::Ordinal )
$al
ABC_
ABCA
ABCZ
$bl
ABCA
ABCZ
ABC_

The two ArrayLists contain the same strings, but are sorted differently. Why?

Recommended answer

In many cases PowerShell wraps objects in a PSObject or unwraps them from one. In most cases this is done transparently and you do not even notice it, but in your case it is what causes the trouble.

$a='ABCZ', 'ABC_', 'ABCA'
$a|Set-Content data.txt
$b=Get-Content data.txt

[Type]::GetTypeArray($a).FullName
# System.String
# System.String
# System.String
[Type]::GetTypeArray($b).FullName
# System.Management.Automation.PSObject
# System.Management.Automation.PSObject
# System.Management.Automation.PSObject

As you can see, the objects returned from Get-Content are wrapped in PSObject, which prevents StringComparer from seeing the underlying strings and comparing them properly. A strongly typed string collection cannot store PSObjects, so PowerShell unwraps the strings in order to store them in such a collection, which allows StringComparer to see the strings and compare them properly.
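The unwrapping described above can be put to use as a workaround. This is a minimal sketch, assuming a scratch file created with GetTempFileName (the path and the variable names are illustrative, not from the question): the `[string[]]` constraint converts each line to a plain System.String before the ArrayList is built, so the ordinal comparer receives real strings.

```powershell
# Hypothetical scratch file holding the three sample strings.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'ABCZ', 'ABC_', 'ABCA'

# The [string[]] constraint converts every line to a plain System.String.
[string[]] $typed = Get-Content -Path $path

# Same ArrayList sort as in the question, now over unwrapped strings.
[System.Collections.ArrayList] $al = $typed
$al.Sort([System.StringComparer]::Ordinal)
$al   # ABCA, ABCZ, ABC_

Remove-Item -Path $path
```

The only change from the question's code is the typed intermediate array, everything else is the same ArrayList and the same comparer.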

First of all, when you write $a[1].GetType() or $b[1].GetType(), you do not call .NET methods but PowerShell methods, which normally call the .NET methods on the wrapped object. Thus you cannot get the real type of an object this way. What is more, they can be overridden; consider this code:

$c='String'|Add-Member -Type ScriptMethod -Name GetType -Value {[int]} -Force -PassThru
$c.GetType().FullName
# System.Int32

Let us call the .NET method through reflection:

$GetType=[Object].GetMethod('GetType')
$GetType.Invoke($c,$null).FullName
# System.String
$GetType.Invoke($a[1],$null).FullName
# System.String
$GetType.Invoke($b[1],$null).FullName
# System.String

Now we get the real type for $c, but it says that the type of $b[1] is String, not PSObject. As I said, in most cases the unwrapping is done transparently, so you see the wrapped String and not the PSObject itself. One particular case where it does not happen is when you pass an array: the array elements are not unwrapped. So let us add an additional level of indirection here:

$Invoke=[Reflection.MethodInfo].GetMethod('Invoke',[Type[]]([Object],[Object[]]))
$Invoke.Invoke($GetType,($a[1],$null)).FullName
# System.String
$Invoke.Invoke($GetType,($b[1],$null)).FullName
# System.Management.Automation.PSObject

Now, because we pass $b[1] as part of an array, we can see its real type: PSObject. That said, I prefer to use [Type]::GetTypeArray instead.

About StringComparer: when the two compared objects are not both strings, StringComparer relies on IComparable.CompareTo for the comparison. PSObject implements the IComparable interface, so the sort is done according to PSObject's IComparable implementation.
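This also lines up with the question's fourth example. The sketch below reproduces the string-versus-Char difference in isolation and then calls LanguagePrimitives.Compare, the public entry point to PowerShell's own comparison rules. That PSObject's IComparable implementation defers to those same rules is an assumption on my part, not something shown above.

```powershell
# String operands: PowerShell compares linguistically, so '_' ranks before 'A'.
$stringResult = 'A' -lt '_'                # False

# Char operands: compared by code point (65 versus 95).
$charResult = [char] 'A' -lt [char] '_'    # True

# PowerShell's own comparison rules applied to two of the sample strings.
# Assumption: a PSObject-wrapped string is compared this way during the sort.
$psCompare = [System.Management.Automation.LanguagePrimitives]::Compare('ABC_', 'ABCA')   # negative
```

A negative result places 'ABC_' before 'ABCA', which matches the order seen when the ArrayList holds wrapped lines from Get-Content.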
