PowerShell - Removing all duplicate entries

Problem description

I am trying to find a PowerShell command line that reads a text file and removes every line that occurs more than once (2+ occurrences), keeping none of the copies. I haven't been able to find an answer to this on Stack Overflow or anywhere else. Every example I have found so far removes the extra copies of a duplicated line but keeps one.

Is this possible in PowerShell 2.0?

Example PowerShell command:

Get-Content "C:\Temp\OriginalFile.txt" | select -unique | Out-File "C:\Temp\ResultFile.txt"

OriginalFile.txt

1
1
1
2
2
3
4

ResultFile.txt (actual)

1
2
3
4

ResultFile.txt (desired)

3
4

Recommended answer

PSv2:

$f = 'C:\Temp\OriginalFile.txt'

Get-Content $f | Group-Object | ? { $_.Count -eq 1 } | Select-Object -ExpandProperty Name

PSv3+ allows for a cleaner and more concise solution:

Get-Content $f | Group-Object | ? Count -eq 1 | % Name

For brevity, the commands use the built-in aliases ? (for Where-Object) and % (for ForEach-Object).
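To connect this back to the question, which writes its result to a file, the following sketch runs the PSv2-compatible pipeline end to end. The temporary file locations and the use of Set-Content for output are assumptions made for illustration; the question itself used C:\Temp paths and Out-File.

```powershell
# Hypothetical temp files standing in for the question's C:\Temp paths.
$tempDir    = [System.IO.Path]::GetTempPath()
$sourceFile = Join-Path $tempDir 'OriginalFile.txt'
$resultFile = Join-Path $tempDir 'ResultFile.txt'

# Recreate the sample input from the question.
'1','1','1','2','2','3','4' | Set-Content $sourceFile

# Keep only the lines that occur exactly once, then write them out.
Get-Content $sourceFile |
    Group-Object |
    Where-Object { $_.Count -eq 1 } |
    Select-Object -ExpandProperty Name |
    Set-Content $resultFile

# Show what ended up in the result file.
Get-Content $resultFile
```

With the sample input, the result file should contain only the lines 3 and 4. Out-File, as used in the question, should work just as well for plain strings.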

Neither Select-Object -Unique nor Get-Unique seems to allow restricting the output to the singletons in the input (the standard Unix utility uniq has this built in: uniq -u), so a different approach is needed.
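The behavior of the two cmdlets can be checked against the question's sample data. This is a small sketch with the input held in a hypothetical $lines array rather than read from a file; the variable names are illustrative only.

```powershell
# Sample input from the question (in-memory stand-in for the file).
$lines = '1','1','1','2','2','3','4'

# Select-Object -Unique keeps one copy of each distinct value.
$viaSelect = $lines | Select-Object -Unique

# Get-Unique only collapses adjacent duplicates, so it expects sorted input;
# the sample happens to be sorted already.
$viaGetUnique = $lines | Get-Unique

"Select-Object -Unique: $viaSelect"
"Get-Unique:            $viaGetUnique"
```

Both lines should report 1 2 3 4, whereas the desired result is 3 4: each cmdlet collapses duplicates to a single copy instead of dropping them.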

The Group-Object-based solution above may not be efficient, but it is convenient:

  • Lines are grouped by identical content, yielding objects that represent each group.

  • ? { $_.Count -eq 1 } then filters the groups down to those that have just 1 member, in effect weeding out all duplicated lines.

  • Select-Object -ExpandProperty Name then transforms the filtered group objects back into the input lines they represent.
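The three steps above can be inspected one at a time. This is a minimal sketch, assuming the sample input is held in a hypothetical $lines array; the intermediate variable names are illustrative and not part of the original answer.

```powershell
# Sample input from the question (in-memory stand-in for the file).
$lines = '1','1','1','2','2','3','4'

# Step 1: group identical lines; each group exposes Name and Count.
$groups = $lines | Group-Object
$groups | ForEach-Object { '{0} occurs {1} time(s)' -f $_.Name, $_.Count }

# Step 2: keep only the groups that have exactly one member.
$singletons = $groups | Where-Object { $_.Count -eq 1 }

# Step 3: turn the remaining group objects back into plain lines.
$result = $singletons | Select-Object -ExpandProperty Name
$result
```

For the sample data, step 1 should report counts of 3, 2, 1 and 1 for the lines 1, 2, 3 and 4, and the final result should be 3 and 4. Note that Group-Object compares strings case-insensitively by default; its -CaseSensitive switch can be added if lines that differ only in case should count as distinct.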
