Filtering files by partial name match

Question

I have a network share with 20,000 XML files named in the format

username-computername.xml

When a user has received a new computer, there are duplicate entries for that user, in the form

user1-computer1.xml
user1-computer2.xml

BLRPPR-SKB52084.xml
BLRSIA-SKB50871.xml
S028DS-SKB51334.xml
s028ds-SKB52424.xml
S02FL6-SKB51644.xml
S02FL6-SKB52197.xml
S02VUD-SKB52083.xml

Since I'm going to manipulate the XML files later, I can't just discard properties of the file objects in the array; at the very least I need the full path. The aim is that, if a duplicate is found, the file with the newer timestamp is used.

Here is a snippet of the code where I need that logic:

$xmlfiles = Get-ChildItem "network share"

Here I'm just doing a foreach loop:

foreach ($xmlfile in $xmlfiles) {
  [xml]$xmlcontent = Get-Content -Path $xmlfile.FullName -Encoding UTF8
  Select-Xml -Xml $xmlcontent -Xpath "  "
  # create [pscustomobject] etc...
}

Basically, what I need is:

# Pseudocode, not valid PowerShell:
# if the user part of the name, $xmlfile.Name.Split("-")[0], occurs more than once {
#   select the file with the newer LastWriteTime and store either
#   the full object or its FullName
# }

Ideally this should be part of the foreach loop, so that I don't have to loop through the files twice.

Recommended answer

You can use Group-Object with a script block to group the files by a calculated value, here the part of the name before the first hyphen:

$xmlfiles | Group-Object { $_.Name.Split('-')[0] }

The above statement will produce a result like this:

Count Name    Group
----- ----    -----
    1 BLRPPR  {BLRPPR-SKB52084.xml}
    1 BLRSIA  {BLRSIA-SKB50871.xml}
    2 S028DS  {S028DS-SKB51334.xml, s028ds-SKB52424.xml}
    2 S02FL6  {S02FL6-SKB51644.xml, S02FL6-SKB52197.xml}
    1 S02VUD  {S02VUD-SKB52083.xml}

where the Group property contains the original FileInfo objects.
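As a small self-contained illustration of that grouping, the sketch below uses in-memory stand-in objects rather than real FileInfo objects. The file names are taken from the question; the \\server\share path and the variable names are made up for the example. It also shows why S028DS and s028ds end up in the same group: Group-Object compares strings case-insensitively unless -CaseSensitive is specified.

```powershell
# Stand-ins for FileInfo objects; the \\server\share path is hypothetical.
$names  = 'S028DS-SKB51334.xml', 's028ds-SKB52424.xml', 'S02VUD-SKB52083.xml'
$sample = foreach ($name in $names) {
    [pscustomobject]@{ Name = $name; FullName = "\\server\share\$name" }
}

# Group by the part of the name before the first hyphen.
# The comparison is case-insensitive by default.
$groups = $sample | Group-Object { $_.Name.Split('-')[0] }

foreach ($g in $groups) {
    # $g.Group holds the original objects, so FullName is still available.
    '{0} ({1}): {2}' -f $g.Name, $g.Count, ($g.Group.FullName -join ', ')
}
```

Because each group keeps the original objects, nothing is lost by grouping: every property of the file, including the full path, can still be read afterwards.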

Expand the groups in a ForEach-Object loop, sort each group by LastWriteTime in descending order, and select the most recent file from each:

... | ForEach-Object {
  $_.Group | Sort-Object LastWriteTime -Descending | Select-Object -First 1
}
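Putting the pieces together, here is a minimal end-to-end sketch that can be run without access to the share. It assumes a throwaway temp folder in place of the network share, and the timestamps and variable names ($dir, $samples, $latest) are invented for the demonstration. The deduplicated result is then fed into a foreach loop like the one in the question.

```powershell
# Hypothetical stand-in for the network share: a throwaway temp folder.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null

# Sample files from the question with made-up timestamps.
$samples = [ordered]@{
    'S028DS-SKB51334.xml' = [datetime]'2020-01-01'
    's028ds-SKB52424.xml' = [datetime]'2021-06-15'
    'S02VUD-SKB52083.xml' = [datetime]'2020-03-10'
}
foreach ($name in $samples.Keys) {
    $file = New-Item -ItemType File -Path (Join-Path $dir $name)
    $file.LastWriteTime = $samples[$name]
}

# Keep only the most recently written file per user name.
$latest = Get-ChildItem -Path $dir -Filter *.xml |
    Group-Object { $_.Name.Split('-')[0] } |
    ForEach-Object {
        $_.Group | Sort-Object LastWriteTime -Descending | Select-Object -First 1
    }

foreach ($xmlfile in $latest) {
    # Only one FileInfo per user reaches this loop; FullName is intact,
    # so the XML processing from the question could go here.
    $xmlfile.FullName
}

# Clean up the temporary folder.
Remove-Item -Path $dir -Recurse -Force
```

With this arrangement the existing foreach loop stays unchanged; the grouping and selection happen once, up front, in the pipeline that produces the list of files.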
