How to compare, match, and append multiple values in multiple CSV files?


Question

I'm trying to figure out the best way to do this, and I'm not sure how to run Import-Csv on two different files in the same pipeline and export a value that was found.

So let's start with CSV file 1: I only want the values of LoginNumber where Type = H and (ContractorDomain -ne $null -or ContractorDomain -ne ""). For example, this should pull only the values 0031482 and 2167312 from the data below.

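Note: I only added the spaces and arrows to make the columns easier to read here. The actual CSV files have no spaces between the column values and no arrows.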

"LoginNumber","Type","ContractorDomain"
"0031482"    ,"H"   ,"P12345"  <<
"1251632"    ,"P"   ,"A52671"
"2167312"    ,"H"   ,"425126"  <<
"0598217"    ,"L"   ,""
"1405735"    ,"H"   ,""
"2058194"    ,"A"   ,"L21514"

Once the LoginNumber values (based on the conditions explained above) are found, search for them in the UserIDNumber column of CSV file 2. Then grab the AccountStatus and SamAccountName values for each matching UserIDNumber.

"SamAccountName","UserIDNumber","AccountDescriptionDetails","AccountStatus"
"jd12395"       ,"0052142"     ,"Company CEO"              ,"Enabled"
"jwet"          ,"2167312"     ,"Software Developer"       ,"Disabled"  <<
"1b3gas5"       ,"1385293"     ,"Project Manager"          ,"Disabled"
"632g1fsa"      ,"0031482"     ,"QA Tester"                ,"Enabled"   <<
"4126hs"        ,"0000418"     ,"Program Manager"          ,"Disabled"
"axv"           ,"1840237"     ,"Accountant Administrator" ,"Disabled"

For the 3rd CSV file we have the following:

"domainName","SameAccountName","DateExpired"
"TempDomain","jwet"           ,"20151230"    <<
"PermDomain","p21942"         ,""
"PermDomain","qz231034"       ,""
"TempDomain","632g1fsa"       ,"20151231"    <<
"TempDomain","ru20da2bb22"    ,"20160425"

Next, for the 3rd file, I want to add a column that holds the Disabled and Enabled values (or the value User Match Not Found):

"domainName","SameAccountName","DateExpired","UserStatus"
"TempDomain","jwet"           ,"20151230"   ,"Disabled"               <<
"PermDomain","p21942"         ,""           ,"User Match Not Found"
"PermDomain","qz231034"       ,""           ,"User Match Not Found"
"TempDomain","632g1fsa"       ,"20151231"   ,"Enabled"                <<
"TempDomain","ru20da2bb22"    ,"20160425"   ,"User Match Not Found"

I learned how to use Import-Csv and create new columns with something like this:

Import-Csv $file | Select-Object -Property *, @{Name="UserStatus";Expression={
  if ($true) {"fill value in here"}
}} | Export-Csv $newFile -NoType

So I'm thinking of something like this. I'm just not sure how to search, find, and pass values from multiple CSV files through the pipeline.

Note: some of these CSV files have around 15 columns before and after the columns we are searching. Also, some of the column values contain a comma, so I can't really rely on -Delimiter ",". Finally, some of the column values are not wrapped in double quotes (as you would see if you opened the CSV as plain text).

Recommended answer

Columns containing commas shouldn't be an issue if the values are properly quoted (i.e. if the CSV is valid). Import-Csv will correctly import a record 42,"a,b",c as three values: 42, a,b and c. If your CSV isn't well-formed, fix that first.
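To check the quoting behaviour without touching a file, the same record can be parsed in memory. This is a minimal sketch: ConvertFrom-Csv applies the same parsing rules to strings, and the header names A, B and C are made up for the illustration.

```powershell
# Header names A, B, C are hypothetical; only the data record comes from the text above.
$record = 'A,B,C', '42,"a,b",c' | ConvertFrom-Csv

$record.A   # 42
$record.B   # a,b  (the quoted comma stays inside the value)
$record.C   # c
```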

Fetch the login IDs from the first CSV file:

$logins = Import-Csv 'C:\path\to\file1.csv' |
          Where-Object { $_.Type -eq 'H' -and $_.ContractorDomain } |
          Select-Object -Expand LoginNumber

You can simplify the ContractorDomain property check to just $_.ContractorDomain, because PowerShell interprets both an empty string and $null as the boolean value $false in that context. The same would happen for other zero or empty values (0, 0.0, empty array, etc.), but that shouldn't be an issue in your scenario.
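The difference between the two checks can be seen on a few sample values. This is only an illustrative sketch; the variable names are made up. Note that the condition as written in the question, with -or, is true for every value, because nothing can equal both $null and "" at once, whereas the simplified check is true only when the field has content.

```powershell
# Candidate ContractorDomain values: missing, empty, and populated.
$samples = @($null, '', 'P12345')

foreach ($value in $samples) {
  # Condition as written in the question: at least one side is always true.
  $orCheck = ($value -ne $null) -or ($value -ne '')

  # Simplified check from the answer: true only when the value is non-empty.
  $truthy = [bool]$value

  '{0,-10} or-check={1} truthy={2}' -f "'$value'", $orCheck, $truthy
}
```

Since Import-Csv returns every field as a string, a non-empty value such as "0" still counts as true here; only the numeric 0 would be treated as false.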

Next, create a hashtable mapping account names to their respective status. Filter the imported second CSV by the list of IDs you created before, so the hashtable contains only relevant mappings.

$accountStatus = @{}
Import-Csv 'C:\path\to\file2.csv' | Where-Object {
  $logins -contains $_.UserIDNumber
} | ForEach-Object {
  $accountStatus[$_.SamAccountName] = $_.AccountStatus
}
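The -contains operator scans the $logins array on every record, which is fine for small files. If both files are large, one possible variation (not part of the original answer) is to put the IDs into a HashSet first. The sketch below assumes PowerShell 5.0 or later for the ::new() syntax, and $loginSet is a name introduced here for illustration.

```powershell
# Stand-in for the $logins list built from the first CSV file.
$logins = @('0031482', '2167312')

# Assumption: a HashSet gives constant-time lookups instead of a linear scan.
# The [string[]] cast also covers the case where $logins holds a single string.
$loginSet = [System.Collections.Generic.HashSet[string]]::new([string[]]$logins)

$loginSet.Contains('2167312')   # True
$loginSet.Contains('0052142')   # False
```

The filter would then read Where-Object { $loginSet.Contains($_.UserIDNumber) } with the rest of the pipeline unchanged.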

With that hashtable you can now add the UserStatus column to your third CSV:

(Import-Csv 'C:\path\to\file3.csv') |
  Select-Object -Property *, @{n='UserStatus';e={
    if ($accountStatus.ContainsKey($_.SameAccountName)) {
      $accountStatus[$_.SameAccountName]
    } else {
      'User Match Not Found'
    }
  }} | Export-Csv 'C:\path\to\file3.csv' -NoType

The parentheses around the Import-Csv statement ensure that the file is completely read and closed before Export-Csv starts writing to it. They're only required if you're writing the modified data back to the same file and can be omitted otherwise. The asterisk selects all imported columns, and the additional calculated property adds the new column you want to include.
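To try the three steps together without creating any files, the sample data from the question can be parsed in memory. This is a sketch under that assumption: ConvertFrom-Csv and ConvertTo-Csv stand in for Import-Csv and Export-Csv, and the variables $file1, $file2, $file3 and $result are names introduced here.

```powershell
# Sample data from the question, parsed in memory instead of read from disk.
$file1 = @(
  '"LoginNumber","Type","ContractorDomain"'
  '"0031482","H","P12345"'
  '"1251632","P","A52671"'
  '"2167312","H","425126"'
  '"0598217","L",""'
  '"1405735","H",""'
  '"2058194","A","L21514"'
) | ConvertFrom-Csv

$file2 = @(
  '"SamAccountName","UserIDNumber","AccountDescriptionDetails","AccountStatus"'
  '"jd12395","0052142","Company CEO","Enabled"'
  '"jwet","2167312","Software Developer","Disabled"'
  '"1b3gas5","1385293","Project Manager","Disabled"'
  '"632g1fsa","0031482","QA Tester","Enabled"'
  '"4126hs","0000418","Program Manager","Disabled"'
  '"axv","1840237","Accountant Administrator","Disabled"'
) | ConvertFrom-Csv

$file3 = @(
  '"domainName","SameAccountName","DateExpired"'
  '"TempDomain","jwet","20151230"'
  '"PermDomain","p21942",""'
  '"PermDomain","qz231034",""'
  '"TempDomain","632g1fsa","20151231"'
  '"TempDomain","ru20da2bb22","20160425"'
) | ConvertFrom-Csv

# Step 1: login IDs of type H with a non-empty ContractorDomain.
$logins = $file1 |
          Where-Object { $_.Type -eq 'H' -and $_.ContractorDomain } |
          Select-Object -ExpandProperty LoginNumber

# Step 2: map account name -> status, restricted to those login IDs.
$accountStatus = @{}
$file2 | Where-Object { $logins -contains $_.UserIDNumber } | ForEach-Object {
  $accountStatus[$_.SamAccountName] = $_.AccountStatus
}

# Step 3: append the UserStatus column to the third data set.
$result = $file3 | Select-Object -Property *, @{n='UserStatus';e={
  if ($accountStatus.ContainsKey($_.SameAccountName)) {
    $accountStatus[$_.SameAccountName]
  } else {
    'User Match Not Found'
  }
}}

# Print the rows as CSV text rather than writing a file.
$result | ConvertTo-Csv -NoTypeInformation
```

With this data, jwet ends up as Disabled, 632g1fsa as Enabled, and the remaining three rows as User Match Not Found, which matches the desired output shown in the question.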
