How to merge all contents in two CSV files where records match on one column


Question

I have two CSV files that share a SamAccountName column. A given user record may or may not have a match in the other file (this is very important to note).

I am trying to merge all columns (and their values) into one file, driven by the SamAccountName values found in the first file.

If a SamAccountName is not found in the second file, the merged file should still contain that user's record, with empty values for all of the second file's columns (since the record exists in the first file).

If a SamAccountName is found in the second file but not in the first, that record should be left out of the merge.

The number of columns in each file may vary (5, 10, 2, and so forth).

Function MergeTwoCsvFiles
{
    Param ([String]$baseFile, [String]$fileToBeMerged, [String]$columnTitleLineInFileToBeMerged)
    
    $baseFileCsvContents = Import-Csv $baseFile
    $fileToBeMergedCsvContents = Import-Csv $fileToBeMerged
    
    $baseFileContents = Get-Content $baseFile
    
    $baseFileContents[0] += "," + $columnTitleLineInFileToBeMerged
    
    $baseFileCsvContents | ForEach-Object {
        $matchFound = $False
        $baseSameAccountName = $_.SamAccountName
        [String]$mergedLineInFile = $_
        
        [String]$lineMatchFound = $fileToBeMergedCsvContents | Where-Object {$_.SamAccountName -eq $baseSameAccountName}
        Write-Host '$mergedLineInFile =' $mergedLineInFile
        Write-Host '$lineMatchFound =' $lineMatchFound
        Exit
    }
}

The problem is that each record comes back as a hashtable-like object rather than as a plain string line (as you would see if you viewed the file as .txt), so I'm not really sure how to do this.

First CSV File

"SamAccountName","sn","GivenName"
"PBrain","Pinky","Brain"
"JSteward","John","Steward"
"JDoe","John","Doe"
"SDoo","Scooby","Doo"

Second CSV File

"SamAccountName","employeeNumber","userAccountControl","mail"
"KYasunori","678213","546","KYasunori@mystuff.com"
"JSteward","43518790","512","JSteward@mystuff.com"
"JKibogabi","24356","546","JKibogabi@mystuff.com"
"JDoe","902187u4","1114624","JDoe@mystuff.com"
"CStrife","54627","512","CStrife@mystuff.com"

Expected Merged CSV File

"SamAccountName","sn","GivenName","employeeNumber","userAccountControl","mail"
"PBrain","Pinky","Brain","","",""
"JSteward","John","Steward","43518790","512","JSteward@mystuff.com"
"JDoe","John","Doe","902187u4","1114624","JDoe@mystuff.com"
"SDoo","Scooby","Doo","","",""

Note: This will be part of a loop that merges multiple files, so I would like to avoid hardcoding the column names (with $_.SamAccountName as the one exception).

$baseFileCsvContents = Import-Csv 'D:\Scripts\Powershell\Tests\base.csv'
$fileToBeMergedCsvContents = Import-Csv 'D:\Scripts\Powershell\Tests\lookup.csv'
$resultsFile = 'D:\Scripts\Powershell\Tests\MergedResults.csv'
$resultsFileContents = @()

$baseFileContents = Get-Content 'D:\Scripts\Powershell\Tests\base.csv'

$recordsMatched = compare-object $baseFileCsvContents $fileToBeMergedCsvContents -Property SamAccountName

switch ($recordsMatched)
{
    '<=' {}
    '=>' {}
    '==' {$resultsFileContents += $_}
}

$resultsFileCsv = $resultsFileContents | ConvertTo-Csv
$resultsFileCsv | Export-Csv $resultsFile -NoTypeInformation -Force

The output is a blank file :(

Recommended Answer

The code below outputs the desired results based on the inputs you provided.

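# Join two CSV text lines, dropping the first field (the key) of the second line.
# Note: this splits on every comma, so it assumes no field value contains a comma.
function CombineSkip1($s1, $s2){
    $s3 = $s1 -split ','
    $s2 -split ',' | select -Skip 1 | % {$s3 += $_}
    # join without a space so the result matches the expected merged file
    $s4 = $s3 -join ','

    $s4
}

Write-Output "------Combine files------"

# content
$c1 = Get-Content D:\junk\test1.csv
$c2 = Get-Content D:\junk\test2.csv

# users in both files, could be a better way to do this (not used below)
$t1 = $c1 | ConvertFrom-Csv
$t2 = $c2 | ConvertFrom-Csv
$users = $t1 | Select SamAccountName

# generate final, combined output
$combined = @()
$combined += CombineSkip1 $c1[0] $c2[0]

# one empty quoted field per non-key column in the second file
$c2PropCount = ($c2[0] -split ',').Count - 1
$filler = (',""' * $c2PropCount)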

for ($i = 1; $i -lt $c1.Count; $i++){
    $user = $c1[$i].Split(',')[0]
    $u2 = $c2 | where {([string]$_).StartsWith($user)}
    if ($u2)
    {
        $combined += CombineSkip1 $c1[$i] $u2
    }
    else
    {
        $combined += ($c1[$i] + $filler)
    }
}

# write to output and file
Write-Output $combined
$combined | Set-Content -Path D:\junk\test3.csv -Force
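
The script above works on raw text lines, so it depends on no field containing a comma. As an alternative, here is a minimal sketch of the same left join done on the objects that Import-Csv returns. It is not part of the original answer: the function name Merge-CsvOnKey and its parameter names are hypothetical. It assumes the key column exists in both files, that the second file has at least one data row (otherwise no extra columns are added), and that the only column name the two files share is the key.

```powershell
# Hypothetical helper (not from the original answer): left-join two CSV files on one key column.
function Merge-CsvOnKey {
    param(
        [Parameter(Mandatory)][string]$BasePath,
        [Parameter(Mandatory)][string]$LookupPath,
        [string]$Key = 'SamAccountName'
    )

    $base   = @(Import-Csv -Path $BasePath)
    $lookup = @(Import-Csv -Path $LookupPath)

    # Columns contributed by the lookup file: every header except the key.
    # Assumption: the lookup file has at least one data row to read headers from.
    $extraColumns = @()
    if ($lookup.Count -gt 0) {
        $extraColumns = @($lookup[0].PSObject.Properties.Name | Where-Object { $_ -ne $Key })
    }

    # Index the lookup rows by key so each base row is matched with one hashtable lookup.
    $index = @{}
    foreach ($row in $lookup) { $index[$row.$Key] = $row }

    foreach ($row in $base) {
        $hit = $index[$row.$Key]

        # Copy the base row's columns in their original order.
        $merged = [ordered]@{}
        foreach ($p in $row.PSObject.Properties) { $merged[$p.Name] = $p.Value }

        # Append the lookup columns; use an empty string when there is no match.
        foreach ($col in $extraColumns) {
            $merged[$col] = if ($hit) { $hit.$col } else { '' }
        }

        [pscustomobject]$merged
    }
}
```

Piping the result to `Export-Csv -Path D:\junk\test3.csv -NoTypeInformation` should write the merged file with quoted fields. Rows that exist only in the second file never appear because the loop iterates over the first file only, and no column name other than the key is hardcoded, which should suit the multi-file loop mentioned in the question.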
