PowerShell: How to merge unique headers from one CSV to another?

Question

So I've figured out how to get the unique headers in CSV 2 to append to CSV 1.

$header = ($table | Get-Member -MemberType NoteProperty).Name
$header_add = ($table_add | Get-Member -MemberType NoteProperty).Name
$header_diff = $header + $header_add
$header_diff = ($header_diff | Sort-Object -Unique)
$header_diff = (Compare-Object -ReferenceObject $header -DifferenceObject $header_diff -PassThru)

$header is an array of the headers from CSV 1 ($table). $header_add is an array of the headers from CSV 2 ($table_add). By the end of the code block, $header_diff holds the headers that are unique to CSV 2.
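The same difference can probably be computed in one step by filtering the CSV 2 headers against the CSV 1 headers. This is only a minimal sketch: the two header arrays below are hypothetical stand-ins for the values read from the imported files, and reading the names via psobject.Properties (as the answer further down does) would keep them in file order.

```powershell
# Hypothetical header arrays standing in for the two imported CSV files.
$header     = 'Date', 'Name', 'Country', '# of Calls', 'Calls in', 'Calls out'
$header_add = 'Date', 'Name', 'Country', '# of Minutes', 'Minutes in', 'Minutes out'

# Keep only the CSV 2 headers that do not occur in CSV 1.
# -notcontains compares strings case-insensitively.
$header_diff = @($header_add | Where-Object { $header -notcontains $_ })

# Show the result as a single comma-separated line.
$header_diff -join ', '
```

With the sample arrays above, $header_diff ends up holding the three "Minutes" headers, in the order they appear in $header_add.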

As far as I can tell, the next step is:

$append = ($table_add | Select-Object $header_diff)

My problem now is how to append these objects to my CSV 1 ($table) object. I don't quite see a way for Add-Member to do this in a particularly nice fashion.

Original:

Here are the headers for the two CSV files I'm trying to combine.

CSV 1:

Date, Name, Assigned Router, City, Country, # of Calls  , Calls in  , Calls out

CSV 2:

Date, Name, Assigned Router, City, Country, # of Minutes, Minutes in, Minutes out

So a quick rundown of what these files are: both files contain call information for a set of names for one day (the Date column has the same date in every row, because the data eventually gets sent to a master .xlsx file with all dates combined). All of the columns up to Country contain the same values in the same order in both files. The files simply separate the # of calls data from the # of minutes data. I was wondering if there is a convenient way to move the unlike columns from one CSV to another.

I've tried using something along the lines of:

Import-Csv (Get-ChildItem <directory> -Include <common pattern in file pair>) | Export-Csv <output path> -NoTypeInformation

This didn't combine all of the matching headers and append the unique ones afterwards. Only the first file processed kept its unique headers. The second file's unique headers and their data were discarded from the output, and its shared-header data was added as additional rows.

An example of the failed output I described:

PS > $small | Format-Table

Column_1 Column_2 Column_3
-------- -------- --------
1        a        a
1        b        b
1        c        c


PS > $small_add | Format-Table

Column_1 Column_4 Column_5
-------- -------- --------
1        x        x
1        y        y
1        z        z


PS > Import-Csv (Get-ChildItem ./*.* -Include "small*.csv") | Select-Object * -unique | Format-Table

Column_1 Column_2 Column_3
-------- -------- --------
1        a        a
1        b        b
1        c        c
1
1
1
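The output above is consistent with the CSV cmdlets taking their column set from the first object they receive: properties that only exist on later objects are dropped, and properties those objects lack come out empty. A minimal sketch that reproduces the effect in memory, using two hypothetical rows rather than the real files:

```powershell
# Two hypothetical rows with different property sets, mimicking rows
# imported from two CSV files that share only Column_1.
$rows = @(
  [pscustomobject]@{ Column_1 = '1'; Column_2 = 'a' }
  [pscustomobject]@{ Column_1 = '1'; Column_4 = 'x' }
)

# ConvertTo-Csv (like Export-Csv) derives the header line from the first object.
$csv = $rows | ConvertTo-Csv -NoTypeInformation
$csv
```

The header line lists only Column_1 and Column_2, and Column_4 does not appear anywhere in the result, which is why the objects need to be merged row by row before exporting.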

I was wondering if I could do something like the following algorithm:

  1. Import-Csv CSV_1 and CSV_2 to separate variables
  2. Compare CSV_2 headers to CSV_1 headers, storing the unlike headers in CSV_2 into a separate variable
  3. Select-Object all CSV_1 headers and unlike CSV_2 headers
  4. Pipe the Select-Object output to Export-Csv
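One way these four steps might look is sketched below, using Select-Object with calculated properties. It is a rough illustration only: the in-memory sample data, the variable names, and the use of ConvertFrom-Csv and ConvertTo-Csv in place of Import-Csv and Export-Csv are all assumptions, and rows are paired purely by position.

```powershell
# Step 1 (assumed sample data): stand-ins for Import-Csv CSV_1 and CSV_2.
$csv1 = @'
Name,Calls in
nm,ci
nm2,ci2
'@ | ConvertFrom-Csv
$csv2 = @'
Name,Minutes in
nm,mi
nm2,mi2
'@ | ConvertFrom-Csv

# Step 2: headers that exist only in CSV_2.
$names1 = @($csv1[0].psobject.Properties.Name)
$unlike = @($csv2[0].psobject.Properties.Name | Where-Object { $names1 -notcontains $_ })

# Step 3: per row, select every CSV_1 column plus one calculated
# property for each unlike CSV_2 column (rows are matched by index).
$merged = for ($i = 0; $i -lt $csv1.Count; $i++) {
  $row2 = $csv2[$i]
  $extra = foreach ($n in $unlike) {
    $v = $row2.$n
    # GetNewClosure() freezes the current value of $v for this property.
    @{ Name = $n; Expression = { $v }.GetNewClosure() }
  }
  $csv1[$i] | Select-Object -Property (@('*') + @($extra))
}

# Step 4: the real script would pipe to Export-Csv; ConvertTo-Csv shows the text.
$out = $merged | ConvertTo-Csv -NoTypeInformation
$out
```

Select-Object creates new objects here, so the original $csv1 rows stay unmodified.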

The only other method I could think of is doing it line by line, where I would:

  1. Import-Csv both
  2. remove all of the shared columns from CSV_2
  3. change it from the custom object PowerShell uses for CSVs to a string
  4. append each line of CSV_2 to each line of CSV_1
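A minimal sketch of this line-by-line idea follows, assuming small in-memory sample data in place of the real files. Letting ConvertTo-Csv produce the strings keeps the quoting intact, but the approach still assumes that no field contains an embedded line break and that both files have the same number of rows in the same order.

```powershell
# Step 1 (assumed sample data): stand-ins for Import-Csv of both files.
$csv1 = @'
Name,Calls in
nm,ci
nm2,ci2
'@ | ConvertFrom-Csv
$csv2 = @'
Name,Minutes in
nm,mi
nm2,mi2
'@ | ConvertFrom-Csv

# Step 2: keep only the columns of CSV_2 that CSV_1 does not have.
$names1 = @($csv1[0].psobject.Properties.Name)
$unlike = @($csv2[0].psobject.Properties.Name | Where-Object { $names1 -notcontains $_ })

# Step 3: turn both sides into CSV text lines (header line first).
$left  = @($csv1 | ConvertTo-Csv -NoTypeInformation)
$right = @($csv2 | Select-Object -Property $unlike | ConvertTo-Csv -NoTypeInformation)

# Step 4: append each CSV_2 line to the CSV_1 line at the same index.
$lines = for ($i = 0; $i -lt $left.Count; $i++) { $left[$i] + ',' + $right[$i] }
$lines
```

The resulting lines could then be written out with Set-Content; Export-Csv is not needed because the text is already CSV.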

It feels a bit unrefined and inflexible (the flexibility can probably be dealt with by how the columns/headers are isolated, so that appending the strings is no problem).

Recommended answer

* This answer focuses on a high-level-of-abstraction OO solution.
* The OP's own solution relies more on string processing, which has the potential to be faster.

# The input file paths.
$files = 'csv1.csv', 'csv2.csv'
$outFile = 'csvMerged.csv'

# Read the 2 CSV files into collections of custom objects.
# Note: This reads the entire files into memory.
$doc1 = Import-Csv $files[0]
$doc2 = Import-Csv $files[1]

# Determine the column (property) names that are unique to document 2.
$doc2OnlyColNames = (
  Compare-Object $doc1[0].psobject.properties.name $doc2[0].psobject.properties.name |
    Where-Object SideIndicator -eq '=>'
).InputObject

# Initialize an ordered hashtable that will be used to temporarily store
# each document 2 row's unique values as key-value pairs, so that they
# can be appended as properties to each document-1 row.
$htUniqueRowD2Props = [ordered] @{}

# Process the corresponding rows one by one, construct a merged output object
# for each, and export the merged objects to a new CSV file.
$i = 0
$(foreach($rowD1 in $doc1) {
  # Get the corresponding row from document 2.
  $rowD2 = $doc2[$i++]
  # Extract the values from the unique document-2 columns and store them in the ordered
  # hashtable.
  foreach($pname in $doc2OnlyColNames) { $htUniqueRowD2Props.$pname = $rowD2.$pname }
  # Add the properties represented by the hashtable entries to the
  # document-1 row at hand and output the augmented object (-PassThru).
  $rowD1 | Add-Member -NotePropertyMembers $htUniqueRowD2Props -PassThru
}) | Export-Csv -NoTypeInformation -Encoding Utf8 $outFile
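The code above pairs rows purely by position, which fits the question's statement that the shared columns hold the same values in the same order. If that ordering could not be relied on, a keyed lookup might be used instead. The following is only a sketch under assumptions: the sample data is made up, and using the Name column as a unique key ($keyName) is a hypothetical choice, not something the question confirms.

```powershell
# Assumed sample data; document 2 is deliberately in a different row order.
$doc1 = @'
Name,Calls in
nm,ci
nm2,ci2
'@ | ConvertFrom-Csv
$doc2 = @'
Name,Minutes in
nm2,mi2
nm,mi
'@ | ConvertFrom-Csv

# Hypothetical key column; assumed unique and present in both documents.
$keyName = 'Name'

# Index the document-2 rows by key for direct lookup.
$doc2ByKey = @{}
foreach ($row in $doc2) { $doc2ByKey[$row.$keyName] = $row }

# Column names that only document 2 has.
$names1 = @($doc1[0].psobject.Properties.Name)
$doc2OnlyColNames = @($doc2[0].psobject.Properties.Name | Where-Object { $names1 -notcontains $_ })

# Attach the document-2-only values to the document-1 row with the same key.
# Assumes every document-1 key also exists in document 2.
$merged = foreach ($rowD1 in $doc1) {
  $rowD2 = $doc2ByKey[$rowD1.$keyName]
  foreach ($pname in $doc2OnlyColNames) {
    $rowD1 | Add-Member -NotePropertyName $pname -NotePropertyValue $rowD2.$pname
  }
  $rowD1
}
$merged | ConvertTo-Csv -NoTypeInformation
```

Like the positional version, this modifies the document-1 row objects in place via Add-Member.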


To put the above to the test, you can use the following sample input:

# Create sample input CSV files
@'
Date,Name,Assigned Router,City,Country,# of Calls,Calls in,Calls out
dt,nm,ar,ct,cy,cc,ci,co
dt2,nm2,ar2,ct2,cy2,cc2,ci2,co2
'@ > csv1.csv

# Same column layout and data as above through column 'Country', then different.
@'
Date,Name,Assigned Router,City,Country,# of Minutes,Minutes in,Minutes out
dt,nm,ar,ct,cy,mc,mi,mo
dt2,nm2,ar2,ct2,cy2,mc2,mi2,mo2
'@ > csv2.csv

The code should produce the following content in csvMerged.csv:

"Date","Name","Assigned Router","City","Country","# of Calls","Calls in","Calls out","# of Minutes","Minutes in","Minutes out"
"dt","nm","ar","ct","cy","cc","ci","co","mc","mi","mo"
"dt2","nm2","ar2","ct2","cy2","cc2","ci2","co2","mc2","mi2","mo2"
