Merge CSVs - no join
Question
I need to combine a slew of Excel spreadsheets. I used PowerShell to convert them to CSVs and now need to merge them, but not the way you typically would: the merge doesn't use a join. If I have 3 files with 100 rows each, my new file should have 300 rows. So, in database terms, this is more of a UNION than a JOIN.
Some of the columns have the same name across files and some don't. If two columns have the same name, a new column shouldn't be created. Is there a way to do this without having to manually list all the columns as properties?
Example (only 2 files)
File 1:
Name Address
Bob 123 Main
File 2:
Name City
Bob LA
Tom Boston
Result
Name Address  City
Bob  123 Main
Bob           LA
Tom           Boston
Recommended answer
The resulting columns might not be in the order you want, because the header names are sorted alphabetically. The trick here is to read the header line of each file, collect the column names into a string array, and remove any duplicates.
This code assumes all the files are in the same location. If they are not, you will need to account for that.
$files = Get-ChildItem -Path 'C:\temp\csv\' -Filter '*.csv' | Select-Object -ExpandProperty FullName

# Gather the headers for all the files.
# Splitting on "," assumes plain, unquoted column names with no embedded commas.
$headers = $files | ForEach-Object {
    (Get-Content $_ -TotalCount 1).Split(",") | ForEach-Object { $_.Trim() }
} | Sort-Object -Unique

# Loop again now and read in the csv files as objects.
# Select-Object adds every collected header to each row; missing ones stay empty.
$files | ForEach-Object {
    Import-Csv $_
} | Select-Object $headers
The output looks like this:
Address  City   Name
-------  ----   ----
123 Main        Bob
         LA     Bob
         Boston Tom
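If the alphabetical column order is a problem, one possible variation is to keep the column names in the order they are first seen instead of sorting them. The following is only a sketch under a few assumptions: `Merge-CsvUnion` is a hypothetical helper name (not a built-in cmdlet), the header names contain no embedded commas, and the paths in the usage comment are placeholders.

```powershell
# Hypothetical helper: unions the rows of several CSV files,
# keeping column names in first-seen order rather than sorted.
function Merge-CsvUnion {
    param(
        [Parameter(Mandatory)]
        [string[]]$Path
    )

    # Collect column names in first-seen order, skipping duplicates.
    $headers = New-Object 'System.Collections.Generic.List[string]'
    foreach ($file in $Path) {
        $firstLine = Get-Content -Path $file -TotalCount 1
        if (-not $firstLine) { continue }   # skip files with no header line

        foreach ($name in $firstLine.Split(',')) {
            # Remove surrounding whitespace and optional double quotes.
            $name = $name.Trim().Trim('"')
            if ($name -and ($headers -notcontains $name)) {
                $headers.Add($name)
            }
        }
    }

    # Read every file as objects and give each row the full set of columns.
    foreach ($file in $Path) {
        Import-Csv -Path $file | Select-Object -Property $headers.ToArray()
    }
}

# Example usage (placeholder paths):
# $files = Get-ChildItem -Path 'C:\temp\csv' -Filter '*.csv' | Select-Object -ExpandProperty FullName
# Merge-CsvUnion -Path $files | Export-Csv -Path 'C:\temp\merged.csv' -NoTypeInformation
```

With the two example files from the question, and File 1 read first, the columns would come out as Name, Address, City. Writing the merged file outside the source folder avoids picking it up as an input on a later run.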