How to extract one specific column (no header, say column 2) from multiple CSV files using PowerShell?


Problem description

I have 23 CSV files, each containing two columns. I only want the second column of each file in the output (the output can be either a CSV or an XLSX file named Banks_2008_2014 with the matching extension). The second column of the first file should become the first column of the output file, the second column of the second file should become its second column, and so on.

$excel_test = New-Object -ComObject Excel.Application
$excel_test.visible =$true
$excel_test.DisplayAlerts =$true

$excel_test|select-object |format-table -autosize

$excelFiles = Get-ChildItem -Path     C:\SNOWPACK\Samarth\1_Banks\AllPro_Banks_Point1 -Include *.csv -Recurse

Freach ($i in $excelFiles)
{
$excel_test.workbook.worksheet.column.item($i) = $i[1]
}

Error: Unable to index into (System.IO).File.Info

Examples:

File_1

Column_1 Column_2
  0.1      0.11
  0.2      0.45
  0.35     0.6
  0.25     0.8
  0.33     0.1

File_2

Column_1 Column_2
 0.9       0.2   
 0.2       0.11
 0.45      0.4
 0.34      0.6

Result should look like this

Column_1 Column_2
  0.11     0.2
  0.45     0.11
  0.6      0.4
  0.8      0.6
  0.1

Solution

So you are getting just the second columns from a collection of csv files?

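$path = "C:\SNOWPACK\Samarth\1_Banks\AllPro_Banks_Point1\*.csv"
# Take the second comma-separated field of every line in every matching file.
# Note: the output lands in the same folder, so remove it before re-running or it will match *.csv too.
Get-Content $path | ForEach-Object { $_.Split(",")[1] } | Set-Content C:\SNOWPACK\Samarth\1_Banks\AllPro_Banks_Point1\Banks_2008_2014.csv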

That extracts the second item from every line of every CSV file and writes the values one after another into a single column. This of course assumes your CSV files are well formed, with no commas inside quoted fields.
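If some fields might contain quoted commas, a CSV-aware reader is safer than Split. Here is a minimal sketch, assuming the files have no header row (as the question title says) and are comma-delimited. Get-SecondColumn and the column names C1 and C2 are hypothetical names introduced only for this illustration.

```powershell
# Hypothetical helper: returns the second field of every row in one CSV file.
# Assumes the file has no header row and uses commas as the delimiter.
function Get-SecondColumn {
    param([Parameter(Mandatory = $true)][string]$Path)

    # -Header supplies column names, so the first line is treated as data.
    # Import-Csv parses quoted fields, so "a,b" stays a single value.
    Import-Csv -Path $Path -Header 'C1', 'C2' |
        ForEach-Object { $_.C2 }
}
```

Calling it once per file (for example inside a ForEach-Object over Get-ChildItem) gives the same values as the Split version for simple files, but does not break when a quoted field contains a comma.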

I am guessing your error came from (get-content $File.), since the parser would have seen you trying to access a property of $File without naming one.

Update from Question

It seems the original question was not entirely clear. Placing the columns side by side is a different ball game, but it can be done.

$inputPath = "C:\SNOWPACK\Samarth\1_Banks\AllPro_Banks_Point1"
# Create a jagged array: one inner array of second-column values per file.
$allFiles = @()
Get-ChildItem -Path $inputPath -Include "*.csv" -Recurse | ForEach-Object{$allFiles += ,@($_ | Get-Content | ForEach-Object{$_.Split(",")[1]})}
Write-Host "Collected $($allFiles.Count) files" -ForegroundColor Green

# Determine the highest row count among all the files
$maxRows = $allFiles | ForEach-Object{$_.Count} | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
Write-Host "Highest Row Count: $maxRows"  -ForegroundColor Green

# Next line will clear the file. Uncomment it if that is what you are looking for.
#Clear-Content c:\temp\newfile.csv

For($rowIndex = 0; $rowIndex -lt $maxRows; $rowIndex++ ){
    # Build each row individually
    $row = @()
    For($fileIndex = 0; $fileIndex -lt $allFiles.Count; $fileIndex++ ){
        # Build an array of all the elements from each file in this row
        $row += $allFiles[$fileIndex][$rowIndex]
    }
    # Create a properly delimited row using -join and output it to the file.
    $row -join "," | Add-Content c:\temp\newfile.csv
} 

This should also work if the files are of variable length and if some of the rows contain empty entries.
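To see how columns of different lengths line up, here is a small in-memory sketch using the sample values from the question. Join-Columns is a hypothetical helper, not part of the script above, and it pads missing cells with an explicit empty string instead of relying on out-of-range indexing returning $null.

```powershell
# Hypothetical helper: takes an array of columns (each an array of strings)
# and returns one comma-joined line per row, padding short columns with ''.
function Join-Columns {
    param([Parameter(Mandatory = $true)][object[]]$Columns)

    # The longest column decides how many output rows there are.
    $maxRows = ($Columns | ForEach-Object { $_.Count } | Measure-Object -Maximum).Maximum
    for ($r = 0; $r -lt $maxRows; $r++) {
        $cells = foreach ($col in $Columns) {
            if ($r -lt $col.Count) { $col[$r] } else { '' }
        }
        $cells -join ','
    }
}

# Second columns of File_1 and File_2 from the question.
$file1 = '0.11', '0.45', '0.6', '0.8', '0.1'
$file2 = '0.2', '0.11', '0.4', '0.6'
$lines = Join-Columns -Columns @($file1, $file2)
# $lines now holds: 0.11,0.2 / 0.45,0.11 / 0.6,0.4 / 0.8,0.6 / 0.1,
```

The last line ends with a trailing comma because the second file has no fifth row, which matches the empty cell in the expected result.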

Edit for 2.0

Fixed how the output is written. This might not be the most efficient way, but if your files are small it will work just fine: Add-Content is called once per row. Notice the Clear-Content line that is commented out; because Add-Content appends, uncomment it if the output file should be emptied first.
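If the per-row Add-Content calls become a concern, one alternative is to build all the lines in memory and write the file once. The following is only a sketch under the same assumptions as above (comma-delimited files, no quoted commas). Export-SecondColumns and its parameter names are hypothetical, and sorting the files by name to fix the column order is my assumption, not something the question specifies.

```powershell
# Hypothetical end-to-end variant: collect every output line first,
# then write the file with a single Set-Content call.
function Export-SecondColumns {
    param(
        [Parameter(Mandatory = $true)][string]$InputPath,
        [Parameter(Mandatory = $true)][string]$OutputFile
    )

    # One inner array of second-column values per input file.
    $columns = @()
    foreach ($file in Get-ChildItem -Path $InputPath -Filter '*.csv' -Recurse | Sort-Object Name) {
        $columns += , @(Get-Content -Path $file.FullName | ForEach-Object { $_.Split(',')[1] })
    }
    if ($columns.Count -eq 0) { return }   # nothing to write

    $maxRows = ($columns | ForEach-Object { $_.Count } | Measure-Object -Maximum).Maximum
    $lines = for ($r = 0; $r -lt $maxRows; $r++) {
        $cells = foreach ($col in $columns) {
            if ($r -lt $col.Count) { $col[$r] } else { '' }
        }
        $cells -join ','
    }

    # Set-Content replaces any existing content, so no Clear-Content is needed.
    Set-Content -Path $OutputFile -Value $lines
}
```

Pointing OutputFile at a folder outside InputPath keeps the result from being picked up as an input file on the next run.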
