Count CSV columns without using Excel API

Problem description

Is there a method, code sample, or reference for counting the number of columns in every row of a CSV file, perhaps with PowerShell's Get-Content, without using any Excel API? The CSV files are extremely large (1 GB+), and approaches such as Import-Csv or the Excel COM API (New-Object -ComObject Excel.Application) take an extremely long time to process them.

The following image shows example records from a CSV file.

The following image shows the same CSV opened in Notepad++. As you can see, there is only one delimiter in row 4, so that row counts as 2 columns while the others have 4.

I would like to create a script that checks every row of the CSV data quickly and detects missing columns when a row's column count differs from the header's (for example, row 4 has 2 columns, but the header in row 1 has 4). The final result would look like this:

Row 2 4 Columns Pass
Row 3 4 Columns Pass
Row 4 2 Columns Fail

I'm currently using the COM API in my completed script, but I want to switch to Get-Content. However, a comma inside a quoted field leads to an incorrect result.

# sample testing code snippet
$path = "testing.csv"
$delimiter = ","
Get-Content $path | ForEach-Object {($_.Split($delimiter)).Count}

The result for the same CSV file looks like this:

4
4
4
3
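
The extra count on the last row can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical line whose second field contains a quoted comma (the real file contents are only shown in the images). It compares the naive split with a regex split that ignores commas inside double quotes. The regex is an illustration only and is not tuned for very large files.

```powershell
# Hypothetical sample line: two fields, the second one contains a quoted comma.
$line = 'abc,"d,ef"'

# Naive split: the comma inside the quotes is treated as a delimiter.
$naive = $line.Split(',').Count

# Quote-aware split: only split on a comma that is followed by an even
# number of double quotes up to the end of the line.
$pattern = ',(?=(?:[^"]*"[^"]*")*[^"]*$)'
$quoteAware = [regex]::Split($line, $pattern).Count

"Naive: $naive, quote-aware: $quoteAware"
```

For this sample line the naive split reports 3 fields and the quote-aware split reports 2.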

Recommended answer

When you use Import-Csv, every item value ($_.PSObject.Properties.Value) is converted to a string. If an item is missing from the row altogether, its .Value property is set to $Null instead.

If your CSV file includes a header row, it is presumed that the number of headers is at least as large as the maximum number of columns in any row:

Import-Csv .\testing.csv | ForEach {@($_.PSObject.Properties | Where {$_.Value -ne $Null}).Count}
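
To see the $Null behavior without a file on disk, here is a small sketch that feeds an inline sample through ConvertFrom-Csv, which parses text the same way Import-Csv parses a file. The sample data is hypothetical; its last row has only two fields, one of them a quoted comma.

```powershell
# Hypothetical inline sample: a 4-column header, one complete row,
# and one short row whose second field contains a quoted comma.
$csv = @'
A,B,C,D
1,2,3,4
5,"6,7"
'@

# Count the properties whose value is not $null for each data row.
$counts = @(
    $csv | ConvertFrom-Csv | ForEach-Object {
        @($_.PSObject.Properties | Where-Object { $null -ne $_.Value }).Count
    }
)

$counts
```

The complete row yields 4 and the short row yields 2, because the quoted comma stays inside one field and the two missing fields come back as $Null. Note that a field that is present but empty (for example `1,,3,4`) is an empty string rather than $Null, so it still counts as a column.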

If your CSV file doesn't include a header row, you can supply a large number of headers yourself:

Import-Csv .\testing.csv -Header @(0..99) | ForEach {@($_.PSObject.Properties | Where {$_.Value -ne $Null}).Count}
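
If Import-Csv turns out to be too slow for a 1 GB+ file, one alternative is to stream the file line by line and count delimiters outside double quotes. The sketch below is not part of the original answer and has not been benchmarked. The function names Get-FieldCount and Test-CsvColumnCount are hypothetical, and it assumes that no quoted field contains a line break and that the first row is the header.

```powershell
# Hypothetical helper: count the fields in one line, ignoring delimiters
# that appear inside double-quoted fields.
function Get-FieldCount {
    param([string]$Line, [char]$Delimiter = ',')

    # Fast path: a line without quotes can be split directly.
    if ($Line.IndexOf('"') -lt 0) {
        return $Line.Split($Delimiter).Count
    }

    $count = 1
    $inQuotes = $false
    foreach ($ch in $Line.ToCharArray()) {
        if ($ch -eq '"') {
            # An escaped quote ("") toggles twice, so the state is unchanged.
            $inQuotes = -not $inQuotes
        }
        elseif ($ch -eq $Delimiter -and -not $inQuotes) {
            $count++
        }
    }
    $count
}

# Hypothetical helper: compare every data row against the header row.
function Test-CsvColumnCount {
    param([string]$Path, [char]$Delimiter = ',')

    # .NET resolves relative paths differently from PowerShell,
    # so convert to a full path first.
    $fullPath = Convert-Path -LiteralPath $Path
    $rowNumber = 0
    $expected = 0

    # ReadLines enumerates lazily, so the file is never fully loaded.
    foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
        $rowNumber++
        $count = Get-FieldCount -Line $line -Delimiter $Delimiter
        if ($rowNumber -eq 1) {
            $expected = $count   # the header row defines the expected count
            continue
        }
        $status = if ($count -eq $expected) { 'Pass' } else { 'Fail' }
        "Row $rowNumber $count Columns $status"
    }
}
```

It could then be called as `Test-CsvColumnCount -Path .\testing.csv`, which emits one line per data row in the "Row 4 2 Columns Fail" format from the question.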
