Delete rows which have more than X columns in a CSV

Question

I need to delete all the rows in a CSV file that have more than a certain number of columns.

This happens because the code that generates the CSV file sometimes skips some values and prints the following record on the same line.

Example: consider the following file. I want to remove all the rows that have more than 3 columns (3 being the number of columns in the header):

timestamp,header2,header3
1,1val2,1val3
2,2val2,2val3
3,4,4val2,4val3
5val1,5val2,5val3
6,6val2,6val3

The output file I would like to have is:

timestamp,header2,header3
1,1val2,1val3
2,2val2,2val3
5val1,5val2,5val3
6,6val2,6val3

I don't care if the row with timestamp 4 is missing.

I would prefer a solution in bash or perhaps awk, rather than a Python one, so that I can learn how to use these tools.

Recommended answer

This is straightforward with awk:

awk -F, 'NF<=3' file

This uses the awk variable NF, which holds the number of fields in the current line. Since the field separator is set to a comma (with -F, or, equivalently, -v FS=","), all that remains is to check that the number of fields is not higher than 3. That is what NF<=3 does: when the condition is true, awk prints the line automatically, because a pattern with no action defaults to printing the line.
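If the expected column count is not known in advance, one option is to read it from the header line instead of hard-coding 3. The following is a minimal sketch under that assumption; the temporary file and the variable names csv, n and filtered are illustrative and do not come from the original answer.

```shell
# Build a temporary sample file with the data from the question.
csv=$(mktemp)
cat > "$csv" <<'EOF'
timestamp,header2,header3
1,1val2,1val3
2,2val2,2val3
3,4,4val2,4val3
5val1,5val2,5val3
6,6val2,6val3
EOF

# NR==1 {n=NF}: on the header line, remember its field count in n.
# NF<=n:        print every line (header included) with at most n fields.
filtered=$(awk -F, 'NR==1 {n=NF} NF<=n' "$csv")
printf '%s\n' "$filtered"

rm -f "$csv"
```

Note that splitting on every comma does not account for quoted fields that themselves contain commas; for such files the field count reported by NF would be too high.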

$ awk -F, 'NF<=3' a
timestamp,header2,header3
1,1val2,1val3
2,2val2,2val3
5val1,5val2,5val3
6,6val2,6val3
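
The command above only prints the filtered rows. To end up with a cleaned file, a common approach is to write to a temporary file and then move it over the original. This is a sketch, not part of the original answer; the names in and out are hypothetical.

```shell
# Build a temporary input file with the data from the question.
in=$(mktemp)
cat > "$in" <<'EOF'
timestamp,header2,header3
1,1val2,1val3
2,2val2,2val3
3,4,4val2,4val3
5val1,5val2,5val3
6,6val2,6val3
EOF

# Write the filtered rows to a second file, and replace the
# original only if awk finished successfully.
out=$(mktemp)
awk -F, 'NF<=3' "$in" > "$out" && mv "$out" "$in"

cat "$in"
```

The temporary file is needed because redirecting straight back to the input (awk ... file > file) makes the shell truncate the file before awk has read it.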
