Deleting Column Selection in AWK

Problem description

I'd like to delete a selection of columns from a list of CSV files. The awk call is in-line as it is used in a shell script. I don't know beforehand how many columns the files have, only that the columns that I want gone are included in each file of the list.

Let's say I want the first 4 columns removed. Blanking out the column values will leave the separators, which I also want gone.
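
As a small illustration of that point (a sketch, not the asker's code; the function name `blank_first_four` and the sample row are made up here), assigning empty strings to the first four fields keeps every separator in place:

```shell
# Hypothetical helper: blank out fields 1-4 of comma-separated input.
# Assigning to a field makes awk rebuild the record with OFS between fields.
blank_first_four() {
    awk -F, -v OFS=, '{ $1 = $2 = $3 = $4 = ""; print }' "$@"
}

printf 'a,b,c,d,e,f,g\n' | blank_first_four   # prints ,,,,e,f,g
```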

I thought the following would work: create an array of column numbers to drop, then rebuild each row without those columns.

The value of length(row) below is as expected, but the final loop still iterates over the original column count, not the actual length(row) value.

head $f | awk 'BEGIN{FS=",";split("1,2,3,4",dropers,",")}{split($0,row,FS);for(i in dropers) delete row[i]; print NF "," length(row) "<<<";out=""; print NF "," length(row) ">>>";for(i=1;i<=length(row);i++){print row[i] "lulu"; out = out "," row[i]}; sub(/[ \t]*$/,"",out);print out}' > $g

Here's the output for 2 files: 6 columns going in, 2 left when I've deleted columns 1 through 4, yet the loop iterates over the full 6 cols rather than the expected 2. Thank you for any advice.

Aust.

6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000009lulu
461474lulu
,,,,,0000009,461474
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000010lulu
94942lulu
,,,,,0000010,94942

Edit (Belisarius)
Formatted code follows:

BEGIN {FS=",";
       split("1,2,3,4",dropers,",")
      }

{ split($0,row,FS);
  for(i in dropers) delete row[i]; 
  print NF "," length(row) "<<<";
  out=""; 
  print NF "," length(row) ">>>";
  for(i=1;i<=length(row);i++){print row[i] "lulu"; 
                              out = out "," row[i]}; 
  sub(/[ \t]*$/,"",out);
  print out
}

Solution
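
The counting loop in the question overruns because merely referencing an awk array element that does not exist creates it. After the deletions only row[5] and row[6] remain, so length(row) is 2. But `print row[1]` creates row[1] and raises the length to 3, and every further pass adds another element until the loop has covered all six indices. The sketch below makes this visible. The `count` function, the `show_autocreate` wrapper and the sample row are illustrative names, and elements are counted with a for-in loop because length(array) is not guaranteed in every awk.

```shell
# Hypothetical demo wrapper; prints the element count at three points.
show_autocreate() {
    printf 'a,b,c,d,e,f\n' | awk -F, '
    # Count elements portably instead of relying on length(array).
    function count(arr,   k, n) { n = 0; for (k in arr) n++; return n }
    {
        split($0, row, FS)
        for (i = 1; i <= 4; i++) delete row[i]
        print "after delete: " count(row)
        x = row[1]                      # reading a missing element creates it
        print "after reading row[1]: " count(row)
        found = (2 in row)              # the in operator tests without creating
        print "after testing 2 in row: " count(row)
    }'
}

show_autocreate
```

The three counts printed are 2, 3 and 3: the read creates an element, the `in` test does not. The code that follows sidesteps the problem by iterating with for (i in row), which only visits elements that exist.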

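BEGIN {FS=",";
       # dropers[1..4] holds the column numbers to drop
       split("1,2,3,4",dropers,",")
      }

{ split($0,row,FS);
  # delete by the stored column number, not by the index into dropers
  for(i in dropers) delete row[dropers[i]];
  print NF "," length(row) "<<<";   # debug: field count, remaining elements
  out="";
  print NF "," length(row) ">>>";   # debug
  # for-in visits only the indices that still exist, so no element is
  # created by accident; note that awk does not specify the visiting order
  for(i in row){print row[i] "lulu";
                out = out "," row[i]};
  out = substr(out,2);              # strip the leading separator
  sub(/[ \t]*$/,"",out);            # trim trailing whitespace
  print out
}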

with input:

a,b,c,d,e,f,g

prints:

7,3<<<
7,3>>>
elulu
flulu
glulu
e,f,g
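
One caveat: awk leaves the order in which for (i in row) visits elements unspecified, so in some implementations the columns can come out shuffled. A variant that walks the fields in order and skips the unwanted ones avoids that, and needs no copy of the row. The following is a minimal sketch, not part of the original answer: the `drop_cols` function name and the `cols` variable are assumptions made for illustration, and it splits on plain commas, so quoted CSV fields containing commas are not handled.

```shell
# Hypothetical helper. $1 is a comma-separated list of column numbers to
# drop; any remaining arguments are input files (stdin when there are none).
drop_cols() {
    cols=$1
    shift
    awk -F, -v cols="$cols" '
    BEGIN {
        # Keep a lookup keyed by the column numbers to drop.
        n = split(cols, tmp, ",")
        for (k = 1; k <= n; k++) drop[tmp[k]] = 1
    }
    {
        out = ""; sep = ""
        # Walk the fields in order; the in test never creates elements.
        for (i = 1; i <= NF; i++) {
            if (i in drop) continue
            out = out sep $i
            sep = ","
        }
        print out
    }' "$@"
}

printf 'a,b,c,d,e,f,g\n' | drop_cols 1,2,3,4   # prints e,f,g
```

In the script from the question this could be called as `head "$f" | drop_cols 1,2,3,4 > "$g"`. Because the lookup is keyed by column number, a list such as 2,5 works as well.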
