Deleting Column Selection in AWK
Problem description
I'd like to delete a selection of columns from a list of CSV files. The awk call is in-line as it is used in a shell script. I don't know beforehand how many columns the files have, only that the columns that I want gone are included in each file of the list.
Let's say I want the first 4 columns removed. Blanking out the column values will leave the separators, which I also want gone.
I thought the following would work: create an array of column numbers to drop, and recreate the corresponding row without those columns.
The value of length(row) below is as expected, but the final loop still iterates over the original column count, not the actual length(row) value.
head $f | awk 'BEGIN{FS=",";split("1,2,3,4",dropers,",")}{split($0,row,FS);for(i in dropers) delete row[i]; print NF "," length(row) "<<<";out=""; print NF "," length(row) ">>>";for(i=1;i<=length(row);i++){print row[i] "lulu"; out = out "," row[i]}; sub(/[ \t]*$/,"",out);print out}' > $g
Here's the output for 2 files: 6 columns going in, 2 left when I've deleted columns 1 through 4, yet the loop iterates over the full 6 cols rather than the expected 2. Thank you for any advice.
Aust.
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000009lulu
461474lulu
,,,,,0000009,461474
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000010lulu
94942lulu
,,,,,0000010,94942
Edit (Belisarius)
Formatted code follows:
BEGIN {
    FS = ","
    split("1,2,3,4", dropers, ",")
}
{
    split($0, row, FS)
    for (i in dropers) delete row[i]
    print NF "," length(row) "<<<"
    out = ""
    print NF "," length(row) ">>>"
    for (i = 1; i <= length(row); i++) {
        print row[i] "lulu"
        out = out "," row[i]
    }
    sub(/[ \t]*$/, "", out)
    print out
}
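A note on why the counting loop above runs six times: in awk, merely reading an array element that does not exist creates it with an empty value. Each `print row[i]` for a deleted index therefore puts that index back, so `length(row)` grows by one per iteration until it reaches the original column count. The sketch below reproduces this in isolation. The names `show_growth` and `count` are hypothetical; `count` stands in for `length(row)` so the example also runs on awk versions that do not accept an array argument to `length`.

```shell
# Hypothetical demo: reading row[i] re-creates deleted elements,
# so a loop bounded by the array size keeps extending its own bound.
show_growth() {
    awk '
    # Portable stand-in for length(arr): count the existing indices.
    function count(arr,    k, n) { n = 0; for (k in arr) n++; return n }
    BEGIN {
        split("a,b,c,d,e,f", row, ",")
        for (i = 1; i <= 4; i++) delete row[i]   # keep only row[5], row[6]
        before = count(row)                      # 2 elements remain
        iterations = 0
        for (i = 1; i <= count(row); i++) {
            x = row[i]                           # plain read creates row[i] if missing
            iterations++
        }
        print before, iterations, count(row)
    }'
}

show_growth   # prints: 2 6 6
```

The output shows two elements before the loop, six iterations, and six elements afterwards, which matches the six `lulu` lines and the run of leading commas in the question.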
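Answer

Iterate over the indices that still exist with for (i in row) instead of counting up to length(row), then strip the leading comma with substr:

BEGIN {
    FS = ","
    split("1,2,3,4", dropers, ",")
}
{
    split($0, row, FS)
    for (i in dropers) delete row[i]
    print NF "," length(row) "<<<"
    out = ""
    print NF "," length(row) ">>>"
    # for-in visits only the surviving indices, so nothing is re-created
    for (i in row) {
        print row[i] "lulu"
        out = out "," row[i]
    }
    out = substr(out, 2)    # drop the leading comma
    sub(/[ \t]*$/, "", out)
    print out
}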
with input:
a,b,c,d,e,f,g
prints:
7,3<<<
7,3>>>
elulu
flulu
glulu
e,f,g
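Two caveats about the code above may be worth guarding against. First, awk does not define the order in which `for (i in row)` visits indices, so the remaining columns are not guaranteed to come out in their original order on every implementation. Second, `for (i in dropers)` loops over the indices of `dropers` (1 to 4), not its values; this only matches the intended columns because the list happens to be `1,2,3,4`. The following is a minimal sketch of an order-preserving variant, assuming simple CSV with no quoted commas. The function name `drop_cols` and the variables `drop` and `skip` are hypothetical, not part of the original answer.

```shell
# drop_cols LIST: hypothetical helper. LIST is a comma-separated list of
# 1-based column numbers to remove from simple CSV read on stdin.
drop_cols() {
    awk -F, -v drop="$1" '
    BEGIN {
        n = split(drop, tmp, ",")
        # Use the listed column numbers (the values) as lookup keys.
        for (k = 1; k <= n; k++) skip[tmp[k] + 0] = 1
    }
    {
        out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if (i in skip) continue   # membership test, does not create skip[i]
            out = out sep $i          # fields are appended in original order
            sep = ","
        }
        print out
    }'
}

printf 'a,b,c,d,e,f,g\n' | drop_cols 1,2,3,4   # prints: e,f,g
```

Because the loop walks the fields from 1 to NF and only tests membership with `in`, no array elements are created as a side effect and no leading separator needs to be trimmed afterwards.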