Extract specific columns from delimited file using Awk
Problem description
Sorry if this is too basic. I have a CSV file whose columns have a header row (v1, v2, etc.). I understand that to extract columns 1 and 2, I have to do:
awk -F "," '{print $1 "," $2}' infile.csv > outfile.csv
But what if I have to extract, say, columns 1 to 10, 20 to 25, and 30, 33? As an addendum, is there any way to extract directly with the header names rather than with column numbers?
Recommended answer
I'm not aware of a built-in way to select field ranges in awk. You could use a for loop, but you would have to add logic to skip the columns you don't want. It's probably easier to just list the fields:
awk -F, -v OFS=, '{print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$20,$21,$22,$23,$24,$25,$30,$33}' infile.csv > outfile.csv
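For reference, the for-loop approach mentioned above might look something like the following. This is only a minimal sketch: `extract_cols` is a made-up helper name, the range spec format ("1-10,20-25,30,33") is an assumption borrowed from cut, and it assumes a simple CSV with no quoted fields that contain commas.

```shell
# extract_cols: hypothetical helper, not part of awk or coreutils.
# Usage: extract_cols "1-10,20-25,30,33" infile.csv
extract_cols() {
  awk -F, -v OFS=, -v spec="$1" '
    BEGIN {
      # Expand the spec into an ordered list of field numbers.
      n = split(spec, parts, ",")
      for (i = 1; i <= n; i++) {
        if (split(parts[i], r, "-") == 2) {
          # A range such as 20-25: add every field number in it.
          for (j = r[1] + 0; j <= r[2] + 0; j++) want[++k] = j
        } else {
          # A single field number such as 30.
          want[++k] = parts[i] + 0
        }
      }
    }
    {
      # Print the wanted fields in spec order, separated by OFS.
      for (i = 1; i <= k; i++)
        printf "%s%s", $(want[i]), (i < k ? OFS : ORS)
    }
  ' "$2"
}
```

With this defined, `extract_cols "1-10,20-25,30,33" infile.csv > outfile.csv` would cover the case in the question. Unlike cut, it prints the fields in the order given in the spec, so it can also reorder columns.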
Something else to consider, which is both faster and more concise:
cut -d "," -f1-10,20-25,30,33 infile.csv > outfile.csv
As to the second part of your question, I would probably write a script in Perl that knows how to handle header rows, parsing the column names from stdin or a file and then doing the filtering. It's probably a tool I would want to have for other things. I'm not sure about doing it in a one-liner, although I'm sure it can be done.
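This is not the Perl script described above, but a rough awk sketch of the same idea: read the header row, map each name to its column number, then print only the requested columns. `extract_by_name` is a hypothetical helper name, and the sketch assumes the header is the first line, the names are unique, and no field contains a quoted comma.

```shell
# extract_by_name: hypothetical helper, not a standard tool.
# Usage: extract_by_name "v1,v3" infile.csv
extract_by_name() {
  awk -F, -v OFS=, -v names="$1" '
    NR == 1 {
      # Map each header name to its column number.
      for (i = 1; i <= NF; i++) col[$i] = i
      n = split(names, want, ",")
      # Stop early if a requested name is not in the header.
      for (i = 1; i <= n; i++) {
        if (!(want[i] in col)) {
          print "unknown column: " want[i] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      # Print the requested columns (header row included) in the order asked for.
      for (i = 1; i <= n; i++)
        printf "%s%s", $(col[want[i]]), (i < n ? OFS : ORS)
    }
  ' "$2"
}
```

For example, `extract_by_name "v1,v2,v30" infile.csv > outfile.csv` would keep those three columns along with their header row.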