Extract specific columns from delimited file using Awk

Question

Sorry if this is too basic. I have a csv file where the columns have a header row (v1, v2, etc.). I understand that to extract columns 1 and 2, I have to do: awk -F "," '{print $1 "," $2}' infile.csv > outfile.csv. But what if I have to extract, say, columns 1 to 10, 20 to 25, and 30, 33? As an addendum, is there any way to extract directly with the header names rather than with column numbers?

Recommended answer

I don't know if it's possible to do ranges in awk. You could do a for loop, but you would have to add handling to filter out the columns you don't want. It's probably easier to do this:

awk -F, '{OFS=",";print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$20,$21,$22,$23,$24,$25,$30,$33}' infile.csv > outfile.csv

Something else to consider, which is faster and more concise:

cut -d "," -f1-10,20-25,30,33 infile.csv > outfile.csv
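
The for loop mentioned above can be sketched roughly as follows. This is only an illustration, not a built-in awk feature: the `spec` variable and its cut-style syntax (single numbers and `N-M` ranges) are assumptions made for this example, and the sample `infile.csv` is generated with made-up data so the snippet runs on its own.

```shell
# Build a 35-column sample file (hypothetical data) with a header row v1..v35.
awk 'BEGIN {
    for (r = 0; r < 3; r++) {
        line = (r == 0 ? "v1" : r * 100 + 1)
        for (c = 2; c <= 35; c++)
            line = line "," (r == 0 ? "v" c : r * 100 + c)
        print line
    }
}' > infile.csv

# spec is an assumed, cut-style column list: single numbers and N-M ranges.
awk -F, -v OFS=, -v spec="1-10,20-25,30,33" '
BEGIN {
    # Expand the spec into an ordered list of column numbers.
    n = split(spec, parts, ",")
    for (i = 1; i <= n; i++) {
        m = split(parts[i], range, "-")
        first = range[1] + 0
        last = (m == 2 ? range[2] : range[1]) + 0
        for (c = first; c <= last; c++)
            cols[++ncols] = c
    }
}
{
    # Join the selected fields with OFS and print them.
    out = $(cols[1])
    for (i = 2; i <= ncols; i++)
        out = out OFS $(cols[i])
    print out
}' infile.csv > outfile.csv
```

Expanding the ranges once in `BEGIN` means each input line only walks the list of wanted columns, so no per-field filtering is needed. Like the commands above, this splits on every comma and does not understand quoted fields that contain commas.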

As to the second part of your question, I would probably write a script in perl that knows how to handle header rows, parsing the column names from stdin or a file and then doing the filtering. It's probably a tool I would want to have for other things. I am not sure about doing it in a one-liner, although I am sure it can be done.
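
For comparison, selecting by header name can also be sketched in awk itself rather than perl. This is a minimal sketch under some assumptions: the `want` variable, the file names, and the sample data are invented for the example, header names are assumed to be unique, and fields are assumed not to contain quoted commas.

```shell
# Small sample file (hypothetical data) with a header row.
printf '%s\n' 'v1,v2,v3,v4' '10,20,30,40' '11,21,31,41' > sample.csv

# want is an assumed comma-separated list of header names, in output order.
awk -F, -v OFS=, -v want="v3,v1" '
NR == 1 {
    # Map each header name to its column number.
    for (c = 1; c <= NF; c++)
        pos[$c] = c
    n = split(want, names, ",")
    for (i = 1; i <= n; i++) {
        if (!(names[i] in pos)) {
            print "unknown column: " names[i] > "/dev/stderr"
            exit 1
        }
        cols[i] = pos[names[i]]
    }
}
{
    # This block also runs for the header row, so the output keeps a header.
    out = $(cols[1])
    for (i = 2; i <= n; i++)
        out = out OFS $(cols[i])
    print out
}' sample.csv > by_name.csv
```

The header row is read once to translate names into positions, after which every line, header included, is printed using those positions. Columns come out in the order given in `want`, not in file order.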
