How to print certain columns only when they contain numbers in awk


Question

I have a text file in which the number of columns varies from line to line.

I want to print a line only if columns 3, 4 and 5 of that line contain nothing but numbers.

The catch is that columns 3, 4 and 5 occasionally have the special characters "(" or ")" embedded in them, and I want to print those numbers too.

cat $filename | awk '{ if ( ($3 != "^[0-9]") && ($4 != "^[0-9]") && ($5 != "^[0-9]") ) print $2, $3, $4, $5 }' >>text.dat

But it also prints things such as Au2, Cu2, etc.

Any suggestions?

Update:

A relevant part of the input text file looks like this:

Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)
.
.
.

What I want is the following:

Cu 0.00000 0.094635 0.094635
Cu 0.00000 0.125943 0.125943
.
.
.

Note that "Cu" is the string from the second column of the original input file, and that the parentheses and the numbers inside them have been removed from columns 4 and 5. Note also that parentheses could appear in column 3 as well, and that the number inside the parentheses could be a single digit.

Recommended answer

In your code:

 ($3 != "^[0-9]") && ($4 != "^[0-9]") && ($5 != "^[0-9]") 

!= means "not equal to". It compares the field with the literal string "^[0-9]"; it does not do regex match testing.

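Try $3~/^[0-9.]+$/ && $4~/^[0-9.]+$/ and so on. The ^ and $ anchors make the whole field match, so a value such as Cu2 is rejected.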
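To make the difference between the two operators concrete, here is a small sketch. The sample values Cu2 and 0.00000 come from the question; the variable names neq, cu and num are only illustrative.

```shell
# != compares strings: "Cu2" is not the literal text "^[0-9]", so the test is true.
neq=$(echo "Cu2" | awk '{ if ($1 != "^[0-9]") print "true"; else print "false" }')

# ~ performs a regex match; the ^ and $ anchors require the whole field to be numeric.
cu=$(echo "Cu2" | awk '{ if ($1 ~ /^[0-9.]+$/) print "match"; else print "no match" }')
num=$(echo "0.00000" | awk '{ if ($1 ~ /^[0-9.]+$/) print "match"; else print "no match" }')

echo "$neq / $cu / $num"
```

The first test is true for almost any field, which is why the original command printed nearly every line.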

For the ( or ) problem: before you check the regex match on $3, $4 and $5, replace all ( or ) in those fields with "" and then do the match testing.

I hope the explanation above is clear enough.

Edit

awk '{for(i=3;i<=5;i++)gsub(/\([^)]*\)/,"",$i)} $3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/' file

The line above does the following:

  • removes (...) from $3, $4 and $5
  • checks whether $3, $4 and $5 are numbers (integer or decimal)
  • if so, prints the line
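The same three steps can be written out as a commented multi-line script, which may be easier to follow than the one-liner. This is only a sketch: the first two input lines are taken from the question, the third (Au2 ...) is an invented negative case, and the variable names input and result are illustrative.

```shell
# Sample input: two lines from the question plus one invented line that should be dropped.
input='Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)
Au2 Au n/a 0.1(2) 0.3'

result=$(printf '%s\n' "$input" | awk '
{
    # Step 1: delete every parenthesized group from fields 3 to 5.
    for (i = 3; i <= 5; i++)
        gsub(/\([^)]*\)/, "", $i)

    # Step 2: keep the line only if each of the three fields is made of digits and dots.
    if ($3 ~ /^[0-9.]+$/ && $4 ~ /^[0-9.]+$/ && $5 ~ /^[0-9.]+$/)
        print    # Step 3: print the (modified) line
}')

printf '%s\n' "$result"
```

Because gsub modifies the fields in place, awk rebuilds the line from the cleaned fields, so the printed line no longer contains the parenthesized parts.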

With your example:

kent$ echo "Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)" | awk '{for(i=3;i<=5;i++)gsub(/\([^)]*\)/,"",$i)} $3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/'
Cu1 Cu 0.00000 0.094635 0.094635
Cu2 Cu 0.00000 0.125943 0.125943

To print only $2, $3, $4 and $5:

awk '{for(i=3;i<=5;i++)gsub(/\([^)]*\)/,"",$i);if($3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/)print $2,$3,$4,$5}' file
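A character class of digits and dots is a loose test: it would also accept values such as 1.2.3 or a lone dot. If that matters for your data, a stricter pattern is one possible refinement. The sketch below is an assumption about what counts as a number (digits, optionally followed by one dot and more digits); the test values and the variable name strict are made up for illustration.

```shell
strict=$(printf '%s\n' "0.125943" "15" "1.2.3" "." | awk '
    # Digits, optionally followed by a single dot and more digits.
    $1 ~ /^[0-9]+(\.[0-9]+)?$/ { print $1 }
')
printf '%s\n' "$strict"
```

Only 0.125943 and 15 pass this check. Signs and exponents are not handled and would need a longer pattern.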

