How to print certain columns only when they contain numbers in awk
Problem description
I have a text file in which the number of columns varies from line to line.
I want to print a line only if columns 3, 4 and 5 of that line contain nothing but numbers.
The trick is that columns 3, 4 and 5 occasionally have a special character, "(" or ")", embedded in them, and I want to print those numbers too.
cat $filename | awk '{ if ( ($3 != "^[0-9]") && ($4 != "^[0-9]") && ($5 != "^[0-9]") ) print $2, $3, $4, $5 }' >>text.dat
But it also prints things such as Au2, Cu2, etc.
Any suggestions?
Update:
A relevant part of the input text file looks like this:
Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)
.
.
.
What I want is the following:
Cu 0.00000 0.094635 0.094635
Cu 0.00000 0.125943 0.125943
.
.
.
Note that "Cu" comes from the second column of the original input file, and that I have removed the numbers and parentheses from columns 4 and 5. Parentheses may appear in column 3 as well, and the number inside the parentheses can be a single digit.
Recommended answer
In your code:
($3 != "^[0-9]") && ($4 != "^[0-9]") && ($5 != "^[0-9]")
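!= means "not equal to". It compares the field with the literal string "^[0-9]" and does no regex match testing.
Use the ~ operator for regex matching, for example $3~/[0-9]+/ && $4~/[0-9]+/, and so on. Note that /[0-9]+/ only checks that the field contains a digit somewhere, so a value like Cu2 would still match; anchor it as /^[0-9]+$/ to require that the whole field is numeric.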
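To make the difference concrete, here is a minimal sketch that applies both tests to the same field. The three input values are made up for illustration, and the variable names differs and matches are hypothetical.

```shell
# Compare a literal string test (!=) with a regex test (~) on the same field.
# The three sample values are invented for this illustration.
result=$(printf '%s\n' '0.5' 'Au2' '^[0-9]' |
  awk '{
    differs = ($1 != "^[0-9]") ? "yes" : "no"   # plain string comparison
    matches = ($1 ~ /^[0-9]/)  ? "yes" : "no"   # regex: field starts with a digit
    print $1, differs, matches
  }')
printf '%s\n' "$result"
```

Only a field that is literally the text ^[0-9] fails the != test, which is why the original condition let almost every line through.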
For the ( or ) problem: before you check the regex match on $3, $4 and $5, replace every ( or ) in those fields with "", then do the match testing.
I hope the explanation above is clear enough.
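A minimal sketch of that idea, assuming an anchored pattern of digits and dots counts as "numeric". The second input row is made up to show a rejected line.

```shell
# Strip "(" and ")" from columns 3 to 5, then test each column with a regex.
# Only the parenthesis characters are removed; the digits inside them stay.
stripped=$(printf '%s\n' 'Cu1 Cu 0.00000 0.094635(14) 0.094635(14)' 'Au2 Au x 1(2) 3' |
  awk '{
    for (i = 3; i <= 5; i++)
      gsub(/[()]/, "", $i)   # delete every "(" or ")" in the field
    if ($3 ~ /^[0-9.]+$/ && $4 ~ /^[0-9.]+$/ && $5 ~ /^[0-9.]+$/)
      print                  # all three columns are digits and dots only
  }')
printf '%s\n' "$stripped"
```

With this approach 0.094635(14) becomes 0.09463514, because the digits that were inside the parentheses are kept and joined to the rest of the number.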
Edit
awk '{for(i=3;i<=5;i++)gsub(/\([^\)]*\)/,"",$i)}$3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/' file
The line above does the following:
- removes (...) from $3, $4, $5
- checks whether $3, $4, $5 are numbers (integer or decimal)
- if so, prints the line
With your example:
kent$ echo "Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)"|awk '{for(i=3;i<=5;i++)gsub(/\([^\)]*\)/,"",$i)}$3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/'
Cu1 Cu 0.00000 0.094635 0.094635
Cu2 Cu 0.00000 0.125943 0.125943
To print only $2, $3, $4, $5:
awk '{for(i=3;i<=5;i++)gsub(/\([^\)]*\)/,"",$i);if($3~/^[0-9.]+$/&&$4~/^[0-9.]+$/&&$5~/^[0-9.]+$/)print $2,$3,$4,$5}' file
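For readability, the same logic can be spelled out as a commented script. This is a sketch rather than a definitive version: the first two input rows come from the question, while the lone "." row and the Au2 row are made up to show lines that get filtered out.

```shell
# Sample input: two rows from the question plus two invented rows
# (a lone "." and a row whose third column is not numeric).
input='Cu1 Cu 0.00000 0.094635(14) 0.094635(14)
Cu2 Cu 0.00000 0.125943(15) 0.125943(15)
.
Au2 Au x 0.1(2) 0.3(4)'

output=$(printf '%s\n' "$input" | awk '
{
  # Step 1: drop any parenthesised group, e.g. "(14)", from columns 3 to 5.
  for (i = 3; i <= 5; i++)
    gsub(/\([^)]*\)/, "", $i)

  # Step 2: keep the row only if all three columns are digits and dots.
  if ($3 ~ /^[0-9.]+$/ && $4 ~ /^[0-9.]+$/ && $5 ~ /^[0-9.]+$/)
    print $2, $3, $4, $5
}')
printf '%s\n' "$output"
```

The pattern /^[0-9.]+$/ is deliberately simple: it accepts any mix of digits and dots and does not handle signs or exponents, which the sample data does not contain.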