BASH - Summarising information from several fields in unique field using Loop and If statements

Views: 121 Posted: 2018/7/17 9:31:56 Tags: bash loops if-statement awk multiple-columns


Problem description


I have the following tab-separated file:

A1      A1      0       0       2       1       1 1     1 1     1 1     2 1     1 1
A2      A2      0       0       2       1       1 1     1 1     1 1     1 1     1 1
A3      A3      0       0       2       2       1 1     2 2     1 1     1 1     1 1
A5      A5      0       0       2       2       1 1     1 1     1 1     1 2     1 1

The idea is to summarise the information from column 7 (inclusive) to the end of each row in a single new column that replaces those columns as the last column of the row.

To do so, these are the rules:

If the total number of "2"s in the row (between column 7 and the end) is 0: add "1 1" to the new last column
If the total number of "2"s in the row (between column 7 and the end) is 1: add "1 2" to the new last column
If the total number of "2"s in the row (between column 7 and the end) is 2 or more: add "2 2" to the new last column

I started to extract the columns I want to work on using the command:

awk '{for (i = 7; i <= NF; i++) printf $i " "; print ""}' myfile.ped > tmp_myfile.txt

Then I count the number of occurrences of "2" in each row using:

sed 's/[^2]//g' tmp_myfile.txt | awk '{print NR, length }' > tmp_occurences.txt

Which outputs:

Then my idea was to write a loop that goes through the lines and adds the new summary column. I was thinking of this kind of structure, based on what I found here: http://www.thegeekstuff.com/2010/06/bash-if-statement-examples:

while read line ;
do
    set $line

    If ["$2"==0]
    then
        $3=="1 1"

    elif ["$2"==1 ]
    then
        $3=="1 2"

    elif ["$2">=2 ]
    then 
        $3=="2 2"

    else
        print ["error"]

    fi
done < tmp_occurences.txt

But I am stuck here. Do I have to create the new column before starting the loop? Am I going in the right direction?

Ideally, the final output (after merging the first 6 columns from the initial file and the summary column) would be:

A1      A1      0       0       2       1       1 2
A2      A2      0       0       2       1       1 1
A3      A3      0       0       2       2       2 2
A5      A5      0       0       2       2       1 2

Thank you for your help!

Solution

Using gnu-awk you can do:

awk -v OFS='\t' '{
   c=0;
   for (i=7; i<=NF; i++)
      if ($i==2)
         c++
   if (c==0)
      s="1 1"
   else if (c==1)
      s="1 2"
   else
      s="2 2"
   NF=6
   print $0, s
}' file

A1  A1  0   0   2   1   1 2
A2  A2  0   0   2   1   1 1
A3  A3  0   0   2   2   2 2
A5  A5  0   0   2   2   1 2
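
For comparison, the while/if structure from the question can also be made to work in plain shell, without the intermediate files. The following is only a sketch: the function name `summarise_ped` is hypothetical, it assumes every data row has at least six columns, and it splits on spaces and tabs the same way awk does by default. It avoids bash-only features so it should also run under a POSIX `sh`.

```bash
# Sketch of a shell-loop version of the summary rule.
# summarise_ped is a hypothetical helper name, not part of the original answer.
# Note: variables are global because POSIX sh has no "local".
summarise_ped() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Split the line into positional parameters on spaces and tabs.
        set -f              # disable globbing while splitting
        set -- $line
        set +f

        # Skip rows that do not have the six leading columns.
        [ "$#" -ge 6 ] || continue

        # Keep the first six columns, joined with tabs.
        prefix=$(printf '%s\t%s\t%s\t%s\t%s\t%s' "$1" "$2" "$3" "$4" "$5" "$6")
        shift 6

        # Count the fields equal to 2 from column 7 to the end.
        count=0
        for field in "$@"; do
            if [ "$field" = 2 ]; then
                count=$((count + 1))
            fi
        done

        # Map the count to the summary value.
        if [ "$count" -eq 0 ]; then
            summary="1 1"
        elif [ "$count" -eq 1 ]; then
            summary="1 2"
        else
            summary="2 2"
        fi

        printf '%s\t%s\n' "$prefix" "$summary"
    done
}

# Usage: summarise_ped < myfile.ped
```

The main differences from the loop in the question are that `[` needs spaces around its arguments, numeric comparisons use `-eq` rather than `==` or `>=`, and the result is printed with `printf` instead of being assigned to `$3`. For large files the awk version is likely to be considerably faster.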

PS: If not using gnu-awk you can use:

awk -v OFS='\t' '{c=0; for (i=7; i<=NF; i++) {if ($i==2) c++; $i=""} if (c==0) s="1 1"; else if (c==1) s="1 2"; else s="2 2"; NF=6; print $0, s}' file
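
Since the effect of assigning to `NF` has historically varied between awk implementations, another option is to avoid touching `NF` at all and build the output line from the first six fields explicitly. This is a sketch under that assumption, not part of the original answer; the wrapper name `summarise_awk` is hypothetical.

```bash
# Sketch: same counting rule, but without assigning to NF.
# summarise_awk is a hypothetical wrapper name.
summarise_awk() {
    awk '{
        # Count the fields equal to 2 from column 7 to the end.
        c = 0
        for (i = 7; i <= NF; i++)
            if ($i == 2)
                c++

        # Map the count to the summary value.
        if (c == 0)
            s = "1 1"
        else if (c == 1)
            s = "1 2"
        else
            s = "2 2"

        # Join fields 1-6 with tabs, then append the summary.
        out = $1
        for (i = 2; i <= 6; i++)
            out = out "\t" $i
        print out "\t" s
    }' "$@"
}

# Usage: summarise_awk myfile.ped
```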
