awk to match file 1 with file 2 and average field 7

Views: 157  Posted: 2016/7/28 16:49:10  awk


Problem description

I am trying to match all of the names in file1 against file2 and average the values for each match. The name to match is the part of $5 before the | symbol, and the average is taken over the $7 values of all rows that share the same $4. Thank you :).
file1
AGRN 
CYP2J2

file2
chr1    955543  955763  chr1:955543 AGRN-6|gc=75    1   2
chr1    955543  955763  chr1:955543 AGRN-6|gc=75    2   2
chr1    955543  955763  chr1:955543 AGRN-6|gc=75    3   2
chr1    957571  957852  chr1:957571 AGRN-7|gc=61.2  1   148
chr1    957571  957852  chr1:957571 AGRN-7|gc=61.2  2   149
chr1    957571  957852  chr1:957571 AGRN-7|gc=61.2  3   151
chr1    60381600    60381782    chr1:60381600   CYP2J2-1596|gc=40.7 153 274
chr1    60381600    60381782    chr1:60381600   CYP2J2-1596|gc=40.7 154 273

Desired output (tab-delimited)
chr1:955543     AGRN-6     2
chr1:957571     AGRN       149.3
chr1:60381600   CYP2J2-1596     153.5

So far I have tried:
awk '
 FNR==NR{d[$0]; next;}          
 {                              
     for(k in d){               
         pat="(^|;)"k":";       
         if($5 ~ pat){
             print;             
             break;
         }
     }
 }' file1 file2 > output.bed

The awk does run, but the output file, as of now, is 0 bytes. Thank you :).
Recommended answer
The script should look like this:
test.awk
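BEGIN {
  # Split on runs of spaces, tabs or "|", so "AGRN-6|gc=75" becomes
  # two fields and the original column 7 is addressed as $8
  FS="[ \t|]+"
}
# Read search terms from file1 into 's'
FNR==NR {
    # Use $1 rather than $0 so trailing whitespace is ignored
    if($1 != "") s[$1]
    next
}
{
    # Check if $5 matches one of the search terms
    for(i in s) {
        if($5 ~ i) {

            # Store first two fields for later usage
            a[$5]=$1
            b[$5]=$2

            # Add $8 to the running total per $5
            t[$5]+=$8
            # Increment count of occurrences of $5
            c[$5]++

            next
        }
    }
}
END {

    # Calculate the average and print output for all search terms
    # that have been found (the order of for-in is unspecified)
    for( i in t ) {
        avg = t[i] / c[i]
        printf("%s:%s\t%s\t%s\n", a[i], b[i], i, avg)
    }
}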

Call it like this:
awk -f test.awk file1 file2
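
The field numbers in test.awk can look off by one, because the | in the field separator splits the fifth column in two. A minimal check, assuming a tab-separated line shaped like the file2 sample (the variable names line and fields are only for illustration, and the separator is written with + so that it cannot match an empty string):

```shell
# One sample row from file2, written with tabs between the columns
line=$(printf 'chr1\t955543\t955763\tchr1:955543\tAGRN-6|gc=75\t1\t2')

# Print the field count, the two halves of the old column 5, and the value column
fields=$(printf '%s\n' "$line" |
    awk 'BEGIN { FS = "[ \t|]+" } { print NF, $5, $6, $8 }')

printf '%s\n' "$fields"
```

This prints 8 fields in total: the name ends up in $5, the gc= part in $6, and the value the question calls $7 in $8.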

By the way, the third average in your expected output is wrong. The output should look like this:
chr1:955543 AGRN-6  2
chr1:957571 AGRN-7  149.333
chr1:60381600   CYP2J2-1596 273.5
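
As for why the attempt in the question wrote an empty file: its pattern requires a colon directly after the name, but in file2 the name is followed by a hyphen and a number, so no row ever matches. The sketch below compares that pattern with an anchored alternative. The anchored form assumes every name in file2 is followed by a hyphen, which holds for the sample data but is not stated in the question; the variable names are illustrative.

```shell
field='AGRN-6|gc=75'   # column 5 of a file2 row
name='AGRN'            # one search term from file1

# Pattern built the way the question's script builds it: name followed by ":"
old=$(awk -v f="$field" -v k="$name" \
    'BEGIN { pat = "(^|;)" k ":"; print (f ~ pat) }')

# Assumed alternative: name at the start of the field, followed by "-"
new=$(awk -v f="$field" -v k="$name" \
    'BEGIN { pat = "^" k "-"; print (f ~ pat) }')

printf 'original pattern matches: %s\nanchored pattern matches: %s\n' "$old" "$new"
```

A result of 0 means no match and 1 means a match. Anchoring the name in this way also keeps a short name from matching inside a longer, unrelated one, which the plain $5 ~ i test in test.awk does not guard against.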

