Combining and processing 2 tab-separated files in awk to make a new one

Question

I have 2 tab-separated files with 2 columns each. Column 1 is a number and column 2 is an ID, as in these 2 examples:

Example file 1:

188 TPT1
133 ACTR2
420 ATP5C1
942 DNAJA1

Example file 2:

91  PSMD7
2217    TPT1
223 ATP5C1
156 TCP1

I want to find the rows the 2 files have in common based on column 2 (the ID) and write a new tab-separated file with 4 columns: column 1 is the common ID, column 2 is the number from file1, column 3 is the number from file2, and column 4 is the log2 of the ratio of columns 2 and 3, that is, log2(column2/column3). For example, for the ID "TPT1": column 1 is TPT1, column 2 is 188, column 3 is 2217, and column 4 is log2(188/2217), which is equal to -3.561494. Here is the expected output:

Expected output:

TPT1    188 2217    -3.561494
ATP5C1  420 223 0.9133394

I am trying to do that in AWK using the following code:

awk 'NR==FNR { n[$2]=$0;next } ($2 in n) { print n[$2 '\t' $1] '\t' $1 '\t' log(n[$1]/$1)}' file1.txt file2.txt  > result.txt

This code does not return what I expect. Do you know how to fix it?

Recommended answer

$ awk -v OFS="\t" 'NR==FNR {n[$2]=$1; next} ($2 in n) {print $2, n[$2], $1, log(n[$2]/$1)/log(2)}' file1 file2
TPT1	188	2217	-3.5598
ATP5C1	420	223	0.913346
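
For readers unfamiliar with the idiom, here is a commented sketch of the same approach. The attempt in the question fails for a few reasons: `n[$2]=$0` stores the whole line rather than just the number; each `'\t'` inside the single-quoted script closes the shell quote, so awk sees an uninitialized variable `t` instead of a tab; `n[$1]` looks the value up by number instead of by ID; and awk's `log()` is the natural logarithm, not log2. The file names `file1.txt`, `file2.txt` and `result.txt` are taken from the question. The check that skips non-positive numbers and the four-decimal output format are assumptions added for illustration, not part of the original answer.

```shell
# Recreate the two sample inputs from the question (tab-separated: number, ID).
printf '188\tTPT1\n133\tACTR2\n420\tATP5C1\n942\tDNAJA1\n' > file1.txt
printf '91\tPSMD7\n2217\tTPT1\n223\tATP5C1\n156\tTCP1\n' > file2.txt

awk -F'\t' '
    # First file (NR == FNR holds only while reading it): store number by ID.
    NR == FNR { n[$2] = $1; next }

    # Second file: keep only IDs also seen in the first file. Rows with a
    # non-positive number are skipped because the ratio or its logarithm
    # would be undefined (assumed policy, not stated in the question).
    ($2 in n) && ($1 + 0 > 0) && (n[$2] + 0 > 0) {
        # awk has only the natural log, so log2(x) = log(x) / log(2).
        printf "%s\t%s\t%s\t%.4f\n", $2, n[$2], $1, log(n[$2] / $1) / log(2)
    }
' file1.txt file2.txt > result.txt

cat result.txt
```

Rows come out in the order they appear in the second file, since that is the file being scanned when lines are printed. Note that the `NR == FNR` test assumes the first file is not empty; with an empty first file the condition would also be true for the second file.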
