Combining and processing 2 tab-separated files in awk to make a new one
Question
I have two tab-separated files with two columns each. Column 1 is a number and column 2 is an ID, as in these two examples:
Example file 1:
188 TPT1
133 ACTR2
420 ATP5C1
942 DNAJA1
Example file 2:
91 PSMD7
2217 TPT1
223 ATP5C1
156 TCP1
I want to find the rows common to both files based on column 2 (the ID) and write a new tab-separated file with 4 columns: column 1 is the common ID, column 2 is the number from file1, column 3 is the number from file2, and column 4 is the log2 of the ratio of columns 2 and 3, i.e. log2(column2/column3). For example, for the ID "TPT1": column 1 is TPT1, column 2 is 188, column 3 is 2217, and column 4 is log2(188/2217), which is equal to -3.561494.
Expected output:
TPT1 188 2217 -3.561494
ATP5C1 420 223 0.9133394
I am trying to do this in AWK using the following code:
awk 'NR==FNR { n[$2]=$0;next } ($2 in n) { print n[$2 '\t' $1] '\t' $1 '\t' log(n[$1]/$1)}' file1.txt file2.txt > result.txt
This code does not return what I expect. Do you know how to fix it?
Recommended answer
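Store only the number from file1 (not the whole line), keyed by ID, and divide by log(2), since awk's log() is the natural logarithm. Printing n[$2] before $1 puts the file1 number in column 2 and the file2 number in column 3, as requested:
$ awk -v OFS="\t" 'NR==FNR {n[$2]=$1;next} ($2 in n) {print $2, n[$2], $1, log(n[$2]/$1)/log(2)}' file1 file2
TPT1 188 2217 -3.5598
ATP5C1 420 223 0.913346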
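For reference, the attempt in the question fails for several reasons. It stores the whole line ($0) instead of just the number. The '\t' pieces sit inside a single-quoted shell string, so they close the quoting and awk never sees a tab. n[$1] looks the array up by number rather than by ID. Finally, log() returns the natural logarithm, not log2. The following is a minimal sketch of the same join, wrapped in a shell function for reuse. The function name join_log2, the explicit -F'\t', and the guard against non-positive numbers are assumptions added for illustration, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: join two "number<TAB>ID" files on the ID column and
# print ID, number from file1, number from file2, log2(file1/file2).
join_log2() {
  awk -F'\t' -v OFS='\t' '
    # First file: remember the number for each ID.
    NR == FNR { n[$2] = $1; next }
    # Second file: keep IDs also seen in the first file.
    # Skipping non-positive values is an added assumption to avoid log(0).
    ($2 in n) && $1 > 0 && n[$2] > 0 {
      print $2, n[$2], $1, log(n[$2] / $1) / log(2)
    }
  ' "$1" "$2"
}

# Recreate the sample data from the question (tab-separated).
printf '188\tTPT1\n133\tACTR2\n420\tATP5C1\n942\tDNAJA1\n' > file1.txt
printf '91\tPSMD7\n2217\tTPT1\n223\tATP5C1\n156\tTCP1\n' > file2.txt

# Write the joined table to a new file.
join_log2 file1.txt file2.txt > result.txt
```

Rows are emitted in the order the IDs appear in the second file. Note that the NR==FNR idiom assumes the first file is not empty; with an empty file1, the second file would be treated as the lookup table.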