How to compare multiple columns in two files and retrieve the corresponding value from another column if a match is found

Question

I have two files, File1.txt and File2.txt. I need to compare columns 1, 2 and 3 of File1 with columns 4, 5 and 6 of File2, respectively. If a match is found, I want to retrieve the corresponding value from column 2 of File2 and write it to Output.txt. Sample files are shown below:

File1.txt

ASP B   276
ASN B   290
ALA B   294
ALA B   297
ARG B   298
ARG B   303
LYS D   288

File2.txt

ATOM    4770    N   ALA A   346 71.417  37.005  4.562   1   0   N
ATOM    4778    C   ALA A   346 72.003  34.855  3.476   1   0   C
ATOM    4779    O   ALA A   346 72.956  34.2    3.103   1   0   O
ATOM    4859    N   SER A   353 78.218  33.415  -2.595  1   0   N
ATOM    4867    HG  SER A   353 78.828  31.548  0.899   1   0   
ATOM    4868    C   SER A   353 79.637  31.351  -2.619  1   0   C
ATOM    4869    O   SER A   353 80.669  30.76   -2.372  1   0   O
ATOM    9771    N   ASP B   238 52.86   30.061  -7.031  1   0   N
ATOM    9772    H   ASP B   238 53.651  30.105  -7.641  1   0   
ATOM    9776    HB1 ASP B   238 53.516  32.92   -5.486  1   0   
ATOM    10352   H   ASP B   276 11.565  35.255  6.968   1   0   
ATOM    10356   HB1 ASP B   276 10.084  33.659  6.727   1   0   
ATOM    10357   HB2 ASP B   276 10.331  32.059  6.945   1   0   
ATOM    10358   CG  ASP B   276 9.946   33.07   8.681   1   0   C
ATOM    10453   H   ASN B   290 16.73   30.519  13.339  1   0   
ATOM    10454   CA  ASN B   290 18.755  31.013  13.763  1   0   C
ATOM    10458   HB2 ASN B   290 20.105  29.465  13.891  1   0   
ATOM    10459   CG  ASN B   290 18.471  28.842  14.99   1   0   C
ATOM    10460   OD1 ASN B   290 18.246  29.429  16.072  1   0   O
ATOM    10512   H   ALA B   294 24.099  33.167  8.943   1   0   
ATOM    10513   CA  ALA B   294 26.095  33.794  9.273   1   0   C
ATOM    10514   HA  ALA B   294 26.597  34.261  8.545   1   0   
ATOM    10515   CB  ALA B   294 25.515  34.817  10.199  1   0   C
ATOM    10556   H   ALA B   297 28.288  31.299  7.752   1   0   
ATOM    10557   CA  ALA B   297 30.202  31.869  7.061   1   0   C
ATOM    10558   HA  ALA B   297 30.566  31.457  6.226   1   0   
ATOM    10566   H   ARG B   298 30.012  32.059  9.568   1   0   
ATOM    10567   CA  ARG B   298 31.961  32.047  10.392  1   0   C
ATOM    10568   HA  ARG B   298 32.532  32.853  10.237  1   0   
ATOM    10569   CB  ARG B   298 31.251  32.167  11.74   1   0   C
ATOM    10650   HE  ARG B   303 36.405  23.564  2.394   1   0   
ATOM    10651   CZ  ARG B   303 34.807  22.582  3.07    1   0   C
ATOM    10652   NH1 ARG B   303 33.867  22.493  3.991   1   0   N
ATOM    10653  1HH1 ARG B   303 33.829  23.162  4.733   1   0   
ATOM    10654  2HH1 ARG B   303 33.192  21.757  3.947   1   0   
ATOM    10655   NH2 ARG B   303 34.847  21.706  2.081   1   0   N
ATOM    17143   OE1 GLU C   295 59.322  13.561  -6.631  1   0   O
ATOM    17144   OE2 GLU C   295 57.646  14.02   -7.941  1   0   O
ATOM    17145   C   GLU C   295 54.718  13.527  -3.448  1   0   C
ATOM    17146   O   GLU C   295 54.509  14.618  -2.982  1   0   O
ATOM    23627   HB1 LYS D   288 32.909  52.854  29.282  1   0   
ATOM    23628   HB2 LYS D   288 31.41   53.372  29.672  1   0   
ATOM    23629   CG  LYS D   288 32.811  53.749  31.138  1   0   C
ATOM    23630   HG1 LYS D   288 32.137  53.82   31.873  1   0   
ATOM    23631   HG2 LYS D   288 33.636  53.303  31.484  1   0   
ATOM    23632   CD  LYS D   288 33.168  55.146  30.656  1   0   C

Output.txt should contain the column 2 value from File2 only for rows where columns 4, 5 and 6 of File2 match columns 1, 2 and 3 of a row in File1.

Output.txt

10352
10356
10357
10358
10453
10454
10458
10459
10460
10512
10513
10514
10515
10556
10557
10558
10566
10567
10568
10569
10650
10651
10652
10653
10654
10655
23627
23628
23629
23630
23631
23632

I tried the awk one-liner below. The script ran, but it retrieved different column 2 values from File2 than the ones I expected. I need help resolving this and finding out where I am going wrong.

awk 'FNR==NR{a[$1,$2,$3]=$0;next}{if(b=a[$4,$5,$6]){print $2}}' File1.txt File2.txt > Output.txt

Thanks.

Asha, MBU, IISc, Bangalore, India

Recommended answer

You can use this awk command:

awk 'FNR==NR{a[$1,$2,$3]; next} ($4,$5,$6) in a{print $2}' File1.txt File2.txt

10352
10356
10357
10358
10453
10454
10458
10459
10460
10512
10513
10514
10515
10556
10557
10558
10566
10567
10568
10569
10650
10651
10652
10653
10654
10655
23627
23628
23629
23630
23631
23632
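
For readers less familiar with this idiom, here is a minimal sketch of the same command written out with comments. It runs against a handful of rows copied from the sample files in the question. The scratch directory and the names `tmp`, `seen` and `result` are illustrative and not part of the original answer, and the sketch assumes whitespace-separated fields with no stray carriage returns at the ends of lines.

```shell
#!/bin/sh
# Hypothetical scratch directory so the demo does not touch real files.
tmp=$(mktemp -d)

# Two rows copied from the File1.txt sample in the question.
printf '%s\n' 'ASP B 276' 'LYS D 288' > "$tmp/File1.txt"

# Four rows copied from the File2.txt sample (two match, two do not).
cat > "$tmp/File2.txt" <<'EOF'
ATOM    9771    N   ASP B   238 52.86   30.061  -7.031  1   0   N
ATOM    10352   H   ASP B   276 11.565  35.255  6.968   1   0
ATOM    17143   OE1 GLU C   295 59.322  13.561  -6.631  1   0   O
ATOM    23627   HB1 LYS D   288 32.909  52.854  29.282  1   0
EOF

result=$(awk '
    # First file (FNR == NR holds only while reading it): remember each key.
    FNR == NR { seen[$1, $2, $3]; next }
    # Second file: print column 2 when columns 4-6 form a remembered key.
    ($4, $5, $6) in seen { print $2 }
' "$tmp/File1.txt" "$tmp/File2.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With these rows the script prints 10352 and 23627, the two atom numbers whose residue name, chain and residue number appear in the File1 excerpt. The `in` operator tests whether a key exists without creating it, whereas referencing `a[$4,$5,$6]` directly inside a condition adds an empty entry to the array for every key it looks up.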
