Comparing multiple columns of different files and appending a column from a file if there is a match

Question

I am having a problem while accessing the columns of a file in awk. I have two files, one has 12 columns and the other has 5 columns.

1.txt
chr1 10 20 . . + chr1 30 40 ABC . +
chr2 11 22 . . + chr2 90 92 XXX . -
chrX 33 42 . . + chrX 70 80 XXX . +
chr4 3  12 . . + chr4 70 80 ZZZ . +

And:

2.txt
1 chr1 30 40 ABC
3 chr1 35 40 ABC
27 chr2 90 92 XXX
1 chrX 70 80 XXX
2 chrY 12 13 XXX

I want to compare the 2nd, 3rd, 4th and 5th columns of 2.txt with the 7th, 8th, 9th and 10th columns of 1.txt. If there is a match, it should print the whole line of 1.txt followed by the 1st column of 2.txt.

Expected output:

chr1 10 20 . . + chr1 30 40 ABC . + 1
chr2 11 22 . . + chr2 90 92 XXX . - 27
chrX 33 42 . . + chrX 70 80 XXX . + 1

As I could not compare all four columns, I tried it with two. I am able to compare two columns from each file (the 2nd and 3rd of 2.txt with the 7th and 8th of 1.txt) and print a string if there is a match, but I cannot print the first column of the first file (2.txt). My code:

awk 'NR==FNR {a[$2 FS $3];next} {print $0 FS (($7 FS $8) in a?"exists":"none")}' 2.txt 1.txt

What it produces (which I don't want):

chr1 10 20 . . + chr1 30 40 ABC . + exists
chr2 11 22 . . + chr2 90 92 XXX . - exists
chrX 33 42 . . + chrX 70 80 XXX . + exists
chr4 3  12 . . + chr4 70 80 ZZZ . + none

How can I change this new 13th column to the corresponding 1st column of 2.txt?

Recommended answer

An awk approach:

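awk 'NR==FNR { a[$2,$3,$4,$5] = $1; next }      # 2.txt: key on columns 2-5, store column 1
     { k = $7 SUBSEP $8 SUBSEP $9 SUBSEP $10 }  # 1.txt: build the same key from columns 7-10
     k in a { print $0, a[k] }' 2.txt 1.txt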

Output:

chr1 10 20 . . + chr1 30 40 ABC . + 1
chr2 11 22 . . + chr2 90 92 XXX . - 27
chrX 33 42 . . + chrX 70 80 XXX . + 1
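
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same two-pass idea. It recreates the sample files from the question in a scratch directory and uses awk's parenthesized multi-subscript test, `($7, $8, $9, $10) in seen`, instead of building the key by hand with SUBSEP. The array name `seen`, the variable `result`, and the temporary directory are illustrative choices, not part of the original answer.

```shell
# Recreate the sample data in a scratch directory (file names follow the question).
tmp=$(mktemp -d)
cat > "$tmp/1.txt" <<'EOF'
chr1 10 20 . . + chr1 30 40 ABC . +
chr2 11 22 . . + chr2 90 92 XXX . -
chrX 33 42 . . + chrX 70 80 XXX . +
chr4 3  12 . . + chr4 70 80 ZZZ . +
EOF
cat > "$tmp/2.txt" <<'EOF'
1 chr1 30 40 ABC
3 chr1 35 40 ABC
27 chr2 90 92 XXX
1 chrX 70 80 XXX
2 chrY 12 13 XXX
EOF

# Pass 1 (NR==FNR holds only while reading the first file, 2.txt):
#   remember column 1 under a four-part key made of columns 2-5.
# Pass 2 (1.txt): when columns 7-10 form a known key, print the line
#   followed by the remembered value; unmatched lines are skipped.
result=$(awk '
    NR == FNR { seen[$2, $3, $4, $5] = $1; next }
    ($7, $8, $9, $10) in seen { print $0, seen[$7, $8, $9, $10] }
' "$tmp/2.txt" "$tmp/1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One thing to keep in mind: if 2.txt contains the same four-column key more than once, the later line overwrites the earlier one, so only the last first-column value is appended.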
