Join two files based on two columns


Question

Believe it or not, I've searched all over the internet and haven't found a working solution to this problem in AWK.

I have two files, A and B:

File A:

chr1   pos1   
chr1   pos2
chr2   pos1
chr2   pos2

File B:

chr1 pos1
chr2 pos1
chr3 pos2

Desired output:

chr1 pos1
chr2 pos1

I'd like to join these two files to get their intersection based on the first AND second columns, not just the first. Because of this, most simple scripts won't work, and join doesn't seem to be an option.

Any ideas?

Edit: Sorry, I didn't mention that there are more columns than just the two I showed. I've only shown two in my example because I only care that the first two columns are identical in both files; the rest of the data isn't important (but is nonetheless in the files).

Recommended answer

Hmm, my idea is the following: use join to merge the two files on the first column, then filter the result with awk.

$ join  A B 
chr1 pos1 pos1
chr1 pos2 pos1
chr2 pos1 pos1
chr2 pos2 pos1

$ join  A B | awk '{ if ($2==$3) printf("%s %s\n", $1, $2) }'
chr1 pos1
chr2 pos1
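
Since the question asks for AWK specifically, the same intersection can also be sketched with awk alone, without join. This is not the answerer's method but a common two-file idiom: the first pass stores every (column 1, column 2) pair from B, and the second pass prints the rows of A whose pair was stored. The temporary directory, the sample third column, and the names `seen` and `result` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sample data mirroring files A and B from the question,
# with an extra third column to show that trailing columns are ignored.
tmpdir=$(mktemp -d)
printf 'chr1 pos1 a1\nchr1 pos2 a2\nchr2 pos1 a3\nchr2 pos2 a4\n' > "$tmpdir/A"
printf 'chr1 pos1 b1\nchr2 pos1 b2\nchr3 pos2 b3\n' > "$tmpdir/B"

# NR==FNR holds only while awk reads the first file (B):
#   remember each (column 1, column 2) pair as an array key.
# For the second file (A): print the first two columns of every row
#   whose pair was recorded from B.
result=$(awk 'NR == FNR { seen[$1, $2]; next }
              ($1, $2) in seen { print $1, $2 }' "$tmpdir/B" "$tmpdir/A")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Unlike join, this approach does not need sorted input, but it holds all of B's key pairs in memory, and the `NR == FNR` test assumes B is not empty.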

Edit: Given the edit to the question, the join solution may still work (with options), so the concept remains correct (IMO).
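
The answer does not say which options it has in mind. One possibility, offered here as an assumption rather than the answerer's intent, is join's `-o` option, which selects the output fields so that extra columns do not shift the positions awk compares. The sketch below also sorts both inputs first, because join expects them sorted on the join field. The sample third column and the temporary file names are hypothetical.

```shell
#!/bin/sh
# Use one collation order for both sort and join.
export LC_ALL=C

# Hypothetical sample data: the question's files plus an extra third column.
tmpdir=$(mktemp -d)
printf 'chr1 pos1 a1\nchr1 pos2 a2\nchr2 pos1 a3\nchr2 pos2 a4\n' > "$tmpdir/A"
printf 'chr1 pos1 b1\nchr2 pos1 b2\nchr3 pos2 b3\n' > "$tmpdir/B"

# join expects both inputs sorted on the join field (column 1 here).
sort -k1,1 "$tmpdir/A" > "$tmpdir/A.sorted"
sort -k1,1 "$tmpdir/B" > "$tmpdir/B.sorted"

# -o 1.1,1.2,2.2 emits column 1 of A, column 2 of A, and column 2 of B.
# awk then keeps only the rows where the two second columns agree.
joined=$(join -o 1.1,1.2,2.2 "$tmpdir/A.sorted" "$tmpdir/B.sorted" |
         awk '$2 == $3 { print $1, $2 }')

printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Because join pairs every row of A with every row of B that shares column 1, the intermediate output can grow large when a first-column value repeats many times in both files.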
