Joining two files with multiple columns via AWK
Question
First of all, I must apologise: I know there are a lot of topics that already answer my question, but as you'll see for yourself, AWK isn't really a big friend of mine.
You all know the story, right? ;) "Hey, random employee, you are the chosen one! I need you to learn this strange thing that none of us knows. Your deadline is tomorrow, good luck!"
I won't complain about it anymore (promise! :p), but after many tries, I still can't really understand everything (who said "a single thing"?) about AWK.
So, here is my problem!
I have two files, with the following columns:
File A.txt:
A B C D E F G H
File B.txt:
A C F I
I want to get the following output by joining these two files into a third one:
Output file C.txt:
A B C D E F G H I
I would like to join them, appending "I" to the existing lines of A.txt that match on columns A, C and F, and dropping the other lines.
So far, I know that I must use something like this:
awk '
FNR==NR{Something ;next}
{print $0}
' A.txt B.txt
Yeah, I know. Sounds pretty bad for a start.
Any hero out there?
Recommended answer
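awk '
    # First file (A.txt): store each whole line under its (A, C, F) key
    NR==FNR {A[$1,$3,$6] = $0; next}
    # Second file (B.txt): if its key was stored, print that line plus column I
    ($1 SUBSEP $2 SUBSEP $3) in A {print A[$1,$2,$3], $4}
' A.txt B.txt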
That requires the whole of A.txt to be stored in memory. If B.txt is significantly smaller, store B.txt instead and stream A.txt:
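awk '
    # First file (B.txt): store column I under its (A, C, F) key
    NR==FNR {B[$1,$2,$3] = $4; next}
    # Second file (A.txt): if the key was stored, print the line plus column I
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' B.txt A.txt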
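As a quick sanity check, both commands can be run against a few made-up rows. This is only a sketch: the sample values (a1, b1, ...), the temporary directory and the file name C_alt.txt are hypothetical and do not come from the question, and the fields are assumed to be whitespace-separated.

```shell
# Work in a scratch directory so no real files are overwritten.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# A.txt: eight columns (A..H); values are invented for illustration.
cat > A.txt <<'EOF'
a1 b1 c1 d1 e1 f1 g1 h1
a2 b2 c2 d2 e2 f2 g2 h2
a3 b3 c3 d3 e3 f3 g3 h3
EOF

# B.txt: columns A, C, F, I; the a9 row has no match in A.txt.
cat > B.txt <<'EOF'
a3 c3 f3 i3
a1 c1 f1 i1
a9 c9 f9 i9
EOF

# Variant that stores B.txt and streams A.txt.
awk '
    NR==FNR {B[$1,$2,$3] = $4; next}
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' B.txt A.txt > C.txt

# Variant that stores A.txt and streams B.txt.
awk '
    NR==FNR {A[$1,$3,$6] = $0; next}
    ($1 SUBSEP $2 SUBSEP $3) in A {print A[$1,$2,$3], $4}
' A.txt B.txt > C_alt.txt

cat C.txt
```

On this data, C.txt holds the a1 and a3 rows of A.txt with i1 and i3 appended, and the a2 row is dropped. The two variants return the same rows but in a different order: the one that streams A.txt keeps A.txt's order, while the one that streams B.txt follows B.txt's order, so C_alt.txt lists the a3 row first.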