Joining two files with multiple columns via AWK


Question

First of all, I must apologise: I know there are already plenty of topics that answer my question, but as you'll see for yourself, AWK isn't really a big friend of mine.

You all know the story, right? ;) "Hey, random employee, you are the chosen one! I need you to learn this strange thing that none of us knows. Your deadline is tomorrow, good luck!"

I won't complain about it anymore (promise! :p), but after many tries, I still can't really understand everything (who said "a single thing"?) about AWK.

So, here is my problem!

I have two files with the following columns:

File A.txt:

A B C D E F G H

File B.txt:

A C F I

I want to get the following output by joining these two files into a third one:

Output file C.txt:

A B C D E F G H I

I would like to join them on columns A, C and F: append "I" to each line of A.txt whose A, C and F values also appear in B.txt, and drop the other lines.

So far, I know that I must use something like this:

awk '
    FNR==NR{Something ;next}
    {print $0}
' A.txt B.txt

Yeah, I know. Sounds pretty bad for a start.

Any hero out there?

Recommended answer

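awk '
    # First file (A.txt): index every full line by its columns A, C and F.
    NR==FNR {A[$1,$3,$6] = $0; next}
    # Second file (B.txt): if its key exists in A, print the stored line plus I.
    ($1 SUBSEP $2 SUBSEP $3) in A {print A[$1,$2,$3], $4}
' A.txt B.txt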

That requires the whole file A.txt to be stored in memory. If B.txt is significantly smaller, store B.txt instead and stream A.txt:

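awk '
    # First file (B.txt): remember column I under the key made of A, C and F.
    NR==FNR {B[$1,$2,$3] = $4; next}
    # Second file (A.txt): print the line plus I when its A, C, F key is in B.
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' B.txt A.txt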
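
To see the join in action, here is a small self-contained sketch of the second variant. The sample rows and the temporary directory are made up for illustration (the question does not show real data), and the files are assumed to be whitespace-separated, which is awk's default field splitting.

```shell
#!/bin/sh
# Hypothetical sample data: the question only names the columns, not the rows.
tmp=$(mktemp -d)

# A.txt has eight columns: A B C D E F G H
printf '%s\n' \
    'a1 b1 c1 d1 e1 f1 g1 h1' \
    'a2 b2 c2 d2 e2 f2 g2 h2' \
    'a3 b3 c3 d3 e3 f3 g3 h3' > "$tmp/A.txt"

# B.txt has four columns: A C F I
printf '%s\n' \
    'a1 c1 f1 i1' \
    'a3 c3 f3 i3' \
    'a9 c9 f9 i9' > "$tmp/B.txt"

# Load B.txt into memory, then stream A.txt and append I on a key match.
result=$(awk '
    NR==FNR {B[$1,$2,$3] = $4; next}
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' "$tmp/B.txt" "$tmp/A.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this data, the a1 and a3 lines of A.txt are printed with i1 and i3 appended, the a2 line is dropped because B.txt has no matching key, and the a9 line of B.txt is ignored because A.txt has no such row. Redirect the awk output to C.txt to get the file asked for in the question.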
