Using AWK to merge two files based on multiple columns
Question
I have two CSV files, with ; (semicolon) as the separator, that I need to merge based on three columns in each file using AWK. The key columns are not consecutive. The idea is to take two columns from file B and print them after all the other columns from file A.
File A (keys are in A1, A3, and A5):
A1;A2;A3;A4;A5
K1;D1;K2;D2;K3
K4;D3;K5;D4;K6
K7;D5;K8;D6;K9
K1;D7;K2;D8;K3
File B (keys are in B1, B2, and B4):
B1;B2;B3;B4;B5
K1;K2;D9;K3;D0
K4;K5;DA;K6;DB
KA;KB;DC;KC;DD
This should produce:
A1;A2;A3;A4;A5;;
K1;D1;K2;D2;K3;D9;D0
K4;D3;K5;D4;K6;DA;DB
K7;D5;K8;D6;K9;;
K1;D7;K2;D8;K3;D9;D0
I have found several examples here on SO (for example, "How to merge two files based on the first three columns using awk" and "How to merge two files using AWK?") and elsewhere, but I haven't been able to adapt them to my needs, because they aren't documented well enough for an AWK n00b like me to really understand how they work.
The closest I have got is:
awk -F \; -v OFS=\; 'FNR==NR{c[$1]=$3 FS $5;next}{ print $0, c[$1]}' B A
But it still leaves out one semicolon (that is, one column) from output lines 1 and 4:
A1;A2;A3;A4;A5;
K1;D1;K2;D2;K3;D9;D0
K4;D3;K5;D4;K6;DA;DB
K7;D5;K8;D6;K9;
K1;D7;K2;D8;K3;D9;D0
And how do I state which columns I want to use for the comparison? Apparently it is currently comparing only the first column.
Recommended answer
This prints unmatched lines without the extra ; delimiters. You have to pass file B first.
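awk 'BEGIN {
    OFS=FS=";"                         # semicolon for both input and output
}
FNR==NR {                              # true only while reading the first file (fileB)
    key[$1 FS $2 FS $4]=$3 OFS $5      # composite key B1;B2;B4 maps to "B3;B5"
}
FNR!=NR {                              # every later file (fileA)
    c=$1 FS $3 FS $5;                  # composite key A1;A3;A5
    if(c in key)
        print $0,key[c];               # match: append B3 and B5
    else
        print                          # no match: print the line unchanged
}' fileB fileA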
If you need the extra delimiters, change the last print
to print $0 OFS OFS
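
For reference, the following is a minimal sketch that applies the variant with the extra delimiters to the sample data from the question. The scratch directory and the variable names dir and merged are assumptions made for illustration only, and the next statement is a small substitution for the FNR!=NR guard used above; it is not part of the original answer. Under those assumptions, the result should match the expected output listed in the question.

```shell
# Recreate the sample files from the question in a scratch directory
# (the directory and the variable names are illustrative).
dir=$(mktemp -d)

cat > "$dir/fileA" <<'EOF'
A1;A2;A3;A4;A5
K1;D1;K2;D2;K3
K4;D3;K5;D4;K6
K7;D5;K8;D6;K9
K1;D7;K2;D8;K3
EOF

cat > "$dir/fileB" <<'EOF'
B1;B2;B3;B4;B5
K1;K2;D9;K3;D0
K4;K5;DA;K6;DB
KA;KB;DC;KC;DD
EOF

merged=$(awk 'BEGIN { OFS = FS = ";" }
FNR == NR {                          # first file on the command line: fileB
    key[$1 FS $2 FS $4] = $3 OFS $5  # B1;B2;B4 maps to "B3;B5"
    next                             # skip the fileA logic for fileB lines
}
{                                    # second file: fileA
    c = $1 FS $3 FS $5               # composite key A1;A3;A5
    if (c in key)
        print $0, key[c]             # match: append B3 and B5
    else
        print $0 OFS OFS             # no match: pad with two empty fields
}' "$dir/fileB" "$dir/fileA")

printf '%s\n' "$merged"
```

The keys are joined with FS so that the lookup string built from file A has the same shape as the one stored from file B; any separator that cannot occur inside a field would serve equally well.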