Merge two files using awk in Linux

Problem description

I have a 1.txt file:

betomak@msn.com||o||0174686211||o||7880291304ca0404f4dac3dc205f1adf||o||Mario||o||Mario||o||Kawati
zizipi@libero.it||o||174732943.0174732943||o||e10adc3949ba59abbe56e057f20f883e||o||Tiziano||o||Tiziano||o||D'Intino
frankmel@hotmail.de||o||0174844404||o||8d496ce08a7ecef4721973cb9f777307||o||Melanie||o||Melanie||o||Kiesel
apoka-paris@hotmail.fr||o||0174847613||o||536c1287d2dc086030497d1b8ea7a175||o||Sihem||o||Sihem||o||Sousou
sofianomovic@msn.fr||o||174902297.0174902297||o||9893ac33a018e8d37e68c66cae23040e||o||Nabile||o||Nabile||o||Nassime
donaldduck@yahoo.com||o||174912161.0174912161||o||0c770713436695c18a7939ad82bc8351||o||Donald||o||Donald||o||Duck
cernakova@centrum.cz||o||0174991962||o||d161dc716be5daf1649472ddf9e343e6||o||Dagmar||o||Dagmar||o||Cernakova
trgsrl@tiscali.it||o||0175099675||o||d26005df3e5b416d6a39cc5bcfdef42b||o||Esmeralda||o||Esmeralda||o||Trogu
catherinesou@yahoo.fr||o||0175128896||o||2e9ce84389c3e2c003fd42bae3c49d12||o||Cat||o||Cat||o||Sou
ermimurati24@hotmail.com||o||0175228687||o||a7766a502e4f598c9ddb3a821bc02159||o||Anna||o||Anna||o||Beratsja
cece_89@live.fr||o||0175306898||o||297642a68e4e0b79fca312ac072a9d41||o||Celine||o||Celine||o||Jacinto
kendinegel39@hotmail.com||o||0175410459||o||a6565ca2bc8887cde5e0a9819d9a8ee9||o||Adem||o||Adem||o||Bulut

A 2.txt file:

9893ac33a018e8d37e68c66cae23040e:134:@a1
536c1287d2dc086030497d1b8ea7a175:~~@!:/92\
8d496ce08a7ecef4721973cb9f777307:demodemo

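The field separator for 1.txt is "||o||" and for 2.txt it is ":". I want to merge the two files into a single file, result.txt: wherever the 3rd column of 1.txt matches the 1st column of 2.txt, it should be replaced by the 2nd column of 2.txt.

The expected output contains all the matching lines. Here is one of them: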

sofianomovic@msn.fr||o||174902297.0174902297||o||134:@a1||o||Nabile||o||Nabile||o||Nassime

I tried this script:

awk -F"||o||"  'NR==FNR{s=$0; sub(/:[^:]*$/, "", s); a[s]=$NF;next} {s = $5; for (i=6; i<=NF; ++i) s = s "," $i; if (s in a) { NF = 5; $5=a[s]; print } }' FS=: <(tr -d '\r' < 2.txt) FS="||o||" OFS="||o||" <(tr -d '\r' < 1.txt) > result.txt

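But I get an empty file as the result. Any help would be highly appreciated.

Solution

If your actual input files look like the sample shown, the following awk command may help.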

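awk -v s1="||o||" '
# 1.txt is split on a single "|", so the real values land in fields 1, 5, 9, 13, 17 and 21.
FNR==NR{
  a[$9]=$1 s1 $5;            # email and number, keyed by the hash in $9
  b[$9]=$13 s1 $17 s1 $21;   # the three name fields
  next
}
# 2.txt is split on ":", so $1 is the hash.
($1 in a){
  key=$1;
  sub(/^[^:]*:/,"");         # keep everything after the first colon, even if it contains ":"
  print a[key] s1 $0 s1 b[key]
}
' FS="|" 1.txt FS=":" 2.txt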
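As a variation on the command above, here is a minimal sketch that splits 1.txt on the literal string "||o||" instead of on single "|" characters, so the hash is simply $3. It assumes the layout shown in the question; the function name merge_by_hash and the array name repl are made up for illustration. Writing the separator as the bracket expression [|][|]o[|][|] keeps awk from reading the "|" characters as regex alternation, and 2.txt is cut at its first colon only, so values such as 134:@a1 survive intact. Trailing carriage returns are stripped in case the files came from Windows.

```shell
# Hypothetical helper; usage: merge_by_hash 1.txt 2.txt > result.txt
merge_by_hash() {
  awk '
    # First file read (2.txt): split each line at the FIRST colon only,
    # so replacement values that contain ":" stay intact.
    FNR == NR {
      sub(/\r$/, "")
      p = index($0, ":")
      if (p > 0) repl[substr($0, 1, p - 1)] = substr($0, p + 1)
      next
    }
    # Second file read (1.txt): fields are split on the literal "||o||",
    # so the hash is field 3. Assigning to $3 rebuilds the line with OFS.
    {
      sub(/\r$/, "")
      if ($3 in repl) { $3 = repl[$3]; print }
    }
  ' "$2" FS='[|][|]o[|][|]' OFS='||o||' "$1"
}
```

Matching lines come out in the order they appear in 1.txt. Note that the FNR == NR test assumes 2.txt is not empty.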

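EDIT: Since the OP changed the requirement slightly, here is code for the new request. Besides the merged output, it creates two more files: one for the IDs present in 1.txt but not in 2.txt, and the other for the reverse.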

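awk -v s1="||o||" '
FNR==NR{                       # first file, 1.txt, split on "|"
  a[$9]=$1 s1 $5;              # email and number, keyed by the hash
  b[$9]=$13 s1 $17 s1 $21;     # the three name fields
  c[$9]=$0;                    # whole line, kept for the unmatched report
  next
}
($1 in a){                     # 2.txt line whose hash also occurs in 1.txt
  val=$1;
  $1="";                       # rebuilds $0 with OFS ":" and leaves a leading colon
  sub(/:/,"");                 # drop that leading colon
  print a[val] s1 $0 s1 b[val];
  d[val]=$0;                   # remember that this hash was matched
  next
}
{
  print > "NOT_present_in_2.txt"   # 2.txt lines whose hash is not in 1.txt
}
END{
for(i in d){
  delete c[i]                  # forget the 1.txt lines that were matched
};
for(j in c){
  print j,c[j] > "NOT_present_in_1.txt"   # hash:line for 1.txt lines with no match in 2.txt
}}
' FS="|" 1.txt FS=":" OFS=":" 2.txt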
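The same three-way split can also be sketched with the literal "||o||" separator, again assuming the layout shown in the question. The function name split_three_ways and the output names result.txt, only_in_1.txt and only_in_2.txt are hypothetical, chosen so that each file name says which input its lines came from. This is an illustration under those assumptions, not a drop-in replacement for the command above.

```shell
# Hypothetical helper; usage: split_three_ways 1.txt 2.txt
split_three_ways() {
  awk '
    FNR == NR {                       # first file read: 2.txt
      p = index($0, ":")
      if (p == 0) next                # skip lines without a colon
      key = substr($0, 1, p - 1)
      repl[key] = substr($0, p + 1)   # replacement value after the first colon
      line2[key] = $0                 # original 2.txt line, for the report
      next
    }
    ($3 in repl) {                    # hash found in both files
      seen[$3] = 1
      $3 = repl[$3]                   # rebuilds the line with OFS
      print > "result.txt"
      next
    }
    { print > "only_in_1.txt" }       # 1.txt lines whose hash is missing from 2.txt
    END {
      for (k in line2)                # 2.txt lines whose hash never appeared in 1.txt
        if (!(k in seen)) print line2[k] > "only_in_2.txt"
    }
  ' "$2" FS='[|][|]o[|][|]' OFS='||o||' "$1"
}
```

Each output file is created only if at least one line is written to it, and the order of lines in only_in_2.txt is not guaranteed because awk does not define the iteration order of for (k in array).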
