How to compare 2 files having random numbers in non-sequential order?


Problem description

There are 2 files named compare1.txt and compare2.txt, each containing random numbers in non-sequential order.

cat compare1.txt

57
11
13
3
889
014
91

cat compare2.txt

003
889
13
14
57
12
90

Goal

  1. Output a list of all numbers that are present in compare1 but not in compare2, and vice versa.

  2. If a number has leading zeros, ignore them while comparing (the numeric values must differ for two numbers to count as a mismatch). For example, 3 should match 003, 014 should match 14, 008 should match 8, and so on.

Note - A match does not have to occur on the same line. A number on the first line of compare1 should be considered matched even if it appears on a different line of compare2.

Expected output

90
91
12
11

PS (I don't necessarily need this exact order in the expected output; these 4 numbers in any order would do.)

What have I tried?

I didn't expect to get the second condition right, so I tried to satisfy only the first condition, but couldn't get correct results. I tried these commands:

grep -Fxv -f compare1.txt compare2.txt && grep -Fxv -f compare2.txt compare1.txt

cat compare1.txt compare2.txt | sort |uniq
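
As an illustration of why exact-line matching falls short here, the following Python sketch compares the sample numbers first as strings and then as integers. It is not part of the original attempts; the two lists are the sample values hard-coded for the demonstration, and the variable names are made up.

```python
# Sample values copied from compare1.txt and compare2.txt above
# (hard-coded here purely for illustration).
compare1 = ["57", "11", "13", "3", "889", "014", "91"]
compare2 = ["003", "889", "13", "14", "57", "12", "90"]

# Exact string comparison: "3" vs "003" and "014" vs "14" count as different,
# so they show up as false mismatches.
as_strings = set(compare1) ^ set(compare2)

# Numeric comparison: int() drops leading zeros before the values are compared.
as_numbers = {int(x) for x in compare1} ^ {int(x) for x in compare2}

print(sorted(as_strings))
print(sorted(as_numbers))
```

With these samples the string comparison reports eight values, including 3, 003, 014 and 14, while the numeric comparison leaves only 11, 12, 90 and 91.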

Edit - A Python solution is also fine.

Recommended answer

Could you please try the following, written and tested with the shown samples in GNU awk.

awk '
{
  $0=$0+0
}
FNR==NR{
  a[$0]
  next
}
($0 in a){
  b[$0]
  next
}
{ print }
END{
  for(j in a){
    if(!(j in b)){ print j }
  }
}
'  compare1.txt compare2.txt

Explanation: a detailed explanation of the above.

awk '                                ##Starting awk program from here.
{
  $0=$0+0                            ##Adding 0 will remove extra zeros from current line,considering that your file doesn't have float values.
}
FNR==NR{                             ##Checking condition FNR==NR which will be TRUE when 1st Input_file is being read.
  a[$0]                              ##Creating array a with index of current line here.
  next                               ##next will skip all further statements from here.
}
($0 in a){                           ##Checking condition if current line is present in a then do following.
  b[$0]                              ##Creating array b with index of current line.
  next                               ##next will skip all further statements from here.
}
{ print }                            ##Printing current line from 2nd Input_file, since it is NOT present in a.
END{                                 ##Starting END block of this code from here.
  for(j in a){                       ##Traversing through array a here.
    if(!(j in b)){ print j }         ##Checking condition if current index value is NOT present in b then print that index.
  }
}
'  compare1.txt compare2.txt         ##Mentioning Input_file names here.
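
Since the question says a Python solution is also fine, here is a minimal sketch of the same idea using sets. It is not the awk author's code: the function names read_numbers and mismatches are made up for this example, and it assumes every non-blank line holds a single integer (int() raises ValueError on anything else).

```python
import sys


def read_numbers(path):
    """Return the set of integers found in a file, one number per line."""
    numbers = set()
    with open(path) as handle:
        for line in handle:
            text = line.strip()
            if text:  # skip blank lines instead of treating them as 0
                numbers.add(int(text))  # int("003") == 3, so leading zeros vanish
    return numbers


def mismatches(path1, path2):
    """Numbers present in exactly one of the two files (symmetric difference)."""
    return read_numbers(path1) ^ read_numbers(path2)


if __name__ == "__main__":
    # Hypothetical usage: python compare.py compare1.txt compare2.txt
    if len(sys.argv) == 3:
        for number in sorted(mismatches(sys.argv[1], sys.argv[2])):
            print(number)
```

Two differences from the awk version are worth noting: this sketch prints the result in ascending order instead of input order, and it skips blank lines, whereas $0=$0+0 in awk turns a blank line into 0.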
