awk to compare two files by identifier & output in a specific format
Problem description
I have two large pipe-delimited files that I need to compare.
File 1
a||d||f||a
1||2||3||4
File 2
a||d||f||a
1||1||3||4
1||2||r||f
Now I want to compare the files and print the result accordingly: any value updated in file 2 should be printed as updated_value#oldvalue, and any new line added to file 2 should be printed as well.
So the desired output is (only the updated and new data):
1||1#2||3||4
1||2||r||f
What I have tried so far only gets the separated changed values:
awk -F '[||]+' 'NR==FNR{for(i=1;i<=NF;i++)a[NR,i]=$i;next}{for(i=1;i<=NF;i++)if(a[FNR,i]!=$i)print $i"#"a[FNR,i]}' file1 file2 >output
But I want to print the whole line. How can I achieve that?
Recommended answer
I would say:
awk 'BEGIN{FS=OFS="|"}
FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next}
{for (i=1; i<=NF; i+=2)
if (a[FNR,i] && a[FNR,i]!=$i)
$i=$i"#"a[FNR,i]
}1' f1 f2
This stores file1 in a matrix a[line number, column]. Then it compares each stored value with the corresponding value in file2.
Note that I am using the field separator | instead of ||, and looping in steps of two so that only the data fields are used (the odd-numbered fields hold the data; the even-numbered ones are the empty strings between the two pipes). This is because when I ran, for example, gawk -F'||' '{print NF}' f1, I got just 1, meaning that FS wasn't interpreted the way I expected. I will be grateful if someone points out the error here!
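A possible explanation for the FS behaviour: a field separator longer than one character is treated by awk as an extended regular expression, and in a regular expression an unescaped | is the alternation operator rather than a literal pipe. A minimal sketch of one way around this is to put each pipe in a bracket expression so it is matched literally. The file name sample.txt and the variable nf below are made up for illustration.

```shell
# Hypothetical one-line sample in the question's format
printf 'a||d||f||a\n' > sample.txt

# [|][|] matches exactly two literal pipes, so the line splits into 4 fields
nf=$(awk -F'[|][|]' '{ print NF }' sample.txt)
echo "$nf"   # prints 4
```

With this separator every field holds data, so the loop can step by one instead of two.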
$ awk 'BEGIN{FS=OFS="|"} FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next} {for (i=1; i<=NF; i+=2) if (a[FNR,i] && a[FNR,i]!=$i) $i=$i"#"a[FNR,i]}1' f1 f2
a||d||f||b#a
1||1#2||3||4
1||2||r||f
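The command above prints every line of the second file, including unchanged ones. If only the updated and new lines are wanted, as in the desired output of the question, a variant along the following lines might work. This is a sketch, not the answer's original method: it assumes lines are matched by position, that rows missing from the first file count as new, and that a bracket-expression separator is acceptable. The array name old and the variables rows and changed are hypothetical, and the sample files simply recreate the question's data.

```shell
# Recreate the sample data from the question
printf 'a||d||f||a\n1||2||3||4\n' > file1
printf 'a||d||f||a\n1||1||3||4\n1||2||r||f\n' > file2

awk 'BEGIN { FS = "[|][|]"; OFS = "||" }   # split on two literal pipes
FNR == NR {                                 # first file: remember every field
    for (i = 1; i <= NF; i++) old[FNR, i] = $i
    rows = FNR                              # number of lines seen in file1
    next
}
FNR > rows { print; next }                  # line exists only in file2: new data
{
    changed = 0
    for (i = 1; i <= NF; i++)
        if (((FNR, i) in old) && old[FNR, i] != $i) {
            $i = $i "#" old[FNR, i]         # updated_value#oldvalue
            changed = 1
        }
    if (changed) print                      # identical lines are skipped
}' file1 file2 > output

cat output
```

Testing membership with (FNR, i) in old, rather than testing the stored value itself, means that old values of 0 or the empty string are still compared.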