Awk - Compare every line to find a duplicate field, and add some wording to the end of line
I have a file like the one below. I am trying to check one field of each line and append some wording to the line if that field is duplicated elsewhere in the file.
\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;;
In my example, lines #3 and #4 have the same physical path. I would like a script that compares the third field (for example /FS3_150a/FILE12/BU/GB/BUSINTEG) against the rest of the same file and, when it finds an exact match, appends something like "Same Physical Path as Line #" to both lines:
\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;;Same Physical Path as Line #4
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;; Same Physical Path as Line #3
Here's one way using GNU awk. It is a little hackish, so YMMV. In bash, file.txt{,} expands to file.txt file.txt, so awk reads the file twice. Run it like:
awk -f script.awk file.txt{,}
Contents of script.awk:
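BEGIN {
    FS = ";"
}

# First pass (FNR == NR holds only while reading the first copy of the file):
# append "#<line number>" to the entry for this line's third field.
FNR == NR {
    array[$3] = array[$3] "#" NR
    next
}

# Second pass: an entry holding more than one "#<n>" marks a duplicated field.
# /#[0-9]+#/ also matches multi-digit line numbers, which /#.#/ would miss.
{
    if ($3 in array && array[$3] ~ /#[0-9]+#/) {
        copy = array[$3]
        sub("#" FNR, "", copy)    # drop this line's own number
        printf "%s Same Physical Path as Line as %s\n", $0, copy
    }
    else {
        print
    }
}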
Results:
\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;; Same Physical Path as Line as #4
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;; Same Physical Path as Line as #3
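The script above reads the file twice. As a rough alternative sketch, assuming the whole file fits in memory, the lines can be buffered in a single pass and annotated in an END block, which also works when the input comes from a pipe. The function name mark_duplicates and the shortened sample rows are hypothetical and not part of the original answer, and the output wording differs slightly: the other line numbers are listed separated by commas.

```shell
# Hypothetical helper: annotate every line whose third ';'-separated field
# also appears on another line. Reads the named files, or stdin if none.
mark_duplicates() {
    awk -F';' '
    {
        line[NR] = $0                  # keep each line in input order
        key[NR]  = $3                  # and its third field
        seen[$3] = seen[$3] " " NR     # line numbers sharing this field
    }
    END {
        for (i = 1; i <= NR; i++) {
            n = split(seen[key[i]], nums, " ")
            others = ""
            for (j = 1; j <= n; j++)
                if (nums[j] + 0 != i)  # skip the current line itself
                    others = others (others == "" ? "" : ", ") "#" nums[j]
            if (others != "")
                print line[i] " Same Physical Path as Line " others
            else
                print line[i]
        }
    }' "$@"
}

# Shortened sample rows (made up for illustration).
printf '%s\n' 'a;b;/p/one;1' 'c;d;/p/two;2' 'e;f;/p/two;3' 'g;h;/p/two;4' |
    mark_duplicates
# a;b;/p/one;1
# c;d;/p/two;2 Same Physical Path as Line #3, #4
# e;f;/p/two;3 Same Physical Path as Line #2, #4
# g;h;/p/two;4 Same Physical Path as Line #2, #3
```

The trade-off is memory: every line is held until the end of input, whereas the two-pass script only stores one entry per distinct third field.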