Awk - Compare every line to find a duplicate field, and add some wording to the end of line


Problem description

I have a file like the one below, and I am trying to check one field of each line and append some wording if that field is duplicated elsewhere in the file.

\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;;

In my example, lines #3 and #4 have the same physical path. I would like a script that compares the third field (for example /FS3_150a/FILE12/BU/GB/BUSINTEG) against the rest of the same file and, if it finds an exact match, prints something like "Same Physical Path as Line #" on both lines:

\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;;Same Physical Path as Line #4
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;; Same Physical Path as Line #3

Solution

Here's one way using GNU awk. It is a little hackish, YMMV. Run like:

awk -f script.awk file.txt{,}
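The `file.txt{,}` argument relies on brace expansion, a feature of shells such as bash, which turns it into the same file name twice so that awk reads the file in two passes. A quick check, assuming bash is installed (other shells, such as plain POSIX sh, may leave the braces untouched):

```shell
# Brace expansion is performed by bash, so call bash explicitly here.
expanded=$(bash -c 'echo file.txt{,}')
echo "$expanded"    # file.txt file.txt
```

In a shell without brace expansion, writing the file name twice (`awk -f script.awk file.txt file.txt`) should have the same effect.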

Contents of script.awk:

BEGIN {
    FS = ";"
}

# First pass: record every line number on which each third field appears.
# Each entry is stored as "#N " so it can be removed unambiguously later.
FNR == NR {
    array[$3] = array[$3] "#" NR " "
    next
}

# Second pass: annotate lines whose third field also occurs on another line.
{
    copy = array[$3]
    # Drop this line's own entry; the trailing space stops "#1 " matching inside "#10 ".
    sub("#" FNR " ", "", copy)
    sub(/ $/, "", copy)
    if (copy != "") {
        printf "%s Same Physical Path as Line %s\n", $0, copy
    }
    else {
        print
    }
}

Results:

\\FILE04\BUET-PCO;\\SERVER24\DFS\SHARED\CORP\ET\PROJECT CONTROL OFFICE;/FS7_150a/FILE04/BU-D/PROJECT CONTROL OFFICE;10000bytes;9888;;;
\\FILE12\BUAG-GOLDMINE$;\\SERVER24\DFS\SHARED\CAN\AGENCY\GOLDMINE;/FS3_150a/FILE12/BU/AGENCY/GOLDMINE;90000bytes;98834;;;
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;;; Same Physical Path as Line #4
\\FILE12\BUGB-BUSINTEG$;\\SERVER24\DFS\SHARED\CAN\GB\BUSINTEG;/FS3_150a/FILE12/BU/GB/BUSINTEG;50000bytes;988822;other stuff;; Same Physical Path as Line #3
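If reading the file twice is inconvenient (for example when the data arrives on a pipe), a single-pass variant is possible at the cost of holding every line in memory. The following is only a sketch of that alternative, not the answer's own method; the sample data, the temporary file, and the array names `line`, `key`, and `seen` are made up for illustration.

```shell
# Hypothetical sample data: three semicolon-separated fields per line,
# where lines 1 and 3 share the same third field.
tmp=$(mktemp)
printf '%s\n' 'a;b;/path/one' 'c;d;/path/two' 'e;f;/path/one' > "$tmp"

result=$(awk -F';' '
    # Buffer each line and its third field, and collect the line numbers
    # per third field as "#N " entries.
    { line[NR] = $0; key[NR] = $3; seen[$3] = seen[$3] "#" NR " " }
    END {
        for (i = 1; i <= NR; i++) {
            others = seen[key[i]]
            sub("#" i " ", "", others)   # remove this line from its own list
            sub(/ $/, "", others)        # trim the trailing separator
            if (others != "") {
                print line[i] " Same Physical Path as Line " others
            } else {
                print line[i]
            }
        }
    }
' "$tmp")

echo "$result"
rm -f "$tmp"
```

With the sample data above, lines 1 and 3 are annotated with each other's line number and line 2 is printed unchanged. For very large files the two-pass approach is likely the better fit, since it keeps only the line-number lists in memory rather than the lines themselves.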
