Awk - combine the data from 2 files and print to 3rd file if keys matched


Problem description

Scenario:

  • I am trying to write an Awk script.
  • I have two files. File1 (Tab Delimited), File2 (Strings).
  • In File1, the combination Field4+Field3+Field2 of each 01 line forms a reference key to Field1 of the strings in File2.
  • I am able to match and extract the information, but not in a good format.
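
To make the key construction concrete, here is a minimal sketch. It assumes the first 01 line of the sample File1 shown further down and awk's default whitespace field splitting; the variable names line and key are only illustrative.

```shell
# Hypothetical sample: the first 01 line of File1, tab-delimited.
line=$(printf '01\t89\t68\t5000')

# Concatenate field 4, field 3 and field 2 to form the reference key.
key=$(printf '%s\n' "$line" | awk '$1 == "01" { print $4 $3 $2 }')

echo "$key"   # prints 50006889
```

The result equals the first field of the first File2 record, which is why that record counts as matched.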

Requirement

  • I want to print the information to File3.txt if the reference key is matched. The format of File3 depends on the case:
  • KEY MATCHED: lines 01 to 07 from File1, followed by the matching string from File2 with a prefix of 77, and so on for each block.
  • KEY NOT MATCHED: if the keys are not matched, print the unmatched records from File2 only, each with a prefix of 99.

Script:

awk -F'\t' -v OFS='\t' 'FNR==NR{a[substr($0,1,8)]=$4$3$2}
     {if ($4$3$2 in a) printf ("77""\t"); else printf ("99""\t");print $0}' \
     File2.txt File1.txt > File3.txt

File1 :

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428

File2:

50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///

51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///

51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

File 3: Current Output

99  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
99
99  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
99
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///
77  01  89  68  5000
99  02  89  11
99  03  89  00
99  06  89  00
99  07  89  19  RT
77  01  87  23  5100
99  02  87  11
99  04  87  9   02
99  03  87  00
99  06  87  00
99  07  87  11  RT  0428    
99  01  83  23  4900
99  83  11
99  83  9   02
99  83  00
99  83  00
99  83  11  RT  0428

Desired Output:

01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///

01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///

99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///

Actually, for the 77 and 99 lines I need the whole string, with 77 or 99 only once, at the beginning of the matched record. Currently, if the string is long and carries on to a second line, the script puts 77 or 99 in front of the second line as well. I am looking to put 77 and 99 in front of the matched code only.

For example, the following is the output of the corrected awk code by Jonathan:

$ awk -f awk.script File2.txt File1.txt

        01  89  68  5000
        02  89  11
        03  89  00
        06  89  00
        07  89  19  RT  0428
        77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ///
        01  87  23  5100
        02  87  11
        04  87  9   02
        03  87  00
        06  87  00
        07  87  11  RT  0428
        77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //      
77          ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ////3010  /E     /C A87545457          /  //                ///11        ///
        01  83  23  4900
        02  83  11
        04  83  9   02
        03  83  00
        06  83  00
        07  83  11  RT  0428
        99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV    
    99        /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///////S N0978898IV          /  //                ///11        ////S N09
    99  78898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ////S N0978898IV          /  //                ///11        ///
        $

Solution

You correctly read File2.txt before you read File1.txt. You need to ignore blank lines in File2.txt, though.

FNR == NR { if ($0 !~ /^[[:space:]]*$/) a[substr($1, 1, 8)] = $0; next }

This uses the first 8 characters of the first field as the key, and the whole line as the value. The next statement ensures that the lines from File2.txt are not processed by the rules that follow.
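
As a rough illustration of that loading step on its own, the sketch below stores two shortened, hypothetical File2 records (the temporary directory and the keys variable are assumptions made for the example) and lists what ends up in the array.

```shell
tmp=$(mktemp -d)
# Hypothetical, shortened File2 records separated by a blank line.
printf '%s\n' '50006889 CCARD /3010' '' '51002390800666 CCARD /3000' > "$tmp/File2.txt"

# Store each non-blank line under the first 8 characters of its first field.
keys=$(awk '
    $0 !~ /^[[:space:]]*$/ { a[substr($1, 1, 8)] = $0 }
    END { for (k in a) print k " -> " a[k] }
' "$tmp/File2.txt" | sort)

printf '%s\n' "$keys"
rm -rf "$tmp"
```

Note that the 14-character first field 51002390800666 is stored under 51002390, so it can only match a File1 block whose Field4+Field3+Field2 is exactly 51002390.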

The next part is fiddly. You need to spot the lines with 01 in $1, and build a key from that. When you next get an 01 line, you need to print out the 77-prefixed line from a (and delete the entry from a).

At the end, you need to print the 77-prefixed line for the last block from a (and delete the entry from a). Then you need to process any entries left in a and give them the 99-prefix.

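# A new 01 line starts a block; first flush the match for the previous block.
$1 == "01" {
    if (code != 0)          # code is unset until the first 01 line is seen
    {
        if (code in a)
        {
            printf("77\t%s\n", a[code])
            delete a[code]
        }
    }
    code = $4$3$2           # key: field 4, field 3 and field 2 concatenated
}
{ print }                   # echo every File1 line unchanged
END {
    # Flush the match for the last block, if any.
    if (code in a)
    {
        printf("77\t%s\n", a[code])
        delete a[code]
    }
    # Whatever remains in a was never matched.
    for (code in a)
        printf("99\t%s\n", a[code])
}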

Clearly, you can use less white space than I just did, though you might need to add some semicolons too. For testing purposes, I put the code above into a file awk.script and ran:

$ awk -f awk.script File2.txt File1.txt
01  89  68  5000
02  89  11
03  89  00
06  89  00
07  89  19  RT  0428
77  50006889 CCARD /3010  /E     /C A87545457          /  //                ///11        ///
01  87  23  5100
02  87  11
04  87  9   02
03  87  00
06  87  00
07  87  11  RT  0428
77  51002387 CCARD /3000  /E     /S N054896334IV          /  //                ///11        ///
01  83  23  4900
02  83  11
04  83  9   02
03  83  00
06  83  00
07  83  11  RT  0428
99  51002390800666 CCARD /3000  /E     /S N0978898IV          /  //                ///11        ///
$

That looks rather similar to what you wanted. If you want a blank line after the previous block of output, add printf("\n") after the if blocks that print the 77-prefixed lines. You can write it to File3.txt if you like with I/O redirection. You can embed the script in single quotes and add it to the command line in place of -f awk.script. You can squish the whole script onto one humongous line if you want to, too — but please don't; it is too big to make a good one-liner, and this program's name is awk, not apl.
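
The suggestions in the last paragraph can be sketched as follows. This is a condensed variant rather than the exact script above: it drops the code != 0 test (an unset code is simply not in a), embeds the program in single quotes, adds the extra blank line after each 77-prefixed line, and redirects the output to File3.txt. The sample data is shortened and hypothetical, and the temporary directory and the out variable exist only for the example.

```shell
tmp=$(mktemp -d)
# Hypothetical, shortened sample data (File1 is tab-delimited).
printf '01\t89\t68\t5000\n02\t89\t11\n01\t83\t23\t4900\n02\t83\t11\n' > "$tmp/File1.txt"
printf '%s\n' '50006889 CCARD /3010' '' '51002390800666 CCARD /3000' > "$tmp/File2.txt"

awk '
    # File2: store each non-blank line under its 8-character key.
    FNR == NR { if ($0 !~ /^[[:space:]]*$/) a[substr($1, 1, 8)] = $0; next }

    # File1: a new 01 line closes the previous block.
    $1 == "01" {
        if (code in a) { printf("77\t%s\n\n", a[code]); delete a[code] }
        code = $4 $3 $2
    }
    { print }

    END {
        # Close the last block, then report the File2 records that never matched.
        if (code in a) { printf("77\t%s\n\n", a[code]); delete a[code] }
        for (code in a) printf("99\t%s\n", a[code])
    }
' "$tmp/File2.txt" "$tmp/File1.txt" > "$tmp/File3.txt"

out=$(cat "$tmp/File3.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

With this sample, the first block is followed by its 77-prefixed record and a blank line, the second block has no match, and the remaining File2 record is printed last with the 99 prefix.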
