Lookup field values in two fixed-format files in Unix - not working

Problem description

I have two fixed-length files, input #1 and input #2. I want to match rows based on the value in positions 37-50, which hold the same value in both files.

If a matching record is found, cut the values for Company Code and Invoice Number from input file #1 (position 99 to the end of the line).

The cut string (from input #1) needs to be appended to the end of the matching record/line in input #2.

Below is the code I tried (not working) and the input files & desired output. Please provide your advice.

Code:

awk '
NR==FNR && NF>1 {
    v=substr($0,37,14);
#print substr($0,37,14)
    next
}
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
    sub(/Company Code/,"",$0);
    sub(/Invoice Number/,"",$0);
    a[v]=$0;
print $0
    next
}
(substr($0,37,14) in a) {
    print $0 a[substr($0,99)]
}' Input1.txt input2.txt input3.txt

End code

Input #1 beginning (each line starts with some white space)

         612  1111111111201402120000       2     1  111  211 Due Date                             20140101                           
         612  1111111111201402120000       2     1  111  311 Company Code                         227                                
         612  1111111111201402120000       2     1  111  411 Item Code                            12                                 
         612  1111111111201402120000       2     1  111  511 Invoice Number                       2014010                            
         612  1111111111201402120000       2     2  111  611 Company Code                         214                                
         612  1111111111201402120000       2     2  111  711 Item Code                            20                                 
         612  1111111111201402120000       2     2  111  811 Invoice Number                       3014010                            
         612  1111111111201402120000       2     3  111  911 Due Date                             20140101                           
         612  1111111111201402120000       2     3  111  111 Invoice Number                       40140101                           
         612  1111111111201402120000       2     3  111  121 user code                            15563263636                        
         612  1111111111201402120000       2     3  111  131 Amount Due                           100000                             
         612  111111111120140212000078978982123444  111  141 Due Date                             20140101                             
         612  111111111120140212000078978982123444  111  151 Invoice Number                       50140101                             
         612  111111111120140212000078978982123444  111  161 Amount Due                          008000                             

Input #1 End

Input #2 beginning

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C                         
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D                         
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D                         
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D                         
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D                         
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C                         
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D                         
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D                         
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D                         
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D                         
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C                         
         510       77432201111010000       2     3        2INv                                                                   20111101.510.77432.20009D                         
         510       77432201111010000       2     3        3INv                                                                   20111101.510.77432.20010D                         
         510       77432201111010000       2     3        4INv                                                                   20111101.510.77432.20011D                         
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C                         
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D                         
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D                         
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D                         
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D                         

Input #2 end

Desired output

         510       77432201111010000       2     1        1ChK          100111000001    121000248           123456789            20111101.510.77432.20001C   2272014010 (company & Inv # from input 1)                     
         510       77432201111010000       2     1        2INv                                                                   20111101.510.77432.20001D   2272014010                                            
         510       77432201111010000       2     1        3INv                                                                   20111101.510.77432.20002D   2272014010                                            
         510       77432201111010000       2     1        4INv                                                                   20111101.510.77432.20003D   (company & Inv # from input 1)                      
         510       77432201111010000       2     1        5INv                                                                   20111101.510.77432.20004D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        1ChK          200111000002    121000248           123456789            20111101.510.77432.20002C   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        2INv                                                                   20111101.510.77432.20005D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        3INv                                                                   20111101.510.77432.20006D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        4INv                                                                   20111101.510.77432.20007D   (company & Inv # from input 1)                      
         510       77432201111010000       2     2        5INv                                                                   20111101.510.77432.20008D   (company & Inv # from input 1)                      
         510       77432201111010000       2     3        1ChK          300111000003    121000248           123456789            20111101.510.77432.20003C   (company & Inv # from input 1)                      
         510       77432201111010000       2     6        1ChK          600111000006    121000248           123456789            20111101.510.77432.20006C   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        2INv                                                                   20111101.510.77432.20021D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        3INv                                                                   20111101.510.77432.20022D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        4INv                                                                   20111101.510.77432.20023D   <there is no matching record in input 1, this will be blank>                      
         510       77432201111010000       2     6        5INv                                                                   20111101.510.77432.20024D   <there is no matching record in input 1, this will be blank>                      

Solution

There are several issues with your awk code.

Let's go through them step-by-step:

  1. NR==FNR && NF>1 {...;next} NR==FNR && ... --> the next in the first action skips all remaining rules for every record with more than one field, so the second action is never reached for the lines you care about.

  2. NR==FNR && ( /Company Code/ OR /Invoice Number/ ) { --> OR is not an awk operator; logical OR is written || (just as you use && rather than AND).

  3. print $0 a[substr($0,99)] --> a[substr($0,99)] uses everything from the 99th position of the record in your 2nd input file as the lookup key, but the array is keyed on positions 37-50.
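
To make the first point concrete, here is a minimal sketch of how next cuts off later rules. The two-line input and the counter names seen and reached are made up for the demo and are not taken from your files.

```shell
# Demo: a rule that ends in `next` stops later rules from seeing the same record.
demo_next() {
  printf 'a b\nc d\n' | awk '
    NF > 1 { seen++; next }      # every two-field record stops here
    /a/    { reached++ }         # never runs for records handled above
    END    { print seen+0, reached+0 }
  '
}
demo_next   # prints: 2 0
```

Both records have two fields, so the first rule consumes them and the /a/ rule is never evaluated, even though the first line contains an a.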

We can fix them the following way:

  1. Get rid of the next in the first action and limit the 3rd action to records from the 2nd input file.

  2. Substitute OR with ||.

  3. Use substr($0,37,14) as the key to look up in a, and apply substr(...,99) to the result.
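
As a quick illustration of the column arithmetic in the third fix, the sketch below builds a single fixed-width record and pulls out the key and the value. The field contents (header, KEY) and the helper name show_columns are hypothetical; only the column positions 37-50 and 99 come from the question.

```shell
# Build one fixed-width record: 36 columns of header, a 14-column key
# (columns 37-50), 48 columns of label, then the value from column 99 on.
show_columns() {
  printf '%-36s%-14s%-48s%s\n' 'header' 'KEY' 'Company Code' '227' |
  awk '{ print "[" substr($0, 37, 14) "][" substr($0, 99) "]" }'
}
show_columns
```

The brackets make the padding visible: the key keeps its trailing blanks, which is fine as long as both files pad the field the same way.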

This results in the following code (removing your diagnostic print commands and the unused 3rd input file):

awk '
NR==FNR && NF>1 {
    v=substr($0,37,14);
}
NR==FNR && ( /Company Code/ || /Invoice Number/ ) {
    sub(/Company Code/,"",$0);
    sub(/Invoice Number/,"",$0);
    a[v]=$0;
    next
}
NR!=FNR && (substr($0,37,14) in a) {
    print $0 substr(a[substr($0,37,14)],99)
}' input1.txt input2.txt

Since your input was off I could not reproduce your desired output but I hope you can figure it out from here on.

Also, I shortened your code to the following version doing what I think you want it to do starting from the input given:

awk '
{key=substr($0,37,14)}
NR==FNR{
  if(/Company Code/||/Invoice Number/)array[key]=substr($0,98)
  next
}
(key in array){print $0,array[key]}
' input1.txt input2.txt

If you need adjustments/explanations, feel free to comment.
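
Both versions above keep only one value per key, because each assignment to the array overwrites the previous one. If the goal is the concatenated string shown in the desired output (Company Code followed by Invoice Number), something along these lines might be closer. This is only a sketch under assumptions: the function name join_fixed and the array name vals are made up, the values are assumed to start at column 99 as stated in the question, trailing padding on the values is dropped, and every line of the second file is printed whether or not it has a match.

```shell
# Hypothetical helper: join_fixed FILE1 FILE2
join_fixed() {
  awk '
    { key = substr($0, 37, 14) }            # columns 37-50 identify the group
    NR == FNR {                             # first file: collect the values
      if (/Company Code/ || /Invoice Number/) {
        val = substr($0, 99)                # value starts at column 99
        gsub(/[ \t]+$/, "", val)            # drop trailing padding
        vals[key] = vals[key] val           # concatenate instead of overwrite
      }
      next
    }
    {                                       # second file: append if known
      suffix = (key in vals) ? " " vals[key] : ""
      print $0 suffix
    }
  ' "$1" "$2"
}
```

It would be called as join_fixed input1.txt input2.txt. Since the records in the second file carry trailing blanks, the appended value lands after that padding; trim $0 first if the value should sit directly after the last field.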
