Lookup field values in two fixed-format files in Unix - not working
I have two fixed-length files, input #1 and input #2. I want to match rows based on the value in positions 37-50, which hold the same value in both files.
If a matching record is found, cut the values for Company Code and Invoice Number from input file #1 (position 99 until end of line).
The cut string (from input #1) needs to be appended to the end of the matching record/line from input #2.
Below is the code I tried (not working), along with the input files and the desired output. Please advise.
Code:
awk '
NR==FNR && NF>1 {
v=substr($0,37,14);
#print substr($0,37,14)
next
}
NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
sub(/Company Code/,"",$0);
sub(/Invoice Number/,"",$0);
a[v]=$0;
print $0
next
}
(substr($0,37,14) in a) {
print $0 a[substr($0,99)]
}' Input1.txt input2.txt input3.txt
End code
Input #1 beginning (each line starts with some white space)
612 1111111111201402120000 2 1 111 211 Due Date 20140101
612 1111111111201402120000 2 1 111 311 Company Code 227
612 1111111111201402120000 2 1 111 411 Item Code 12
612 1111111111201402120000 2 1 111 511 Invoice Number 2014010
612 1111111111201402120000 2 2 111 611 Company Code 214
612 1111111111201402120000 2 2 111 711 Item Code 20
612 1111111111201402120000 2 2 111 811 Invoice Number 3014010
612 1111111111201402120000 2 3 111 911 Due Date 20140101
612 1111111111201402120000 2 3 111 111 Invoice Number 40140101
612 1111111111201402120000 2 3 111 121 user code 15563263636
612 1111111111201402120000 2 3 111 131 Amount Due 100000
612 111111111120140212000078978982123444 111 141 Due Date 20140101
612 111111111120140212000078978982123444 111 151 Invoice Number 50140101
612 111111111120140212000078978982123444 111 161 Amount Due 008000
Input #1 End
Input #2 beginning
510 77432201111010000 2 1 1ChK 100111000001 121000248 123456789 20111101.510.77432.20001C
510 77432201111010000 2 1 2INv 20111101.510.77432.20001D
510 77432201111010000 2 1 3INv 20111101.510.77432.20002D
510 77432201111010000 2 1 4INv 20111101.510.77432.20003D
510 77432201111010000 2 1 5INv 20111101.510.77432.20004D
510 77432201111010000 2 2 1ChK 200111000002 121000248 123456789 20111101.510.77432.20002C
510 77432201111010000 2 2 2INv 20111101.510.77432.20005D
510 77432201111010000 2 2 3INv 20111101.510.77432.20006D
510 77432201111010000 2 2 4INv 20111101.510.77432.20007D
510 77432201111010000 2 2 5INv 20111101.510.77432.20008D
510 77432201111010000 2 3 1ChK 300111000003 121000248 123456789 20111101.510.77432.20003C
510 77432201111010000 2 3 2INv 20111101.510.77432.20009D
510 77432201111010000 2 3 3INv 20111101.510.77432.20010D
510 77432201111010000 2 3 4INv 20111101.510.77432.20011D
510 77432201111010000 2 6 1ChK 600111000006 121000248 123456789 20111101.510.77432.20006C
510 77432201111010000 2 6 2INv 20111101.510.77432.20021D
510 77432201111010000 2 6 3INv 20111101.510.77432.20022D
510 77432201111010000 2 6 4INv 20111101.510.77432.20023D
510 77432201111010000 2 6 5INv 20111101.510.77432.20024D
Input #2 end
Desired output
510 77432201111010000 2 1 1ChK 100111000001 121000248 123456789 20111101.510.77432.20001C 2272014010 (company & Inv # from input 1)
510 77432201111010000 2 1 2INv 20111101.510.77432.20001D 2272014010
510 77432201111010000 2 1 3INv 20111101.510.77432.20002D 2272014010
510 77432201111010000 2 1 4INv 20111101.510.77432.20003D (company & Inv # from input 1)
510 77432201111010000 2 1 5INv 20111101.510.77432.20004D (company & Inv # from input 1)
510 77432201111010000 2 2 1ChK 200111000002 121000248 123456789 20111101.510.77432.20002C (company & Inv # from input 1)
510 77432201111010000 2 2 2INv 20111101.510.77432.20005D (company & Inv # from input 1)
510 77432201111010000 2 2 3INv 20111101.510.77432.20006D (company & Inv # from input 1)
510 77432201111010000 2 2 4INv 20111101.510.77432.20007D (company & Inv # from input 1)
510 77432201111010000 2 2 5INv 20111101.510.77432.20008D (company & Inv # from input 1)
510 77432201111010000 2 3 1ChK 300111000003 121000248 123456789 20111101.510.77432.20003C (company & Inv # from input 1)
510 77432201111010000 2 6 1ChK 600111000006 121000248 123456789 20111101.510.77432.20006C <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 2INv 20111101.510.77432.20021D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 3INv 20111101.510.77432.20022D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 4INv 20111101.510.77432.20023D <there is no matching record in input 1, this will be blank>
510 77432201111010000 2 6 5INv 20111101.510.77432.20024D <there is no matching record in input 1, this will be blank>
There are several issues with your awk code. Let's go through them step by step:
- NR==FNR && NF>1 {...;next} NR==FNR && ...
  --> the next in the first action skips all remaining rules for any record with more than one field, which is nearly every record in the first file, so the second action is never performed for those records.
- NR==FNR && ( /Company Code/ OR /Invoice Number/ ) {
  --> OR is not a valid awk operator; the logical OR is written || (just as you use && and not AND).
- print $0 a[substr($0,99)]
  --> a[substr($0,99)] uses everything from the 99th position of the record in your 2nd input file as the lookup key, but your array is keyed on positions 37-50.
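To make the first point concrete, here is a minimal sketch of how next cuts off later rules. The three-line input, the function name next_demo, and the rule1/rule2 labels are made up for illustration and are not taken from your files.

```shell
# Hypothetical demo: a rule that ends in `next` hides later rules
# from every record it matches.
next_demo() {
  printf 'a b\nc\nd e\n' |
  awk '
  NF > 1 { print "rule1:", $0; next }   # next: skip the remaining rules for this record
         { print "rule2:", $0 }         # reached only when rule 1 did not match
  '
}
next_demo
```

Only the single-field line c reaches the second rule; the two multi-field lines are reported by rule1 alone.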
We can fix them in the following way:
- Get rid of the next in the first action and limit the 3rd action to records from the 2nd input file.
- Substitute OR with ||.
- Use substr($0,37,14) as the key to look up in a, and apply substr(...,99) to the result.
This results in the following code (removing your diagnostic print commands and the unused 3rd input file):
awk '
NR==FNR && NF>1 {
v=substr($0,37,14);
}
NR==FNR && ( /Company Code/ || /Invoice Number/ ) {
sub(/Company Code/,"",$0);
sub(/Invoice Number/,"",$0);
a[v]=$0;
next
}
NR!=FNR && (substr($0,37,14) in a) {
print $0 substr(a[substr($0,37,14)],99)
}' input1.txt input2.txt
Since the column positions in your posted input were off, I could not reproduce your desired output, but I hope you can figure it out from here.
Also, I shortened your code to the following version, which does what I think you want, starting from the input given:
awk '
{key=substr($0,37,14)}
NR==FNR{
if(/Company Code/||/Invoice Number/)array[key]=substr($0,98)
next
}
(key in array){print $0,array[key]}
' input1.txt input2.txt
If you need adjustments/explanations, feel free to comment.
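One possible adjustment, going by the desired output in the question: there the appended string 2272014010 looks like the company code (227) followed by the invoice number (2014010), and lines without a match are still printed. The sketch below tries that on a tiny made-up data set. The file names in1.txt and in2.txt, the keys, the function name lookup_demo, and the use of the last field ($NF) as the value are all assumptions, since the exact column positions are not recoverable from the post.

```shell
# Hypothetical demo: join two fixed-width files on columns 37-50 and
# append the concatenated lookup values from the first file.
lookup_demo() {
  dir=$(mktemp -d) || return 1

  # Made-up layout: 36 filler columns, a 14-character key in columns 37-50,
  # then a label and its value. printf reuses the format for each triple.
  printf '%-36s%-14s %s\n' \
    612 KEY00000000001 'Due Date 20140101' \
    612 KEY00000000001 'Company Code 227' \
    612 KEY00000000001 'Invoice Number 2014010' \
    612 KEY00000000002 'Company Code 214' > "$dir/in1.txt"
  printf '%-36s%-14s %s\n' \
    510 KEY00000000001 1ChK \
    510 KEY00000000002 2INv \
    510 KEY00000000009 3INv > "$dir/in2.txt"

  awk '
  { key = substr($0, 37, 14) }          # columns 37-50 in both files
  NR == FNR {
      # Assumption: the value to keep is the last whitespace-separated field.
      # Values for the same key are concatenated in file order.
      if (/Company Code/ || /Invoice Number/) val[key] = val[key] $NF
      next
  }
  {
      if (key in val) print $0, val[key]   # matched: append the lookup string
      else            print $0             # unmatched: print the line unchanged
  }
  ' "$dir/in1.txt" "$dir/in2.txt"

  rm -rf "$dir"
}
lookup_demo
```

With this data the first output line ends in 1ChK 2272014010, the second in 2INv 214, and the third is printed as it was. If your real values sit at a fixed position rather than in the last field, swap $NF for the appropriate substr() call.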