Awk: combining two files by key
Question
I have combined two different files on the same key with an awk command. When a key in File1 has no match in File2, the missing columns should be filled with placeholders ("\t\t\t") instead.
I have the following awk command:
awk -F"\t" '
    {key = $1}
    NR == 1 {header = key}
    !(key in result) {result[key] = $0 ; next}
    { for (i=2; i <= NF; i++) result[key] = result[key] FS $i }
    END {
        print result[header],"\tValue2","\tValue3","\tValue4"
        delete result[header]
        PROCINFO["sorted_in"] = "@ind_str_asc" # if using GNU awk
        for (key in result) print result[key]
    }
' $1 $2 > $3
Sample input, File1:
Key Value1
A 10000
B 20000
C 30000
D 40000
File2
B 50000 20000 10000
C 20000 10000 50000
Expected result:
Key Value1 Value2 Value3 Value4
A 10000 - - -
B 20000 50000 20000 10000
C 30000 20000 10000 50000
D 40000 - - -
My awk command outputs:
Key Value1 Value2 Value3 Value4
A 10000
B 20000 50000 20000 10000
C 30000 20000 10000 50000
D 40000
I have already tried a few things, such as:
!(key in result) {result[key] = $0"\t-\t-\t-" ; next}
But it looks like this does not cover all cases. Does anyone have a better idea for how to do this? Thank you!
Recommended answer
This solution doesn't hardcode the fact that there are 3 extra fields in File2:
awk '
    BEGIN { FS = OFS = "\t" }
    # First file on the command line (file2): remember the extra fields per key
    NR == FNR {
        key = $1
        $1 = ""                   # rebuilds $0 with a leading OFS
        store[key] = $0
        num_extra_fields = NF-1
        next
    }
    # Header line of file1: append one ValueN heading per extra field
    FNR == 1 {
        printf "%s", $0
        for (i=1; i <= num_extra_fields; i++)
            printf "%sValue%d", OFS, i+(NF-1)
        print ""
        next
    }
    # Key present in file2: append the fields stored for that key
    $1 in store {
        print $0 store[$1]
        next
    }
    # Key missing from file2: pad with "-" placeholders
    {
        for (i=1; i <= num_extra_fields; i++)
            $(++NF)="-"
        print
    }
' file2 file1
The output looks a bit odd because of how Stack Overflow displays tabs:
Key Value1 Value2 Value3 Value4
A 10000 - - -
B 20000 50000 20000 10000
C 30000 20000 10000 50000
D 40000 - - -
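The script above depends on the `NR == FNR` test to tell the first file from the second. As a minimal sketch of just that idiom (the function name `label_files` and the sample file contents are made up for illustration and are not part of the answer):

```shell
# Sketch: how NR == FNR separates the first input file from the second.
# The function name and file contents are hypothetical, for illustration only.
label_files() {
    # NR counts records across all input files; FNR restarts at 1 for
    # each file, so the two are equal only while reading the first file.
    awk 'NR == FNR { print "first:  " $0; next }
                   { print "second: " $0 }' "$1" "$2"
}

demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/first.txt"
printf 'c\n'    > "$demo_dir/second.txt"
label_files "$demo_dir/first.txt" "$demo_dir/second.txt"
```

One caveat with this idiom: if the first file is empty, `NR == FNR` is also true for every line of the second file, so the two branches can no longer be told apart.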
To fix your code, you need to keep track of the keys in File2 that update the results. Change
{ for (i=2; i <= NF; i++) result[key] = result[key] FS $i }
to
{ updated[key]=1; for (i=2; i <= NF; i++) result[key] = result[key] FS $i }
and, in the END block, change
for (key in result) print result[key]
to
for (key in result) {
    if (!(key in updated)) result[key] = result[key] FS "-" FS "-" FS "-"
    print result[key]
}
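Putting both changes together, the original script might look roughly like the sketch below. It is wrapped in a shell function named `merge_by_key`, which is a hypothetical name chosen for illustration. One further change that is an assumption rather than part of the answer: the header is printed with string concatenation instead of commas, so that awk's default space output separator does not end up between the column names.

```shell
# Sketch of the question's script with the two changes applied.
# merge_by_key is a hypothetical wrapper name; $1 is File1, $2 is File2.
merge_by_key() {
    awk -F'\t' '
        { key = $1 }
        NR == 1 { header = key }
        # First time a key is seen (lines of File1): store the whole line
        !(key in result) { result[key] = $0; next }
        # Key seen again (lines of File2): mark it and append the extra fields
        { updated[key] = 1; for (i = 2; i <= NF; i++) result[key] = result[key] FS $i }
        END {
            # Assumption: concatenate the header so no spaces are inserted
            print result[header] "\tValue2\tValue3\tValue4"
            delete result[header]
            PROCINFO["sorted_in"] = "@ind_str_asc"   # has an effect in GNU awk only
            for (key in result) {
                # Keys never updated from File2 get three "-" placeholders
                if (!(key in updated)) result[key] = result[key] FS "-" FS "-" FS "-"
                print result[key]
            }
        }
    ' "$1" "$2"
}
```

It could be called as `merge_by_key File1 File2 > merged.tsv`, where `merged.tsv` is a placeholder output name. Without GNU awk the order of the data rows is unspecified, so piping the output through `sort` may be needed. Like the snippet in the answer, this still hardcodes three placeholder columns.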