Comparing multiple files and columns using awk
Question
I have two files, and I would like to match columns 2 and 3 of file1 against columns 2 and 3 of file2. If a match is found, I would like to output the whole line from file2 with column 1 from file1 appended at the end.
I have the following two file types (file2 has many tab-separated columns, but its columns 2 and 3 can match columns 2 and 3 of file1):
file1
name1 1 12343442
name2 2 32434242
name3 3 982793749
file2
a 1 12343442 text1 text2 text3 value0 value2
a 1 12343442 text1 text2 text3 value2 value3
a 1 12348888 text1 text2 text3 value0 value2
b 3 982793749 text1 text4 text3 value1 value11
b 2 982793749 text1 text4 text3 value1 value11
Desired output
a 1 12343442 text1 text2 text3 value0 value2 name1
a 1 12343442 text1 text2 text3 value2 value3 name1
b 3 982793749 text1 text4 text3 value1 value11 name3
I have tried using awk, like this:
awk 'BEGIN { FS = "\t" } NR==FNR { a[$1]=$2 FS $3; next} ('$2 FS $3' in a) {print $0, a[$1]}' file1 file2
But it doesn't work. Even if I try to match only the third column, it does not work. The files are pretty big (>500 MB), so I would like to read them only once. Any ideas? Thank you!
Recommended answer
This one-liner should work:
awk -F'\t' -v OFS='\t' 'NR==FNR{a[$2FS$3]=$1;next}$2FS$3 in a{print $0,a[$2FS$3]}' file1 file2
In your code:
- You had a[$1]=$2 FS $3;next, which mixes up the key and the value. Here you want $2 FS $3 to be the key and $1 to be the value.
- ('$2 FS $3' in a) is not correct either: remove the single quotes. Inside the single-quoted awk program they end the shell quoting, so the shell, not awk, ends up interpreting $2 FS $3.
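To make the two fixes above easier to check, here is a minimal sketch that rebuilds the sample data from the question and runs the same logic as the one-liner, spread over several lines with comments. The temporary directory, the array name `name`, and the `result` variable are illustrative choices, not part of the original answer; the sketch also assumes the real files are tab-separated, as the question states.

```shell
#!/bin/sh
# Hypothetical demo: recreate the sample files from the question with real tabs.
tmp=$(mktemp -d)

printf 'name1\t1\t12343442\nname2\t2\t32434242\nname3\t3\t982793749\n' > "$tmp/file1"

# printf reuses the format for every group of eight arguments (one row each).
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    a 1 12343442  text1 text2 text3 value0 value2 \
    a 1 12343442  text1 text2 text3 value2 value3 \
    a 1 12348888  text1 text2 text3 value0 value2 \
    b 3 982793749 text1 text4 text3 value1 value11 \
    b 2 982793749 text1 text4 text3 value1 value11 > "$tmp/file2"

result=$(awk -F'\t' -v OFS='\t' '
    # First file (file1): key = columns 2 and 3, value = column 1.
    NR == FNR { name[$2 FS $3] = $1; next }
    # Second file (file2): when the key exists, print the line plus the stored name.
    ($2 FS $3) in name { print $0, name[$2 FS $3] }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this should print the three lines listed under the desired output. Each file is read exactly once, and only file1 (the small lookup file) is held in memory, which suits the large file2 mentioned in the question.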