Replace rows in bash with rows from another file
Problem description
I have file 1:
sample_1 group_1
sample_2 group_1
sample_3 group_1
sample_4 group_2
sample_5 group_2
sample_6 group_2
sample_7 group_3
sample_8 group_3
sample_9 group_3
and file 2:
sample_8 group_3.1
sample_9 group_3.1
For every row of file 1 whose column 1 matches column 1 of a row in file 2, I want to replace column 2 with the value from file 2, so the result is:
sample_1 group_1
sample_2 group_1
sample_3 group_1
sample_4 group_2
sample_5 group_2
sample_6 group_2
sample_7 group_3
sample_8 group_3.1
sample_9 group_3.1
The nearest I have got is to do a left join:
join -a1 -j 1 -o 1.1,1.2,2.2 <(sort -k1 file_1) <(sort -k1 file_2)
which gives me:
sample_1 group_1
sample_2 group_1
sample_3 group_1
sample_4 group_2
sample_5 group_2
sample_6 group_2
sample_7 group_3
sample_8 group_3 group_3.1
sample_9 group_3 group_3.1
Then I thought I could drop the second column if file 1's second column were repeated in the third column for the unmatched rows, but of course that does not happen.
Recommended answer
Here is a way using awk:
awk 'FNR==NR {a[$1] = $0; next} ($1 in a) {$0 = a[$1]} 1' file2 file1
sample_1 group_1
sample_2 group_1
sample_3 group_1
sample_4 group_2
sample_5 group_2
sample_6 group_2
sample_7 group_3
sample_8 group_3.1
sample_9 group_3.1
FNR==NR {...; next}
is a standard idiom: the block runs only while the first input file (file2 here) is being read, because only then does the per-file record number FNR equal the overall record number NR. In that block each whole line is stored in the array a, keyed by its first field: a[$1] = $0
The rest of the program runs for the second input file, file1. ($1 in a)
is a condition that is true when the first field exists as a key in the array. Then {$0 = a[$1]}
replaces the current line with the line saved from file2. The 1
at the end is an always-true pattern with no action, so every line is printed.
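If file1 had further columns that must survive, overwriting the whole line would drop them. A minimal sketch of a variant that rewrites only the second field is shown below; it is not part of the original answer, the temporary directory and the array name g are illustrative assumptions, and the data is the sample from the question.

```shell
# Recreate the sample inputs from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'sample_1 group_1' 'sample_2 group_1' 'sample_3 group_1' \
  'sample_4 group_2' 'sample_5 group_2' 'sample_6 group_2' \
  'sample_7 group_3' 'sample_8 group_3' 'sample_9 group_3' > "$tmpdir/file1"
printf '%s\n' 'sample_8 group_3.1' 'sample_9 group_3.1' > "$tmpdir/file2"

# First pass (file2): remember the new group for each sample name.
# Second pass (file1): overwrite only field 2 when the sample is known,
# then print every line via the trailing 1.
result=$(awk 'FNR==NR {g[$1] = $2; next} ($1 in g) {$2 = g[$1]} 1' \
  "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Note that assigning to $2 makes awk rebuild the line with single spaces between fields, and that the FNR==NR test misbehaves if the first file is empty, so this is only a sketch for inputs like the ones shown.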
If you want to use join, you first have to get the lines of file1 that have no match in file2 (join -v1; the -a1
you currently use prints those lines together with the paired ones), then print the fields of the matching lines taken from the second file. Finally sort the combined output again. Both inputs must already be sorted on the first field, as the sample files are. Here it is with command grouping:
(
join -v1 -j1 file1 file2
join -j1 -o 2.1,2.2 file1 file2
) | sort
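The left join from the question can also be finished directly, since matched rows carry a third field and unmatched rows do not. The following is a sketch under the assumption that no field contains whitespace; the temporary directory and the .sorted file names are illustrative, and the pre-sorted temporary files stand in for the process substitutions used in the question.

```shell
# Recreate the sample inputs from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'sample_1 group_1' 'sample_2 group_1' 'sample_3 group_1' \
  'sample_4 group_2' 'sample_5 group_2' 'sample_6 group_2' \
  'sample_7 group_3' 'sample_8 group_3' 'sample_9 group_3' > "$tmpdir/file_1"
printf '%s\n' 'sample_8 group_3.1' 'sample_9 group_3.1' > "$tmpdir/file_2"

# join needs both inputs sorted on the join field.
sort -k1,1 "$tmpdir/file_1" > "$tmpdir/file_1.sorted"
sort -k1,1 "$tmpdir/file_2" > "$tmpdir/file_2.sorted"

# Left join as in the question, then keep field 1 and the last field:
# the last field is the file_2 group when there is a match,
# otherwise it is the original file_1 group.
merged=$(join -a1 -j 1 -o 1.1,1.2,2.2 \
  "$tmpdir/file_1.sorted" "$tmpdir/file_2.sorted" | awk '{print $1, $NF}')
printf '%s\n' "$merged"

rm -rf "$tmpdir"
```

This keeps the output in join order, so no final sort is needed, at the cost of one extra awk pass.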