Replace rows in bash with rows from another file

Problem description

I have file 1:

sample_1    group_1
sample_2    group_1
sample_3    group_1
sample_4    group_2
sample_5    group_2
sample_6    group_2
sample_7    group_3
sample_8    group_3
sample_9    group_3

and file 2:

sample_8    group_3.1
sample_9    group_3.1

I want to replace the rows of file 1 whose column 1 matches column 1 of file 2 with the corresponding rows of file 2, so the result is:

sample_1    group_1
sample_2    group_1
sample_3    group_1
sample_4    group_2
sample_5    group_2
sample_6    group_2
sample_7    group_3
sample_8    group_3.1
sample_9    group_3.1

The nearest I have got is to do a left join: join -a1 -j 1 -o 1.1,1.2,2.2 <(sort -k1 file_1) <(sort -k1 file_2)

This gives me

sample_1 group_1 
sample_2 group_1 
sample_3 group_1 
sample_4 group_2 
sample_5 group_2 
sample_6 group_2 
sample_7 group_3 
sample_8 group_3 group_3.1
sample_9 group_3 group_3.1

Then I thought I could drop the second column if file 1's second column were repeated in the third column for the unmatched rows, but of course this does not happen.

Recommended answer

Here is a way using awk:

awk 'FNR==NR {a[$1] = $0; next} ($1 in a) {$0 = a[$1]} 1' file2 file1
sample_1    group_1
sample_2    group_1
sample_3    group_1
sample_4    group_2
sample_5    group_2
sample_6    group_2
sample_7    group_3
sample_8    group_3.1
sample_9    group_3.1

FNR==NR {...; next} is a standard awk idiom for a block that runs only while the first input file is being read: FNR (the record number within the current file) equals NR (the total record number so far) only during the first file. In that block we save the whole line in an array keyed by its first field: a[$1]=$0
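To see why FNR==NR singles out the first file, here is a small sketch that prints both counters for two throwaway files. The file names and contents are made up for illustration and are not part of the question.

```shell
# Hypothetical two-line inputs, created in a temporary directory.
demo_dir=$(mktemp -d)
printf 'a1\na2\n' > "$demo_dir/first"
printf 'b1\nb2\n' > "$demo_dir/second"

# NR keeps counting across files; FNR restarts at 1 for each file,
# so the two are equal only while the first file is being read.
fnr_demo=$(awk '{ print NR, FNR, (FNR == NR ? "first-file" : "later-file") }' \
    "$demo_dir/first" "$demo_dir/second")
printf '%s\n' "$fnr_demo"

rm -rf "$demo_dir"
```

The third and fourth lines of output are tagged later-file because NR has reached 3 and 4 while FNR has restarted at 1 and 2.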

The rest runs for the second input file, file1: ($1 in a) is a condition that is true when the first field exists as a key in the array. Then {$0=a[$1]} replaces the current line with the line saved for that key. The 1 at the end is an always-true pattern with no action, so awk applies its default action and prints the line.
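The whole approach can be tried end to end with the sketch below. It recreates the sample data in a temporary directory; the tab separator and the variable names are assumptions made for the demo, not something the question specifies.

```shell
# Throwaway working directory so no real files are touched.
awk_dir=$(mktemp -d)

# Sample inputs from the question (tab-separated here, which is an assumption).
printf 'sample_%s\tgroup_1\n' 1 2 3 >  "$awk_dir/file1"
printf 'sample_%s\tgroup_2\n' 4 5 6 >> "$awk_dir/file1"
printf 'sample_%s\tgroup_3\n' 7 8 9 >> "$awk_dir/file1"
printf 'sample_%s\tgroup_3.1\n' 8 9 >  "$awk_dir/file2"

# Pass 1 (file2): remember the replacement line for each key.
# Pass 2 (file1): swap in the saved line when the key is known, then print.
awk_result=$(awk 'FNR==NR { a[$1] = $0; next } ($1 in a) { $0 = a[$1] } 1' \
    "$awk_dir/file2" "$awk_dir/file1")
printf '%s\n' "$awk_result"

rm -rf "$awk_dir"
```

One caveat worth knowing: if file2 is empty, FNR==NR also holds while file1 is read, so every line is stored and nothing is printed. Unlike join, this approach does not need either file to be sorted, and the output keeps the order of file1.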

Using join.

If you want to use join, you first need the lines of file1 that have no match in file2 (-v1, rather than the -a1 you use currently), then, for the keys that do match, the fields printed from the second file. Finally, sort the combined output again. Note that join expects both inputs to be sorted on the join field already. Here it is with command grouping:

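(
    # lines of file1 whose first field has no match in file2
    join -v1 -j1 file1 file2
    # for matching keys, print the fields of file2 instead
    join -j1 -o 2.1,2.2 file1 file2
) | sort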
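As a variation that is not part of the original answer, the left join from the question can also be finished in a single pipeline: unmatched rows come out with an empty third field, so keeping the first field and the last non-empty field gives the wanted result. The sketch below assumes space-separated sample data and uses made-up temporary file names.

```shell
# Throwaway working directory; all paths here are illustrative.
join_dir=$(mktemp -d)

# Sample inputs from the question (space-separated for brevity).
printf 'sample_%s group_1\n' 1 2 3 >  "$join_dir/file1"
printf 'sample_%s group_2\n' 4 5 6 >> "$join_dir/file1"
printf 'sample_%s group_3\n' 7 8 9 >> "$join_dir/file1"
printf 'sample_%s group_3.1\n' 8 9 >  "$join_dir/file2"

# join needs both inputs sorted on the join field with the same collation.
LC_ALL=C sort -k1,1 "$join_dir/file1" > "$join_dir/file1.sorted"
LC_ALL=C sort -k1,1 "$join_dir/file2" > "$join_dir/file2.sorted"

# Left join on field 1, then keep the key and the last field:
# the last field is file2's group when there is a match, file1's otherwise.
join_result=$(LC_ALL=C join -a 1 -o 1.1,1.2,2.2 \
    "$join_dir/file1.sorted" "$join_dir/file2.sorted" | awk '{ print $1, $NF }')
printf '%s\n' "$join_result"

rm -rf "$join_dir"
```

This relies on awk's default field splitting ignoring the trailing blank that join emits for unmatched rows, and it would not work if the second column itself could be empty.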
