Read lines from a file, grep in a second file, and output a file for each $line

Question

I have the following two files:

sequences.txt

158333741       Acaryochloris_marina_MBIC11017_uid58167 158333741       432     1       432     COG0001 0
158339504       Acaryochloris_marina_MBIC11017_uid58167 158339504       491     1       491     COG0002 0
379012832       Acetobacterium_woodii_DSM_1030_uid88073 379012832       430     1       430     COG0001 0
302391336       Acetohalobium_arabaticum_DSM_5501_uid51423      302391336       441     1       441     COG0003 0
311103820       Achromobacter_xylosoxidans_A8_uid59899  311103820       425     1       425     COG0004 0
332795879       Acidianus_hospitalis_W1_uid66875        332795879       369     1       369     COG0005 0
332796307       Acidianus_hospitalis_W1_uid66875        332796307       416     1       416     COG0005 0

allids.txt

COG0001
COG0002
COG0003
COG0004
COG0005

Now I want to read each line in allids.txt, find all lines in sequences.txt that match it (specifically in column 7), and write the matches for each line of allids.txt to a file named after $line.

My approach is to use a simple grep:

while read line; do
  grep "$line" sequences.txt
done <allids.txt

But where do I incorporate the command for the output? If there is a faster command, feel free to suggest it!

My expected output:

COG0001.txt

158333741       Acaryochloris_marina_MBIC11017_uid58167 158333741       432     1       432     COG0001 0
379012832       Acetobacterium_woodii_DSM_1030_uid88073 379012832       430     1       430     COG0001 0

COG0002.txt

158339504       Acaryochloris_marina_MBIC11017_uid58167 158339504       491     1       491     COG0002 0

[and so on]

Recommended answer

I suspect all you really need is:

awk '{print > ($7".txt")}' sequences.txt

That suspicion is based on your IDs file being named allids.txt (note the "all") and on there being no IDs in sequences.txt that don't exist in allids.txt. The command writes every line of sequences.txt to a file named after its seventh field plus ".txt", so allids.txt never needs to be read.
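
If sequences.txt could contain IDs that are not listed in allids.txt, the one-liner above would create files for those as well. A minimal sketch of a filtered variant follows; the function name split_by_id is hypothetical (it does not come from the question or the answer), and the sketch assumes allids.txt is non-empty, holds one ID per line, and that the ID is always in column 7 of the whitespace-separated sequences file.

```shell
# split_by_id IDS_FILE SEQ_FILE   (hypothetical helper, not from the original answer)
# Writes each row of SEQ_FILE whose 7th field is listed in IDS_FILE
# to "<id>.txt" in the current directory.
split_by_id() {
  awk '
    # First file (NR == FNR only while reading it): remember every ID.
    NR == FNR { wanted[$1]; next }
    # Second file: exact match on column 7, then write to a per-ID file.
    $7 in wanted { print > ($7 ".txt") }
  ' "$1" "$2"
}
```

It would be called as `split_by_id allids.txt sequences.txt`. Unlike `grep "$line"`, the `$7 in wanted` test compares the whole field, so an ID cannot match by accident in another column or as a substring, and sequences.txt is read once instead of once per ID. Some awk implementations limit how many output files can be open at the same time, so a very large number of distinct IDs may need explicit close() calls.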
