Find patterns of a file in another file and print out a corresponding field of the latter, maintaining the order

Question

I've been trying to solve this problem for a while and have checked many posts (for example "Print lines in one file matching patterns in another file" and "awk search for a field in another file") without really finding what I am looking for. I need a solution that uses shell tools like sed, grep, and awk (no Python, R, ...).

I have two files (the real ones are much bigger than these):

file1:

   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65

file2:

   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680

In this example I want to look up each value in column 2 of file1 in column 3 of file2 and print column 1 of the matching file2 line. This should be done for every line of file1, keeping the order given by file1.

A possible solution (I am aware it is not bug free) is the following unacceptably slow bash loop:

for i in `awk '{print $2}' file1` ; do grep " $i " file2 | awk '{print $1}' ; done

which prints to the screen:

502
279

Please note that a 'simple' solution like:

awk 'NR==FNR{pats[$2]; next} $3 in pats' file1 file2

is not appropriate, because it prints whole lines of file2 in the order given by file2 rather than file1 (i.e. the line for 279 comes out first and then the line for 502).

Thanks a lot for your help.

Recommended answer

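You can reverse the order in which awk processes the files and get the right output. Read file2 first to build a lookup table from column 3 to column 1, then read file1 and print the stored value for each column 2 that is found: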

awk 'NR==FNR{pats[$3]=$1; next} $2 in pats{print pats[$2]}' file2 file1
502
279
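
To try this end to end, here is a minimal sketch that recreates the sample data from the question in a scratch directory and runs the same lookup. The scratch directory, the array name `id`, and the variable `result` are illustrative choices, not part of the original answer.

```shell
# Recreate the sample data from the question in a scratch directory.
tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65
EOF

cat > "$tmpdir/file2" <<'EOF'
   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680
EOF

# First pass (file2, where NR==FNR): remember column 1, keyed by column 3.
# Second pass (file1): for each line whose column 2 is a known key,
# print the remembered value, so the output follows file1's order.
result=$(awk 'NR==FNR { id[$3] = $1; next }
              $2 in id { print id[$2] }' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Each file is read only once, which is why this is much faster than running grep once per line of file1. One caveat: if the same value appears more than once in column 3 of file2, the array keeps only the last occurrence, so only that line's column 1 is printed.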
