Bash: find patterns of a file in another file and print out a corresponding field of the latter maintaining the order


Problem description

Carissimi,

I've been trying to solve this problem for a while and have checked many posts (for example "grep, awk or sed? Print lines in one file matching patterns in another file" and "awk search for a field in another file") without really finding what I am looking for. I need a solution that uses bash tools such as sed, grep, or awk (no Python, R, ...).

I have two files (the real ones are much bigger than these):

file1:

   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65

file2:

   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680

In this example, for every line of file1 I want to look up the value in column 2 of file1 in column 3 of file2 and print column 1 of the matching file2 line, keeping the order given by file1.

A possible solution (which I am aware is not bug-free) is the following unacceptably slow bash loop:

for i in `awk '{print $2}' file1` ; do grep " $i " file2 | awk '{print $1}' ; done

which prints to screen:

502
279

Please note that a 'simple' solution like:

awk 'NR==FNR{pats[$2]; next} $3 in pats' file1 file2

is not appropriate, because the printing order is given by file2 rather than by file1 (i.e. it prints 279 first and then 502).

Thanks a lot for your help.

Marco

Solution

You can reverse the order in which awk processes the files to get the right output:

awk 'NR==FNR{pats[$3]=$1; next} $2 in pats{print pats[$2]}' file2 file1
502
279
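
To see why swapping the file order matters, here is a minimal sketch that recreates the sample data from the question and runs both variants side by side. The temporary directory and the variable names `tmpdir`, `file2_order` and `file1_order` are illustrative assumptions, not part of the original post, and the "simple" variant is changed slightly to print only column 1 so the two outputs are directly comparable.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65
EOF

cat > "$tmpdir/file2" <<'EOF'
   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680
EOF

# Variant from the question: file1 is read first, so the output
# is driven by the second file (file2) and follows file2's order.
file2_order=$(awk '
    NR == FNR { pats[$2]; next }      # file1: remember the keys from column 2
    $3 in pats { print $1 }           # file2: print column 1 when column 3 is a key
' "$tmpdir/file1" "$tmpdir/file2")

# Variant from the answer: file2 is read first and stored as a lookup
# table, so the output is driven by file1 and follows file1's order.
file1_order=$(awk '
    NR == FNR { pats[$3] = $1; next } # file2: map column 3 -> column 1
    $2 in pats { print pats[$2] }     # file1: print the stored column 1 for each key
' "$tmpdir/file2" "$tmpdir/file1")

echo "order of file2:"; printf '%s\n' "$file2_order"
echo "order of file1:"; printf '%s\n' "$file1_order"

rm -rf "$tmpdir"
```

Two behaviours of this approach are worth keeping in mind: if a column 3 value occurs more than once in file2, only the last column 1 value is kept, and lines of file1 whose key does not appear in file2 are silently skipped.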
