Awk: using a file to filter another one (out.tr)


Question

I need help with awk, using one file to filter another. I have a main file:

...
17,466971 0,095185 17,562156 id 676
17,466971 0,096694 17,563665 id 677
17,466971 0,09816 17,565131 id 678
17,466971 0,099625 17,566596 id 679
17,466971 0,101091 17,568062 id 680
17,466971 0,016175 17,483146 id 681
17,466971 0,101793 17,568764 id 682
17,466971 0,10253 17,569501 id 683
38,166772 0,08125 38,248022 id 1572
38,166772 0,082545 38,249317 id 1573
38,233772 0,005457 38,239229 id 1574
38,233772 0,082113 38,315885 id 1575
38,299771 0,081412 38,381183 id 1576
38,299771 0,006282 38,306053 id 1577
38,299771 0,083627 38,383398 id 1578
38,299771 0,085093 38,384864 id 1579
38,299771 0,008682 38,308453 id 1580
38,299771 0,085094 38,384865 id 1581
...

I want to suppress/delete some lines from it based on the last column (the id) of this other file:

...
d 17.483146 1 0 udp 181 ------- 1 19.0 2.0 681
d 38.239229 1 0 udp 571 ------- 1 19.0 2.0 1574
d 38.306053 1 0 udp 1000 ------- 1 19.0 2.0 1577
d 38.308453 1 0 udp 1000 ------- 1 19.0 2.0 1580
d 38.372207 1 0 udp 546 ------- 1 19.0 2.0 1582
d 38.441845 1 0 udp 499 ------- 1 19.0 2.0 1585
d 38.505262 1 0 udp 616 ------- 1 19.0 2.0 1586
d 38.572324 1 0 udp 695 ------- 1 19.0 2.0 1588
d 38.639246 1 0 udp 597 ------- 1 19.0 2.0 1590
d 38.639758 1 0 udp 640 ------- 1 19.0 2.0 1591 
...

For the example above, the result would be:

17,466971 0,095185 17,562156 id 676
17,466971 0,096694 17,563665 id 677
17,466971 0,09816 17,565131 id 678
17,466971 0,099625 17,566596 id 679
17,466971 0,101091 17,568062 id 680
17,466971 0,101793 17,568764 id 682
17,466971 0,10253 17,569501 id 683
38,166772 0,08125 38,248022 id 1572
38,166772 0,082545 38,249317 id 1573
38,233772 0,082113 38,315885 id 1575
38,299771 0,081412 38,381183 id 1576
38,299771 0,083627 38,383398 id 1578
38,299771 0,085093 38,384864 id 1579
38,299771 0,085094 38,384865 id 1581

The deleted lines would be:

17,466971 0,016175 17,483146 id 681
38,233772 0,005457 38,239229 id 1574
38,299771 0,006282 38,306053 id 1577
38,299771 0,008682 38,308453 id 1580

Is there an awk command that does this automatically?

Thanks in advance.

Recommended answer

Here's one way using awk:

awk 'FNR==NR { a[$NF]; next } !($NF in a)' other main

Results:

17,466971 0,095185 17,562156 id 676
17,466971 0,096694 17,563665 id 677
17,466971 0,09816 17,565131 id 678
17,466971 0,099625 17,566596 id 679
17,466971 0,101091 17,568062 id 680
17,466971 0,101793 17,568764 id 682
17,466971 0,10253 17,569501 id 683
38,166772 0,08125 38,248022 id 1572
38,166772 0,082545 38,249317 id 1573
38,233772 0,082113 38,315885 id 1575
38,299771 0,081412 38,381183 id 1576
38,299771 0,083627 38,383398 id 1578
38,299771 0,085093 38,384864 id 1579
38,299771 0,085094 38,384865 id 1581

Drop the exclamation mark to show the 'deleted' lines instead:

awk 'FNR==NR { a[$NF]; next } $NF in a' other main

Results:

17,466971 0,016175 17,483146 id 681
38,233772 0,005457 38,239229 id 1574
38,299771 0,006282 38,306053 id 1577
38,299771 0,008682 38,308453 id 1580


Alternatively, if you'd like two output files, one named 'present' holding the kept lines and the other named 'deleted' holding the removed lines, try:

awk 'FNR==NR { a[$NF]; next } { print > ($NF in a ? "deleted" : "present") }' other main
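The two-file variant can be tried on a cut-down copy of the data. The sketch below is only an illustration: it assumes a POSIX shell with mktemp available, works in a throwaway directory, and uses a handful of rows taken from the question rather than the full files.

```shell
#!/bin/sh
# Illustrative sketch: the temp directory and the trimmed sample rows are
# assumptions made for the demo, not part of the original answer.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# One drop record (id 681) and two main-file rows (ids 680 and 681).
printf '%s\n' 'd 17.483146 1 0 udp 181 ------- 1 19.0 2.0 681' > other
printf '%s\n' \
    '17,466971 0,101091 17,568062 id 680' \
    '17,466971 0,016175 17,483146 id 681' > main

# Same command as in the answer: each main-file line goes to one of two files.
awk 'FNR==NR { a[$NF]; next } { print > ($NF in a ? "deleted" : "present") }' other main

present=$(cat present)
deleted=$(cat deleted)
printf 'present: %s\ndeleted: %s\n' "$present" "$deleted"

cd "$OLDPWD" && rm -rf "$tmpdir"
```

Note that awk creates each output file only when the first line is written to it, so if no id matches, no 'deleted' file appears. The parentheses around the conditional expression after > keep the file-name expression unambiguous.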


Note 1:

FNR==NR { ... } is a commonly used construct whose condition is true only while awk reads the first file in the argument list (assuming that file is not empty). In this case, awk reads the file 'other' first. While this file is being processed, the value in the last column ($NF) is added as a key to an array (which we have called a), and next then skips the rest of the code for that line. Once the first file has been read, FNR is no longer equal to NR, so awk skips the FNR==NR { ... } block and applies the remainder of the code to the second file in the argument list, 'main'. There, !($NF in a) prints a line only if its $NF is not in the array, so lines whose id appears in 'other' are dropped.
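To make the two passes easier to follow, here is the same one-liner spread over several lines with comments. It is a sketch rather than a different method: the array is renamed seen for readability, the sample rows are a small subset of the question's data, and the temporary directory (created with mktemp, assumed to be available) exists only for the demo.

```shell
#!/bin/sh
# Illustrative sketch: file contents are trimmed from the question and the
# array name "seen" is chosen here for readability.
tmpdir=$(mktemp -d)

cat > "$tmpdir/other" <<'EOF'
d 17.483146 1 0 udp 181 ------- 1 19.0 2.0 681
d 38.239229 1 0 udp 571 ------- 1 19.0 2.0 1574
EOF

cat > "$tmpdir/main" <<'EOF'
17,466971 0,101091 17,568062 id 680
17,466971 0,016175 17,483146 id 681
38,233772 0,005457 38,239229 id 1574
38,233772 0,082113 38,315885 id 1575
EOF

kept=$(awk '
    # First file only: the per-file counter FNR still equals the global counter NR.
    FNR == NR {
        seen[$NF]   # merely referencing the element creates the key
        next        # skip the rule below for this record
    }
    # Second file: print the record unless its last field is a recorded id.
    !($NF in seen)
' "$tmpdir/other" "$tmpdir/main")

printf '%s\n' "$kept"
rm -rf "$tmpdir"
```

With these sample rows, only the lines for ids 680 and 1575 are printed, since 681 and 1574 appear in the last column of 'other'.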

Note 2:

As for which column to use, you may find this helpful:

$1         # the first column
$2         # the second column
$3         # the third column

$NF        # the last column
$(NF-1)    # the second last column
$(NF-2)    # the third last column
