Command line to match lines with matching first field (sed, awk, etc.)

Question

What is a fast and succinct way to match lines from a text file that share the same first field?

Sample input:

a|lorem
b|ipsum
b|dolor
c|sit
d|amet
d|consectetur
e|adipisicing
e|elit

Desired output:

b|ipsum
b|dolor
d|amet
d|consectetur
e|adipisicing
e|elit

Desired output, alternative:

b|ipsum|dolor
d|amet|consectetur
e|adipisicing|elit

I can imagine many ways to write this, but I suspect there's a smart way to do it, e.g., with sed, awk, etc. My source file is approximately 0.5 GB.

There are some related questions here, e.g., "awk | merge line on the basis of field matching", but the approach there loads too much content into memory. I need a streaming method.

Recommended answer

Here's a method where you only have to remember the previous line, so the input file must be sorted (or at least grouped) by the first field:

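awk -F \| '
    # Same key as the previous line: emit the held line and note the match.
    # The NR guard keeps an empty or "0" first key from matching the unset prev_key.
    NR > 1 && $1 == prev_key {print prev_line; matches++}
    # New key: flush the last line of the previous group if it had duplicates.
    NR == 1 || $1 != prev_key {
        if (matches) print prev_line
        matches = 0
        prev_key = $1
    }
    {prev_line = $0}
    # Use prev_line rather than $0: some older awks do not preserve $0 in END.
    END {if (matches) print prev_line}
' filename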

b|ipsum
b|dolor
d|amet
d|consectetur
e|adipisicing
e|elit
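
If the input is not already grouped by its first field, one option is to sort it first and pipe the result into the same previous-line logic. The following is only a sketch under stated assumptions: `group_dups` is a hypothetical helper name, the `-s` (stable) flag is assumed to be available (GNU and BSD `sort` provide it, but it is not required by POSIX), and `LC_ALL=C` is set so that keys are compared byte by byte. `sort` spills to temporary files for large inputs, so memory use should stay bounded, but this is no longer a single streaming pass.

```shell
# Hypothetical helper: group_dups [FILE...]
# Sorts the input by its first |-delimited field, then prints only the
# lines whose first field occurs on more than one line.
group_dups() {
    # -s keeps lines with equal keys in their original relative order
    # (assumption: a sort that supports -s, e.g. GNU or BSD sort).
    LC_ALL=C sort -t '|' -k1,1 -s "$@" |
    awk -F '|' '
        NR > 1 && $1 == prev_key {print prev_line; matches++}
        NR == 1 || $1 != prev_key {
            if (matches) print prev_line
            matches = 0
            prev_key = $1
        }
        {prev_line = $0}
        END {if (matches) print prev_line}
    '
}
```

With no arguments the helper reads standard input, so it can be called as `group_dups filename` or placed at the end of a pipeline.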

Alternative output:

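awk -F \| '
    NR > 1 && $1 == prev_key {
        # First duplicate in a group: start the output line with the key.
        if (matches == 0) printf "%s", $1
        printf "%s%s", FS, prev_value
        matches++
    }
    NR == 1 || $1 != prev_key {
        # Key changed: finish the pending merged line, if any.
        if (matches) printf "%s%s\n", FS, prev_value
        matches = 0
        prev_key = $1
    }
    {prev_value = $2}
    # Use prev_value rather than $2: some older awks do not preserve fields in END.
    END {if (matches) printf "%s%s\n", FS, prev_value}
' filename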

b|ipsum|dolor
d|amet|consectetur
e|adipisicing|elit
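
For input that is not sorted and should keep its original line order, a different technique from the one above is a two-pass scan: count the lines per key on the first pass, then print the lines whose key was seen more than once on the second. This is a sketch, not part of the answer's method. `dup_keys_two_pass` is a hypothetical helper name, and the approach assumes the input is a regular file that can be read twice (not a pipe). It holds one counter per distinct key in memory rather than any line content, so it only suits files where the number of distinct keys is manageable.

```shell
# Hypothetical helper: dup_keys_two_pass FILE
# Reads FILE twice and prints, in original order, the lines whose first
# |-delimited field appears on more than one line.
dup_keys_two_pass() {
    awk -F '|' '
        NR == FNR {count[$1]++; next}   # pass 1: count lines per key
        count[$1] > 1                   # pass 2: print lines with a repeated key
    ' "$1" "$1"
}
```

Called as `dup_keys_two_pass filename`, this produces the first output format; the merged format would still need the lines for each key to be adjacent.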
