awk, regroup by lines pattern

Views: 108  Published: 2016/7/28 16:53:44  Tags: perl awk


Question


I have this input file,

and I want the output to look like this,

with this information computed:

    field1 field2 nbrepeated time1 time2  time3   time4
    3      4      1          1.00  243.0  0       0
    2      3      1          93.0  243.0  0       0
    0      2      2          93.00 119.00 237.00  581.00 
    :      :      :          :     :      :       :
    :      :      :          :     :      :       :
    :      :      :          :     :      :       :




    <field1> <field2> <nbrepeated> <time1> <time2> <time3> <time4> are columns

Solution

A Perl version:

use strict;
use warnings;

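# Group the times by the (field1, field2) pair found in columns 2 and 3.
my %data;
while (my $line = <DATA>) {
    chomp($line);
    my @row = split(' ', $line);
    next unless @row == 3;          # skip blank or malformed lines
    my $key = "$row[1] $row[2]";    # the space keeps multi-digit fields apart
    push @{$data{$key}}, $row[0];
}

# Find the largest number of times recorded for any pair.
my $max = 0;
for my $key (keys %data) {
    if (scalar @{$data{$key}} > $max) {
        $max = scalar @{$data{$key}};
    }
}

# Print the header row, with one timeN column per possible time.
{
    my @times;
    push @times, "time" . $_ for (1 .. $max);
    myFormat("field1", "field2", "nbrepeated", @times);
}

for my $key (keys %data) {
    my ($f1, $f2) = split(' ', $key);
    my $nr = $#{$data{$key}};       # repeats = occurrences - 1
    my @times = @{$data{$key}};
    # Pad short rows with 0 so that every row has $max time columns.
    for (my $i = 0; $i < $max; $i++) {
        if (! defined $times[$i] ) {
            $times[$i] = 0;
        }
    }
    myFormat($f1, $f2, $nr, @times);
}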

sub myFormat {
    printf "%-8s %-8s %-12s %-8s ", shift, shift, shift, shift;
    for my $line (@_) {
        printf "%-8s ", $line;
    }
    print "\n";
}

__DATA__
1.00 3 4
93.00 2 3
105.00 0 2
119.00 0 2
122.00 1 4
202.00 1 3
207.00 1 2
210.00 1 4
236.00 0 1
237.00 0 4
237.00 0 2
240.00 1 3
243.00 2 3
243.00 3 4
243.00 0 3
275.00 0 4
275.00 2 4
353.00 0 3
361.00 1 4
411.00 0 1
412.00 1 3
425.00 0 3
426.00 0 4
455.00 1 4
464.00 0 3
520.00 0 4
560.00 1 3
561.00 1 4
581.00 0 2

This produces the output:

field1   field2   nbrepeated   time1    time2    time3    time4    time5
0        1        1            236.00   411.00   0        0        0
0        4        3            237.00   275.00   426.00   520.00   0
1        2        0            207.00   0        0        0        0
1        4        4            122.00   210.00   361.00   455.00   561.00
0        2        3            105.00   119.00   237.00   581.00   0
3        4        1            1.00     243.00   0        0        0
0        3        3            243.00   353.00   425.00   464.00   0
2        4        0            275.00   0        0        0        0
2        3        1            93.00    243.00   0        0        0
1        3        3            202.00   240.00   412.00   560.00   0

The output is unsorted. Sorting it would be no problem if you specify how you want it sorted.
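Since the question title asks about awk, the same regrouping can also be sketched in awk. This is only a rough equivalent of the Perl version, not the answer's own method. It assumes the input lines are whitespace-separated as "time field1 field2", and it assumes a sort order (field1, then field2, then time, all numeric) that the question never specified. The function name regroup is made up for illustration.

```shell
#!/bin/sh
# regroup (hypothetical name): reads "time field1 field2" lines on stdin and
# prints one row per (field1, field2) pair, padded with 0 to the widest row.
regroup() {
    # Assumed ordering: field1, then field2, then time, all compared numerically.
    LC_ALL=C sort -k2,2n -k3,3n -k1,1n | awk '
    NF == 3 {
        key = $2 " " $3
        if (!(key in count)) order[++nkeys] = key   # remember first-seen order
        n = ++count[key]                            # occurrences so far for this pair
        times[key, n] = $1                          # n-th time recorded for this pair
        if (n > max) max = n
    }
    END {
        # Header row, with one timeN column per possible time.
        printf "%-8s %-8s %-12s", "field1", "field2", "nbrepeated"
        for (i = 1; i <= max; i++) printf " %-8s", "time" i
        print ""
        for (k = 1; k <= nkeys; k++) {
            key = order[k]
            split(key, f, " ")
            # nbrepeated is the number of occurrences minus one, as in the Perl version.
            printf "%-8s %-8s %-12s", f[1], f[2], count[key] - 1
            for (i = 1; i <= max; i++)
                printf " %-8s", ((key, i) in times) ? times[key, i] : 0
            print ""
        }
    }'
}

# Small demonstration on a few lines taken from the sample data.
regroup <<'EOF'
1.00 3 4
93.00 2 3
105.00 0 2
119.00 0 2
243.00 2 3
243.00 3 4
EOF
```

Because the awk part prints pairs in the order it first sees them, the sort stage alone decides the row order. Changing the sort keys, or dropping the sort altogether, changes the ordering without touching the awk code.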
