Perl sort question

Problem description

I have some huge log files I need to sort. Every entry contains a 32-bit hex number, which is the sort key I want to use. Some entries are one-liners like

bla bla bla  0x97860afa bla bla  

Others are a bit more complex: they start with the same type of line as above and expand into a block of lines marked by curly brackets, like the example below. In this case the entire block has to move to the position defined by the hex number. Block example:

 bla bla bla  0x97860afc bla bla  
     bla bla {  
         blabla  
            bla bla {  
                bla     
            }  
        }  

I can probably figure it out, but maybe there is a simple Perl or awk solution that will save me half a day.


Transferring comments from OP:

Indentation can be spaces or tabs; I can adjust that in any proposed solution. I think Brian summarizes it well: specifically, do you want to sort "items", where an item is a chunk of text that starts with a line containing a "0xNNNNNNNN" and contains everything up to (but not including) the next line that contains a "0xNNNNNNNN" (where the N's change, of course)? No lines are interspersed.

Solution

Something like this might work (Not tested):

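use strict;
use warnings;

my @entries;    # each element: [ numeric key, first line number, entry text ]
while (my $line = <>) {
    if ($line =~ /\b0x(\p{AHex}{8})\b/) {
        # Begin a new entry; $. (the input line number) keeps entries with
        # equal keys in input order. Cred to [Brian Gerard][1] for uniqueness
        push @entries, [ hex($1), $., $line ];
    } elsif (@entries) {
        # Continue the current entry, keeping each line's newline
        $entries[-1][2] .= $line;
    } else {
        # Lines before the first key belong to no entry; pass them through
        print $line;
    }
}

# Sort numerically by key, then by original position
print $_->[2] for sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] } @entries;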
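A note on comparing the keys: a string such as "0x97860afa" is not converted to its hexadecimal value by the numeric comparison operator `<=>`, so the conversion has to be done explicitly with `hex()`. A minimal sketch, using made-up sample keys (the `@keys` list is an assumption for illustration, not data from the question):

```perl
use strict;
use warnings;

# Hypothetical sample keys in the 0xNNNNNNNN form used by the log entries.
my @keys = ('0x97860afc', '0x0000ffff', '0x97860afa');

# hex() accepts an optional 0x prefix and returns the numeric value,
# so the comparison below is numeric rather than string-based.
my @sorted = sort { hex($a) <=> hex($b) } @keys;

print "$_\n" for @sorted;
```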

The problem is that you said "huge" log files, so storing the file in memory will probably be inefficient. However, if you want to sort it, I suspect you're going to need to do that.

If storing in memory is not an option, you can always just print the data to a file instead, with a format that will allow you to sort it by some other means.
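One way that idea might look, as an untested-on-real-logs sketch: flatten each entry onto a single physical line prefixed with its key, let an external tool such as `sort` order those lines, then expand them again. The sub names `flatten` and `unflatten`, the script name in the usage note, and the choice of the ASCII unit separator (`\x1F`) as a stand-in for newlines are all assumptions made for this illustration; the separator must not occur in the log itself.

```perl
use strict;
use warnings;

# Stand-in for newlines inside a flattened entry.
# Assumption: this byte never occurs in the log files.
my $SEP = "\x1F";

# Read log lines from $in and write one physical line per entry to $out:
# "<8 lowercase hex digits> <zero-padded sequence number> <entry text>"
sub flatten {
    my ($in, $out) = @_;
    my ($seq, $current) = (0, undef);
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /\b0x(\p{AHex}{8})\b/) {
            # A key line starts a new entry; emit the previous one first
            print {$out} "$current\n" if defined $current;
            $current = sprintf '%s %010d %s', lc $1, ++$seq, $line;
        } elsif (defined $current) {
            $current .= $SEP . $line;
        }
        # Lines before the first key are dropped in this sketch
    }
    print {$out} "$current\n" if defined $current;
}

# Reverse of flatten(): drop the two sort fields and restore the newlines.
sub unflatten {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;
        my (undef, undef, $text) = split / /, $line, 3;
        $text =~ s/\Q$SEP\E/\n/g;
        print {$out} "$text\n";
    }
}

# Pick a mode from the first command-line argument; do nothing without one.
my $mode = shift(@ARGV) // '';
flatten(\*STDIN, \*STDOUT)   if $mode eq 'flatten';
unflatten(\*STDIN, \*STDOUT) if $mode eq 'unflatten';
```

Saved under a hypothetical name such as entries.pl, it could be used as `perl entries.pl flatten < big.log | LC_ALL=C sort | perl entries.pl unflatten > sorted.log`. Because the key is written as eight lowercase hex digits, plain byte-wise sorting gives the same order as numeric sorting, and the sequence number keeps entries with equal keys in their original order.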
