Perl sort question
Question
I have some huge log files I need to sort. Every entry contains a 32-bit hex number, which is the sort key I want to use. Some entries are one-liners like
bla bla bla 0x97860afa bla bla
Others are a bit more complex: they start with the same type of line as above and expand into a block of lines delimited by curly brackets, like the example below. In this case the entire block has to move to the position defined by the hex number. Block example:
bla bla bla 0x97860afc bla bla
bla bla {
    blabla
    bla bla {
        bla
    }
}
I can probably figure it out, but maybe there is a simple Perl or awk solution that will save me half a day.
Transferring comments from the OP:
Indentation can be spaces or tabs; I can extend any proposed solution to handle that. I think Brian summarizes it well: Specifically, do you want to sort "items", where an item is defined as a chunk of text that starts with a line containing a "0xNNNNNNNN" and contains everything up to (but not including) the next line that contains a "0xNNNNNNNN" (where the N's change, of course)? No lines are interspersed.
Solution
Something like this might work (not tested):
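use strict;
use warnings;

my %data;            # sort key => full text of one entry
my $lastkey = '';    # key of the entry being read; '' collects lines before the first key

while (my $line = <>) {
    if ($line =~ /\b0x(\p{AHex}{8})\b/) {
        # Begin a new entry. The 8 hex digits are fixed width, so a plain
        # string sort orders them numerically; the zero-padded line number ($.)
        # keeps duplicate hex values apart.
        # cred to [Brian Gerard][1] for uniqueness
        $lastkey = sprintf '%s %012d', lc $1, $.;
        $data{$lastkey} = $line;
    }
    else {
        # Continue the current entry (newlines are kept, so no chomp)
        $data{$lastkey} .= $line;
    }
}

print $data{$_} for sort keys %data;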
The problem is that you said "huge" log files, so holding the whole file in memory may be impractical. However, if you want to sort it, I suspect you're going to need to do that.
If storing it in memory is not an option, you can always print the data to a file instead, in a format that lets you sort it by some other means (for example, one entry per line with the key at the front).
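One possible reading of the "print it to a file and sort it by other means" idea is sketched below. It is an illustration under stated assumptions, not a tested production script: the subroutine names `flatten` and `unflatten`, the script name `flatten.pl`, and the choice of the control character `\x01` as a stand-in for newlines are all hypothetical, and the sketch assumes that `\x01` never occurs in the logs. Each entry is written as a single line prefixed by a fixed-width key, so a line-oriented tool such as the Unix `sort` utility can order the records without Perl holding them in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $SEP = "\x01";    # assumption: this byte never appears in the log files

# Read log lines from $in and write one record per entry to $out:
#   "<8 hex digits> <zero-padded line number>\t<entry, newlines replaced by $SEP>"
sub flatten {
    my ($in, $out) = @_;
    my $n     = 0;    # running line number, used to keep keys unique
    my $first = 1;    # true until the first line has been written
    while (my $line = <$in>) {
        chomp $line;
        $n++;
        if ($line =~ /\b0x(\p{AHex}{8})\b/) {
            # A key line starts a new record
            print {$out} "\n" unless $first;
            print {$out} sprintf('%s %012d', lc $1, $n), "\t", $line;
        }
        else {
            # Continuation line; lines before the first key get an empty key
            print {$out} ($first ? "\t" : $SEP), $line;
        }
        $first = 0;
    }
    print {$out} "\n" unless $first;
}

# Reverse of flatten: drop the key prefix and restore the newlines
sub unflatten {
    my ($in, $out) = @_;
    while (my $record = <$in>) {
        chomp $record;
        $record =~ s/^[^\t]*\t//;        # strip "<key>\t"
        $record =~ s/\Q$SEP\E/\n/g;      # restore line breaks inside the entry
        print {$out} $record, "\n";
    }
}

my $mode = shift(@ARGV) // '';
if    ($mode eq 'flatten')   { flatten(\*STDIN, \*STDOUT) }
elsif ($mode eq 'unflatten') { unflatten(\*STDIN, \*STDOUT) }
```

Assuming the script is saved as `flatten.pl`, it could be used as `perl flatten.pl flatten < big.log | LC_ALL=C sort | perl flatten.pl unflatten > sorted.log`. Because every key has the same width and is unique, a plain byte-wise sort of whole lines orders the records by hex value and then by original position.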