How to improve grep efficiency in Perl when the file number is huge


Problem description

I want to grep some log information from the log files located in the following directory structure using Perl: $jobDir/jobXXXX/host.log, where XXXX is a job number from 1 to a few thousand. There are no other kinds of subdirectories under $jobDir and no files other than the logs under jobXXXX. The script is:

my @Info;   # store the log information
my $Num = 0;
@Info = qx(grep "information" -r $jobDir); # is this OK?

foreach (@Info) {
    if ($_ =~ /\((\d+)\)(.*)\((\d+)\)/) {
        Output(xxxxxxxx);
    }
    $Num = $Num + 1; # number count
}

It is found that when the job number reaches a few thousand, this script takes a very long time to output the information.

Is there any way to improve its efficiency?

Thanks!

Solution

You should search those log files one by one and scan each log file line by line, instead of reading the whole output of grep into memory (which can cost a lot of memory and slow down your program, or even your system):

# untested script

my $Num = 0;
foreach my $log (<$jobDir/job*/host.log>) {   # glob every job's log file
    open my $logfh, '<', $log or die "Cannot open $log: $!";
    while (<$logfh>) {                        # scan line by line
        if (m/information/) {
            if (m/\((\d+)\)(.*)\((\d+)\)/) {
                Output(xxx);
            }
            $Num++;
        }
    }
    close $logfh;
}
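
If you would rather keep relying on the external grep, a middle ground is to stream its output through a pipe instead of slurping it all with qx(), so only one matched line is held in memory at a time. This is a minimal, untested sketch assuming a Unix-like system (where Perl's list-form piped open works) and an external grep on the PATH; $jobDir and Output() are placeholders exactly as in the question:

# untested sketch: stream grep's output line by line instead of slurping it
my $Num = 0;
open my $grepfh, '-|', 'grep', '-r', 'information', $jobDir
    or die "Cannot run grep: $!";
while (<$grepfh>) {            # one matching line at a time
    if (m/\((\d+)\)(.*)\((\d+)\)/) {
        Output(xxx);
    }
    $Num++;
}
close $grepfh;

This keeps memory use flat, but the per-file loop above avoids spawning grep entirely and is usually the better choice when there are thousands of job directories.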
