Searching through a logfile with datetimes

Question

I am reading from a logfile and want the option to limit the search to a specific date range. The lines in the log file start with a timestamp in the following format: May 27 09:33:33. I've already separated the dates from the rest of the text in each line of the logfile. I just want to be able to write a statement like this:

if(the date falls between June 10th and June 20th)

So, just as an example, I am trying to get the current time:

use DateTime;

my $dt   = DateTime->now;
my $date = $dt->md;  
my $time = $dt->hms;   

But wouldn't that put the date in mm-dd format?

Recommended answer

You should use timestamps / epochs for your comparisons. Here is an example:

#!/usr/bin/env perl

use strict;
use warnings;

use DateTime::Format::Strptime;
use DateTime;

my $year = DateTime->now->year;

my $date_parser = DateTime::Format::Strptime->new(
    pattern => '%Y %B %d', # YYYY Month DD
);

my $start_date = 'June 10';
my $end_date   = 'June 20';
my $start_epoch = $date_parser->parse_datetime("$year $start_date")
                              ->epoch();
my $end_epoch   = $date_parser->parse_datetime("$year $end_date")
                              ->add( days => 1 )
                              ->epoch(); # Add one to get next day                                                                

my $parser = DateTime::Format::Strptime->new(
    pattern => '%Y %b %d %T', # YYYY Mon DD HH:MM:SS                                        
);

print "Start Epoch : $start_epoch [ $start_date ]\n";
print "End   Epoch : $end_epoch [ $end_date ]\n";

for my $log_date ('May 27 09:33:33',
                  'Jun 05 09:33:33',
                  'Jun 10 09:33:33',
                  'Jun 20 09:33:33',
                  'Jun 30 09:33:33',) {
    my $epoch = $parser->parse_datetime("$year $log_date")->epoch();
    print "Log   Epoch : $epoch [ $log_date ]\n";
    if ( $start_epoch <= $epoch and $epoch < $end_epoch) {
        # Less than end_epoch (midnight) to match previous day                              
        print "==> Log Epoch is in range\n";
    }
}

This outputs the following:

Start Epoch : 1433894400 [ June 10 ]
End   Epoch : 1434844800 [ June 20 ]
Log   Epoch : 1432719213 [ May 27 09:33:33 ]
Log   Epoch : 1433496813 [ Jun 05 09:33:33 ]
Log   Epoch : 1433928813 [ Jun 10 09:33:33 ]
==> Log Epoch is in range
Log   Epoch : 1434792813 [ Jun 20 09:33:33 ]
==> Log Epoch is in range
Log   Epoch : 1435656813 [ Jun 30 09:33:33 ]
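
The loop above works on timestamps that have already been isolated. As a minimal sketch of the same epoch comparison applied to whole log lines, the following uses Time::Piece, which ships with Perl, in place of DateTime. The sample lines, the fixed year 2015, and the helper names line_epoch and in_range are assumptions made for illustration; timestamps are treated as UTC, as in the output above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::Piece;

my $year = 2015;    # assumption: the log lines carry no year

# Half-open range [start, end): June 10 00:00:00 up to, but not including,
# June 21 00:00:00, so all of June 20 is covered.
my $start_epoch = Time::Piece->strptime("$year Jun 10", '%Y %b %d')->epoch;
my $end_epoch   = Time::Piece->strptime("$year Jun 21", '%Y %b %d')->epoch;

# Return the epoch of the leading "Mon DD HH:MM:SS" stamp, or undef if the
# line does not start with one.
sub line_epoch {
    my ($line) = @_;
    my ($stamp) = $line =~ /^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\b/
        or return;
    return Time::Piece->strptime("$year $stamp", '%Y %b %d %H:%M:%S')->epoch;
}

# True when the line's timestamp falls inside the range.
sub in_range {
    my ($line) = @_;
    my $epoch = line_epoch($line);
    return defined $epoch && $start_epoch <= $epoch && $epoch < $end_epoch;
}

# Hypothetical sample lines; a real script would read them from a filehandle.
my @lines = (
    "May 27 09:33:33 host app: started\n",
    "Jun 10 09:33:33 host app: request served\n",
    "Jun 20 09:33:33 host app: request served\n",
    "Jun 30 09:33:33 host app: stopped\n",
    "a line without a timestamp\n",
);

print grep { in_range($_) } @lines;
```

With these sample lines, only the Jun 10 and Jun 20 entries are printed. Lines that do not start with a timestamp are skipped rather than treated as errors, which is a design choice that may or may not suit a given log.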

Calculating epoch values by hand, without a well-tested date library, is unwise: you would need to account for the number of days since the Unix epoch (January 1, 1970), leap days, leap seconds, and many other edge cases. There are many ways of getting this wrong. However, there is an alternative:

If, for some reason, you would rather not use a date module, you can search through a log file by converting the dates to a canonical form and then selecting the dates that fall into the range.

Here is the same example without any modules, using normalized (canonical) dates instead:

#!/usr/bin/env perl

use strict;
use warnings;

my %months = ( jan => 1, feb => 2,  mar => 3,  apr => 4,
               may => 5, jun => 6,  jul => 7,  aug => 8,
               sep => 9, oct => 10, nov => 11, dec => 12 );

my $year = 2015; # TODO: what year is it? Need to worry about Dec/Jan rollover

my @log_dates = (
    'May 27 09:33:33',
    'Jun 05 09:33:33',
    'Jun 10 09:33:33',
    'Jun 20 09:33:33',
    'Jun 30 09:33:33',
);

my $start_date = 'June 10';
my $end_date   = 'June 20';
my $start_canonical = canonical_date_for_mmmdd_hhmmss("$year $start_date 00:00:00");
my $end_canonical   = canonical_date_for_mmmdd_hhmmss("$year $end_date 23:59:59");

for my $log_date (@log_dates) {
    my $canonical_date = canonical_date_for_mmmdd_hhmmss("$year $log_date");
    print "Log Canonical Date : $canonical_date [ $log_date ]\n";
    if ($start_canonical <= $canonical_date and
        $canonical_date  <= $end_canonical) {
        print "===> Date in range\n";
    }
}

sub canonical_date_for_mmmdd_hhmmss {
    my ($datestr) = @_;
    my ($year, $mon, $day, $hr, $min, $sec) =
        $datestr =~ m|^(\d+)\s+(\w+)\s+(\d+)\s+(\d+):(\d+):(\d+)$|; # YYYY Month DD HH:MM:SS
    $year > 1900
        or die "Unable to handle year '$year'";
    my $month_first_three = lc( substr($mon,0,3) );
    my $month_num = $months{$month_first_three};
    defined $month_num
        or die "Unable to handle month '$mon'";
    (1 <= $day and $day <= 31)
        or die "Unable to handle day '$day'";
    (0 <= $hr and $hr <= 23)
        or die "Unable to handle hour '$hr'";
    (0 <= $min and $min <= 59)
        or die "Unable to handle minute '$min'";
    (0 <= $sec and $sec <= 59)
        or die "Unable to handle second '$sec'";
    my $fmt = "%04d%02d%02d%02d%02d%02d"; # YYYYMMDDHHMMSS
    return sprintf($fmt, $year, $month_num, $day, $hr, $min, $sec);
}

This outputs the following:

Log Canonical Date : 20150527093333 [ May 27 09:33:33 ]
Log Canonical Date : 20150605093333 [ Jun 05 09:33:33 ]
Log Canonical Date : 20150610093333 [ Jun 10 09:33:33 ]
===> Date in range
Log Canonical Date : 20150620093333 [ Jun 20 09:33:33 ]
===> Date in range
Log Canonical Date : 20150630093333 [ Jun 30 09:33:33 ]
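
The TODO about the year in the script above is left open. One common heuristic, sketched here under the assumption that no log entry is more than a year old and that the clock is trustworthy, is to assume the current year unless that would place the entry in the future, in which case the previous year is used. The helper name guess_year is hypothetical and not part of the original answer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: choose a year for a log entry that only carries a
# month and a day. All arguments are plain numbers; the last three describe
# "today".
sub guess_year {
    my ($mon, $day, $now_year, $now_mon, $now_day) = @_;

    # Compare month and day as one MMDD number. An entry that falls later in
    # the calendar than today cannot have been written this year yet, so it
    # is assumed to come from last year.
    my $entry = $mon * 100 + $day;
    my $today = $now_mon * 100 + $now_day;
    return $entry > $today ? $now_year - 1 : $now_year;
}

# localtime returns a zero-based month and a year offset from 1900.
my (undef, undef, undef, $mday, $mon0, $year1900) = localtime;
my @today = ( $year1900 + 1900, $mon0 + 1, $mday );

printf "A 'Dec 31' entry read today is assumed to be from %d\n",
    guess_year( 12, 31, @today );
```

For example, with "today" fixed at January 2, 2016, a Dec 31 entry resolves to 2015 and a Jan 01 entry to 2016.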

See also ISO 8601 (Data elements and interchange formats) for other properties of using a normalized / canonical timestamp.
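
One such property is that a fixed-width timestamp written from most significant to least significant field, such as the ISO 8601 form YYYY-MM-DDTHH:MM:SS, sorts chronologically when compared as a plain string. A small sketch with made-up values, assuming all timestamps share the same time zone:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up timestamps, deliberately out of order.
my @stamps = (
    '2015-06-20T09:33:33',
    '2015-05-27T09:33:33',
    '2015-06-30T09:33:33',
    '2015-06-10T09:33:33',
);

my ( $start, $end ) = ( '2015-06-10T00:00:00', '2015-06-20T23:59:59' );

# The default string sort is already chronological for this format.
my @sorted = sort @stamps;

# String comparison operators (ge, le) are enough for the range check.
my @in_range = grep { $_ ge $start and $_ le $end } @sorted;

print "$_\n" for @in_range;
```

This prints the June 10 and June 20 timestamps, in that order.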
