Parsing lines from a log file containing date-time greater than something


Problem description

I have log files several hundred MB in size. Each line begins with date-time information, like this:

[Tue Oct  4 11:55:19 2016] [hphp] [25376:7f5d57bff700:279809:000001] [] 
Fatal error: syntax error, unexpected T_ENCAPSED_AND_WHITESPACE, expecting ')' in /var/cake_1.2.0.6311-beta/app/webroot/openx/www/delivery/postGetAd.php(12479)(62110d90541a84df30dd077ee953e47c) : eval()'d code on line 1

I have a plugin (nagios check_logwarn) that prints only the lines containing certain error strings. This is the command to run it:

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161003.log "^.*Fatal error*" 

I want to filter further by date-time, i.e., keep only the lines logged after, say, 11:55:10.

I am not sure whether a regex is the right tool for this. This is what I have so far:

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161003.log "^.*Fatal error*" | grep "15:19:1*"

But this only matches log lines whose time falls in the 19th minute of the 15th hour.

Update

I am now able to compare the time part of the date-time.

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161004.log "^.*Fatal error*" | awk '$4 > "14:22:11"'

How do I compare the day part?

Update 2 - Opening a bounty

I am opening a bounty because I do not have much expertise with shell and I need a solution soon.

I am stuck at the part of comparing the dates. With the solution at https://stackoverflow.com/a/39856560/351903, I am running into this problem. If that gets fixed, I would be happy.

I am also open to an enhancement of the following command (I don't mind if the output lines come out in a jumbled order):

/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161004.log "^.*Fatal error*" | awk '$4 > "14:22:11"'

I looked for ways to compare a date-time against a timestamp, but couldn't find anything that works.

I was not able to make progress from what is given in this question. The following does not show me the timestamp value:

echo date -d '06/12/2012 07:21:22' +"%s"

I am not sure what I am missing.

Recommended answer

This uses a reference timestamp and compares each timestamp from the log file to it; if the log file's timestamp is more recent, the line gets printed:

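awk -v refdate="$(date +'%s' -d 'Mon Oct 3 10:00:00 2016')" -F "[][]" '
    {
        # Build the shell command; \047 is a single quote, $2 is the timestamp
        cmd = "date +\047%s\047 -d \"" $2 "\""
        # Run it and read the epoch seconds into val
        if ((cmd | getline val) > 0) {
            if (val > refdate)
                print
        }
        # Close the pipe so the next line starts a fresh command
        close(cmd)
    }
' infile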

Here is how it works:

  • -v refdate="$(date +'%s' -d 'Mon Oct 3 10:00:00 2016')" converts the given date (our reference date) to seconds since the epoch.
  • -F "[][]" sets the field separator to square brackets, so the timestamp we want is simply $2.
  • "date +\047%s\047 -d \"" $2 "\"" builds the shell command we'd like to execute; it becomes date +'%s' -d "$2" with the contents of $2 filled in, i.e., it converts the log file timestamp to seconds since the epoch. \047 is the octal escape for a single quote.
  • cmd | getline val runs cmd and assigns its output to val, so val now holds the timestamp from the log file in seconds since the epoch.
  • We check the success of getline with (cmd | getline val) > 0.
  • If getline was successful, if (val > refdate) print compares the log file timestamp to the reference date and, if the log file timestamp is more recent, prints the line.
  • close(cmd) closes the pipeline.
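
As a quick way to try the steps above, here is a minimal sketch that runs the same filter on a small sample file. The file name sample.log, its three lines, and the wrapper function filter_newer are made up for illustration. GNU date is assumed (for -d), and TZ=UTC is set only so the run is reproducible.

```shell
#!/bin/sh
# Assumption: pin the time zone so both conversions use the same offset.
export TZ=UTC

# Hypothetical sample data in the same format as the log in the question.
cat > sample.log <<'EOF'
[Mon Oct  3 09:59:59 2016] [hphp] [1:a:1:000001] [] Fatal error: before the reference
[Mon Oct  3 10:00:01 2016] [hphp] [1:a:2:000001] [] Fatal error: just after the reference
[Tue Oct  4 11:55:19 2016] [hphp] [1:a:3:000001] [] Fatal error: a day later
EOF

# filter_newer REFDATE FILE (hypothetical helper name)
# REFDATE: any date string GNU date -d understands; FILE: the log file.
filter_newer() {
    awk -v refdate="$(date +'%s' -d "$1")" -F '[][]' '
        {
            # Convert the bracketed timestamp ($2) to epoch seconds.
            cmd = "date +\047%s\047 -d \"" $2 "\""
            if ((cmd | getline val) > 0) {
                if (val > refdate)
                    print
            }
            close(cmd)
        }
    ' "$2"
}

# Print only the lines logged after the reference date.
filter_newer 'Mon Oct 3 10:00:00 2016' sample.log
```

With check_logwarn, the same awk program can read from the pipe instead of a file, since awk reads standard input when no file name is given.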

References

  • date -d is very flexible and understands many formats in the date string; see the date manual.
  • getline in the gawk user manual and on freeshell.org (hat tip to Ed Morton, who also pointed out how to properly use getline in his helpful comment).
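
Running date once per line may be slow on files of several hundred MB. One possible alternative, which is not part of the answer above, is to build a sortable YYYYMMDDHHMMSS key inside awk and compare it with a reference key. The sketch below assumes English month abbreviations and that every timestamp is in the same time zone. The function name filter_after, the key format, and the sample file are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data in the same format as the log in the question.
cat > sample.log <<'EOF'
[Mon Oct  3 09:59:59 2016] [hphp] [1:a:1:000001] [] Fatal error: before the reference
[Mon Oct  3 10:00:01 2016] [hphp] [1:a:2:000001] [] Fatal error: just after the reference
[Tue Oct  4 11:55:19 2016] [hphp] [1:a:3:000001] [] Fatal error: a day later
EOF

# filter_after REFKEY FILE (hypothetical helper name)
# REFKEY: reference time written as YYYYMMDDHHMMSS; FILE: the log file.
filter_after() {
    awk -v ref="$1" -F '[][]' '
        BEGIN {
            # Map month abbreviations to two-digit month numbers.
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
            for (i = 1; i <= n; i++)
                mon[m[i]] = sprintf("%02d", i)
        }
        {
            # $2 looks like: Tue Oct  4 11:55:19 2016
            # t[1]=weekday t[2]=month t[3]=day t[4]=time t[5]=year
            split($2, t, " ")
            hms = t[4]
            gsub(/:/, "", hms)          # 11:55:19 -> 115519
            key = t[5] mon[t[2]] sprintf("%02d", t[3]) hms
            # Fixed-width digit strings sort in chronological order,
            # so a string comparison is enough.
            if ((key "") > (ref ""))
                print
        }
    ' "$2"
}

# Print only the lines logged after 2016-10-03 10:00:00.
filter_after 20161003100000 sample.log
```

Lines that do not start with a bracketed timestamp produce a short key that sorts before any real date, so this sketch skips them.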
