Parsing lines from a log file containing date-time greater than something
Problem description
I have log files several hundred MB in size. Each line begins with date-time information, like this:

    [Tue Oct 4 11:55:19 2016] [hphp] [25376:7f5d57bff700:279809:000001] [] \nFatal error: syntax error, unexpected T_ENCAPSED_AND_WHITESPACE, expecting ')' in /var/cake_1.2.0.6311-beta/app/webroot/openx/www/delivery/postGetAd.php(12479)(62110d90541a84df30dd077ee953e47c) : eval()'d code on line 1

I have a plugin (nagios check_logwarn) that prints only the lines containing certain error strings. This is the command to run it:

    /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161003.log "^.*Fatal error*"

I want to filter further based on the date-time, i.e., keep only the lines that come after, say, 11:55:10.

I am not sure whether to use a regex for this. This is what I have so far:

    /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161003.log "^.*Fatal error*" | grep "15\:19\:1*"

But this only matches log lines whose time is in the 19th minute of the 15th hour.

Update

I am now able to compare the time part of the date-time:

    /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161004.log "^.*Fatal error*" | awk '$4 > "14:22:11"'

How do I compare the day part?

Update 2 - opening bounty

I am having to open a bounty because I do not have much expertise with shell and I need a solution soon.

I am stuck at the part of comparing the dates. With the solution https://stackoverflow.com/a/39856560/351903, I am facing this problem. If that is fixed, I would be happy.

I am also open to some enhancement to this (I don't mind if the output has some jumbled up order of logs):

    /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn -p /mnt/log/hiphop/error_20161004.log "^.*Fatal error*" | awk '$4 > "14:22:11"'

I looked for a way to compare a date-time with a timestamp, but couldn't find anything that works. I am not able to proceed from what is given in this question. I cannot see the timestamp value using this:

    echo date -d '06/12/2012 07:21:22' +"%s"

Not sure what I am missing.
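On the last point of the question: `echo` only prints its arguments, so the `date` command in that line is never executed. The following is a minimal sketch of how the timestamp could be obtained instead, assuming GNU `date` (the `-d` option is a GNU extension). The helper name `to_epoch` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert a date string to seconds since the epoch.
# Assumes GNU date; -d is not available in every date implementation.
to_epoch() {
    date -d "$1" +%s
}

# echo just prints the words; date is never run here:
echo date -d '06/12/2012 07:21:22' +"%s"

# Command substitution runs date and captures its output:
ts=$(to_epoch '06/12/2012 07:21:22')
echo "$ts"
```

The first `echo` prints the command text itself, while the second prints a number of seconds. Two such numbers can then be compared with `-gt` or `-lt`, and that comparison covers the day part as well as the time part.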
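Answer

This uses a reference timestamp and compares the timestamp from the log file to it; if the log file's timestamp is more recent, the line gets printed:

    awk -v refdate="$(date +'%s' -d 'Mon Oct 3 10:00:00 2016')" -F "[][]" '
    {
        # Build the date command for the timestamp in this line
        cmd = "date +\047%s\047 -d \"" $2 "\""
        # Run it and read its output into val
        if ((cmd | getline val) > 0) {
            if (val > refdate)
                print
        }
        close(cmd)
    }
    ' infile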
Here is how it works:

- `-v refdate="$(date +'%s' -d 'Mon Oct 3 10:00:00 2016')"` converts the given date (our reference date) to seconds since the epoch.
- `-F "[][]"` sets the field separator to square brackets, so the timestamp we want is simply `$2`.
- `"date +\047%s\047 -d \"" $2 "\""` is the shell command we'd like to execute; it becomes `date +'%s' -d "$2"` with the contents of field `$2` substituted, i.e., it converts the log file timestamp to seconds since the epoch. `\047` is a single quote.
- `command | getline val` evaluates `command` and assigns the result to `val`, so `val` now holds the timestamp from the log file in seconds since the epoch.
- Whether `getline` succeeded is checked with `(cmd | getline val) > 0`.
- If `getline` was successful, `if (val > refdate) print` compares the log file timestamp to the reference date and, if the log file timestamp is more recent, prints the line.
- `close(cmd)` closes the pipeline.
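The steps above can be tried on a small sample before running them against a real log. This is a sketch only, assuming GNU `date` and an awk that supports `cmd | getline`. The wrapper name `newer_than` and the two sample lines are invented for illustration and do not come from the original log.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk filter described above.
# Usage: newer_than 'REFERENCE DATE' < logfile
# Assumes GNU date (-d) is available.
newer_than() {
    awk -v refdate="$(date +%s -d "$1")" -F '[][]' '
    {
        # Build: date +%s -d "<timestamp from the first [...] field>"
        cmd = "date +%s -d \"" $2 "\""
        # Print the line only if date succeeded and the line is newer
        if ((cmd | getline val) > 0 && val + 0 > refdate + 0)
            print
        close(cmd)
    }'
}

# Made-up sample lines in the same layout as the log excerpt.
sample='[Mon Oct 3 09:59:59 2016] [hphp] [] Fatal error: too early
[Tue Oct 4 11:55:19 2016] [hphp] [] Fatal error: late enough'

printf '%s\n' "$sample" | newer_than 'Mon Oct 3 10:00:00 2016'
```

With GNU `date` installed, only the Oct 4 line should pass the filter. In the real setup, the output of `check_logwarn` would be piped into the function instead of the sample text.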
References

- `date -d` is very flexible and understands a lot of formats in the date string, see the date manual.
- `getline` in the gawk user manual and on freeshell.org (hat tip Ed Morton, who also pointed out how to properly use `getline` in his helpful comment)
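Because the log files are several hundred MB, starting one `date` process per line may be slow. As an alternative sketch (a different technique, not the method from the answer above), the timestamp can be turned into a comparable number inside awk itself. This assumes every relevant line starts with a timestamp of the exact form `[Tue Oct 4 11:55:19 2016]` with English month abbreviations. The function name `after_key` and the `YYYYMMDDhhmmss` reference format are invented for this example.

```shell
#!/bin/sh
# Hypothetical filter: print lines whose timestamp is later than the reference.
# Usage: after_key YYYYMMDDhhmmss < logfile
after_key() {
    awk -v ref="$1" -F '[][]' '
    BEGIN {
        # Map month abbreviations to two-digit numbers: Jan -> 01, ...
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++)
            month[names[i]] = sprintf("%02d", i)
    }
    {
        # $2 is expected to look like: Tue Oct 4 11:55:19 2016
        if (split($2, t, " ") != 5 || !(t[2] in month))
            next    # skip lines without a timestamp in the assumed form
        split(t[4], hms, ":")
        # Build a 14-digit key such as 20161004115519
        key = t[5] month[t[2]] sprintf("%02d", t[3]) \
              sprintf("%02d%02d%02d", hms[1], hms[2], hms[3])
        if (key + 0 > ref + 0)    # numeric comparison of the two keys
            print
    }'
}

# Made-up sample lines in the same layout as the log excerpt.
sample='[Mon Oct 3 09:59:59 2016] [hphp] [] Fatal error: too early
[Tue Oct 4 11:55:19 2016] [hphp] [] Fatal error: late enough'

printf '%s\n' "$sample" | after_key 20161003100000
```

The trade-off is flexibility: `date -d` accepts many input formats, while this version only understands the one layout it was written for and silently skips anything else.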