Parsing timestamp in nginx logs

Views: 634 | Published: 2020/5/17 21:55:12 | Tags: regex bash nginx


Question

I need help as I am new to log parsing. I'm trying to extract all log lines that have a 200 status and a timestamp in hour 15, before 15:35. I can't figure out which regex to use.

Here is a sample log line:

198.104.78.160 [26/Dec/2016:15:24:12 -0500] 200 190.50.175.65:8080 200 testtest.com GET /api/bid_request?feed=1&auth=qwerty&ip=85.194.119.3&ua=Mozilla%2F5.0+%28Windows+NT+6.1%3B+Win64%3B+x64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F48.0.2564.97+Safari%2F537.36&lang=tr-TR%2Ctr%3Bq%3D0.8%2Cen-US%3Bq%3D0.6%2Cen%3Bq%3D0.4&ref=http%3A%2F%2Fserve.pop.net%2Fs HTTP/1.0 - - - 174.194.36.141 - 0.109-0.009 US /

Recommended answer

You can use awk to do that (GNU awk is required, because the script relies on the three-argument form of match() and on \s):

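awk -v status_code=200 -v ts_at_hour=15 -v ts_before_hour=15 -v ts_before_min=35 '
    # Requires GNU awk: the three-argument match() and \s are gawk extensions.
    # items[1..3] capture hour, minute and second; items[4] captures the status code.
    match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\s+\[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\]\s+([0-9]{3})/, items) {
        # Print the line only when hour, minute and status all meet the requirements.
        if (items[1] == ts_at_hour &&
            items[1] <= ts_before_hour &&
            items[2] < ts_before_min &&
            items[4] == status_code) {
            print $0
        }
    }
' data.txt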

Set a few variables to hold your requirements: status_code, ts_at_hour, ts_before_hour and ts_before_min (you can fill them from environment variables).
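As a rough sketch of the environment-variable idea, the wrapper below reads the requirements from the shell environment and hands them to awk with -v. The names LOG_STATUS, LOG_HOUR, LOG_BEFORE_MIN and filter_log are hypothetical and do not appear in the answer. To stay runnable on any POSIX awk, this variant swaps the gawk regex for plain field splitting, and it assumes the layout of the sample line (timestamp in field 2, status in field 4). It also leaves out ts_before_hour, since requiring the hour to equal ts_at_hour already covers it.

```shell
# Hypothetical environment variable names; the defaults mirror the values in the answer.
: "${LOG_STATUS:=200}" "${LOG_HOUR:=15}" "${LOG_BEFORE_MIN:=35}"

filter_log() {
    # $1 is the log file to filter (data.txt in the answer)
    awk -v status_code="$LOG_STATUS" -v ts_at_hour="$LOG_HOUR" -v ts_before_min="$LOG_BEFORE_MIN" '
        {
            # Field 2 looks like [26/Dec/2016:15:24:12, so splitting on ":" gives
            # t[2] = hour, t[3] = minute, t[4] = second (assumed log layout)
            n = split($2, t, ":")
            if (n == 4 &&
                t[2] + 0 == ts_at_hour &&
                t[3] + 0 < ts_before_min &&
                $4 + 0 == status_code) {
                print
            }
        }
    ' "$1"
}
```

It could then be called as, for example, LOG_BEFORE_MIN=20 filter_log data.txt, provided the variable is set before the defaults line runs.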

The regex passed to match focuses on 4 groups: hour, minutes and seconds, each captured by ([0-9]{2}), and the status code at the end, captured by ([0-9]{3}).

To decompose the regex, you have:

the IP address [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ followed by one or more whitespace characters \s+
the date part, which includes hour, minutes and seconds: \[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\] (notice the 3 groups between parentheses)
the status code, captured by ([0-9]{3})
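To check which values end up in the hour, minute, second and status slots for a given line, a small inspection helper can be handy. The sketch below is not the answer's regex: it assumes only POSIX awk and so finds the bracketed timestamp with index() and substr(), then splits it, rather than using gawk capture groups. The function name show_groups is hypothetical, and the code assumes the first "[" and "]" on the line delimit the timestamp, as in the sample.

```shell
show_groups() {
    # Reads log lines on stdin and prints the pieces the regex would capture
    awk '
        {
            i = index($0, "[")                        # start of the timestamp
            j = index($0, "]")                        # end of the timestamp
            if (i > 0 && j > i) {
                stamp = substr($0, i + 1, j - i - 1)  # e.g. 26/Dec/2016:15:24:12 -0500
                rest  = substr($0, j + 1)             # everything after the timestamp
                split(stamp, t, /[: ]/)               # t[2] = hour, t[3] = minute, t[4] = second
                split(rest, f, " ")                   # f[1] = status code
                printf "hour=%s minute=%s second=%s status=%s\n", t[2], t[3], t[4], f[1]
            }
        }
    '
}
```

Running show_groups < data.txt on the sample line should print hour=15 minute=24 second=12 status=200.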
