Parse from log file in python


Problem description

I have a log file with an arbitrary number of lines and json strings. All I need to extract is one json data from the log file, BUT ONLY AFTER '_____GP D_____'. I do not want any other lines or json data from the file.

This is how my input file looks:

INFO:modules.gp.helpers.parameter_getter:_____GP D_____
{'from_time': '2017-07-12 19:57', 'to_time': '2017-07-12 20:57', 'consig_number': 'dup1', 'text': 'r155', 'mobile': None, 'email': None}
ERROR:modules.common.actionexception:ActionError: [{'other': 'your request already crossed threshold time'}]
{'from_time': '2016-07-12 16:57', 'to_time': '2016-07-12 22:57', 'consig_number': 'dup2', 'text': 'r15', 'mobile': None, 'email': None}

How do I find the json string only after '_____GP D_____'?

Recommended answer

You can read your file line by line until you encounter _____GP D_____ at the end of a line, and when you do, pick up just the next line:

found_json = None
with open("input.log", "r") as f:  # open your log file
    for line in f:  # read it line by line
        if line.rstrip().endswith("_____GP D_____"):  # if a line ends with our marker...
            found_json = next(f, "").rstrip()  # grab the next line ("" if the marker was the last line)
            break  # stop reading the file, nothing more of interest

Then you can do with your found_json whatever you want, including parsing it, printing it, etc.
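One caveat when parsing: the sample lines in the question use Python literal syntax (single-quoted strings, None) rather than strict JSON, so json.loads() would reject them; the standard library's ast.literal_eval handles that format safely. A minimal sketch, assuming found_json holds a captured line like the one in the question:

```python
import ast

# Hypothetical captured line, matching the format shown in the question's log
found_json = ("{'from_time': '2017-07-12 19:57', 'to_time': '2017-07-12 20:57', "
              "'consig_number': 'dup1', 'text': 'r155', 'mobile': None, 'email': None}")

# ast.literal_eval safely evaluates Python literals (dicts, strings, None, ...)
# without executing arbitrary code the way eval() would
data = ast.literal_eval(found_json)
print(data["consig_number"])  # dup1
```

If your log ever switches to real JSON (double quotes, null instead of None), swap ast.literal_eval for json.loads.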

UPDATE - If you want to continuously 'follow' your log file (akin to the tail -f command), you can open it in read mode and keep the file handle open while reading it line by line, with a reasonable delay added between reads (that's largely how tail -f does it, too). Then you can use the same procedure to discover when your desired line occurs and capture the next line to process, send to some other process, or do whatever you plan to do with it. Something like:

import time

capture = False  # a flag to use to signal the capture of the next line
found_lines = []  # a list to store our found lines, just as an example
with open("input.log", "r") as f:  # open the file for reading...
    while True:  # loop indefinitely
        line = f.readline()  # grab a line from the file
        if line != '':  # if there is some content on the current line...
            if capture:  # capture the current line
                found_lines.append(line.rstrip())  # store the found line
                # instead, you can do whatever you want with the captured line
                # i.e. to print it: print("Found: {}".format(line.rstrip()))
                capture = False  # reset the capture flag
            elif line.rstrip()[-14:] == "_____GP D_____":  # if it ends in '_____GP D_____'..
                capture = True  # signal that the next line should be captured
        else:  # an empty buffer encountered, most probably EOF...
            time.sleep(1)  # ... let's wait for a second before attempting to read again...

