Parsing log lines using awk


Question

I have to parse some information out of the lines of a large log file. A typical line looks like this:

abc.log:2012-03-03 11:12:12,457 ABC[123.RPH.-101] XYZ: Query=get_data @a=0,@b=1 Rows=10Time=100   

The log files contain many lines like the one above. From each line I need to extract: the datetime, i.e. 2012-03-03 11:12:12,457; the job details, i.e. 123.RPH.-101; the query, i.e. get_data (without its parameters); the rows, i.e. 10; and the time, i.e. 100.

So the output should look like this:

2012-03-03 11:12:12,457|123|-101|get_data|10|100  

I have tried various permutations and combinations with awk, but I am not getting it right.

Recommended answer

TXR:

@(collect :vars ())
@file:@year-@mon-@day @hh:@mm:@ss,@ms @jobname[@job1.RPH.@job2] @queryname: Query=@query @params Rows=@{rows /[0-9]+/}Time=@time
@(output)
@year-@mon-@day @hh:@mm:@ss,@ms|@job1|@job2|@query|@rows|@time
@(end)
@(end)

Run:

$ txr data.txr data.log
2012-03-03 11:12:12,457|123|-101|get_data|10|100

Here is one way to make the program assert that every line in the log file matches the pattern. First, do not allow gaps in the collection. This means that nonmatching material cannot be skipped in order to find the lines that do match:

@(collect :gap 0 :vars ())

Second, add this at the end of the script:

@(eof)

This specifies a match on the end of the file. If the @(collect) bails out early because of a nonmatching line (due to the :gap 0 constraint), it stops before the end of the input, so the @(eof) fails and the script terminates with a failed status.
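For readers who do not have TXR at hand, the same "every line must match, or fail" behaviour can be sketched in Python. This is only an illustration and not part of the original answer: the names LINE_RE, parse_strict and run are made up for this example, and the character classes in the regular expression (digits for the job numbers, \w+ for the query name, and so on) are assumptions inferred from the single sample line in the question.

```python
import re

# Hypothetical strict pattern, modelled on the one sample line in the question.
LINE_RE = re.compile(
    r"(?P<file>[^:]+):"
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<jobname>\w+)\[(?P<job1>\d+)\.RPH\.(?P<job2>-?\d+)\] "
    r"(?P<queryname>\w+): "
    r"Query=(?P<query>\w+) "
    r"(?P<params>\S+) "
    r"Rows=(?P<rows>\d+)Time=(?P<time>\d+)\s*"
)


def parse_strict(lines):
    """Format every line; raise ValueError at the first line that does not match."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        m = LINE_RE.fullmatch(line)
        if m is None:
            # Comparable in spirit to :gap 0 plus @(eof): nothing is skipped.
            raise ValueError(f"line {lineno} does not match: {line!r}")
        records.append(
            "|".join(m.group("stamp", "job1", "job2", "query", "rows", "time"))
        )
    return records


def run(path):
    """Print the records and return 0, or report the bad line and return 1."""
    try:
        with open(path) as f:
            print("\n".join(parse_strict(f)))
    except ValueError as err:
        print(err)
        return 1
    return 0
```

Calling run("data.log") prints one pipe-separated record per line and returns 0 only if the whole file matched. The return value can be passed to sys.exit so that a shell script sees the failure.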

In this type of task, field-splitting regex hacks tend to backfire, because they can blindly produce incorrect results for some subset of the input being processed. If the input contains a vast number of lines, there is no easy way to check for mistakes. It is best to use a very specific match that is likely to reject anything that does not resemble the examples on which the pattern is based.
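A small Python sketch of that failure mode, offered as an illustration rather than as anything from the original answer. The helper names naive_rows_field and strict_rows_time are hypothetical, and the second input line (with a space inside the parameter list) is an invented variation on the sample. Taking the seventh whitespace-separated field is roughly what $7 does under awk's default field splitting.

```python
import re


def naive_rows_field(line):
    """Whitespace splitting: blindly assume the 7th field holds Rows=...Time=..."""
    return line.split()[6]


# Hypothetical strict pattern for the tail of the line only.
TAIL_RE = re.compile(r".* Query=\w+ \S+ Rows=(\d+)Time=(\d+)\s*")


def strict_rows_time(line):
    """Return (rows, time) as strings, or None if the line does not fit the pattern."""
    m = TAIL_RE.fullmatch(line)
    return m.groups() if m else None


good = ("abc.log:2012-03-03 11:12:12,457 ABC[123.RPH.-101] XYZ: "
        "Query=get_data @a=0,@b=1 Rows=10Time=100")
# Invented variation: a space after the comma shifts every later field by one.
odd = ("abc.log:2012-03-03 11:12:12,457 ABC[123.RPH.-101] XYZ: "
       "Query=get_data @a=0, @b=1 Rows=10Time=100")

print(naive_rows_field(good))   # Rows=10Time=100
print(naive_rows_field(odd))    # @b=1  (wrong field, and no error is raised)
print(strict_rows_time(good))   # ('10', '100')
print(strict_rows_time(odd))    # None  (rejected instead of misread)
```

The naive version returns a plausible-looking string for the odd line, so the mistake would pass unnoticed in a large file. The strict version refuses the line, which leaves the decision about extending the pattern to a human.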
