Read line by line with awk and parse variables

Question

I have a script that reads log files and parses the data so it can be inserted into a MySQL table.

My script looks like this:

while read x;do
var=$(echo ${x}|cut -d+ -f1) 
var2=$(echo ${x}|cut -d_ -f3)
...
echo "$var,$var2,.." >> mysql.infile 
done<logfile

The problem is that the log files are thousands of lines long and the script takes hours.

I read that awk is better suited to this. I tried it, but I don't know the syntax for parsing the variables.

The inputs are structured firewall logs, so they are pretty large files, with lines like:

@timestamp $HOST reason="idle Timeout" source-address="x.x.x.x" source-port="19219" destination-address="x.x.x.x" destination-port="53" service-name="dns-udp" application="DNS"....

So I'm using a lot of grep calls for about 60 variables, e.g.

sourceaddress=$(echo ${x}|grep -P -o '.{0,0} source-address=\".{0,50}'|cut -d\" -f2)

If you think perl would be better, I'm open to suggestions, and maybe a hint on how to script it.

Recommended answer

To answer your question, I assume the following rules of the game:

  • each line contains various variables
  • each variable can be found by using a different delimiter.

This gives you the following awk script:

awk 'BEGIN{OFS=","}
     { FS="+"; $0=$0; var=$1;     # re-split on "+", keep field 1
       FS="_"; $0=$0; var2=$3;    # re-split on "_", keep field 3
               ...
       print var,var2,... >> "mysql.infile"
     }' logfile

It basically does the following:

  • set the output field separator to ","
  • read a line
  • set the field separator to "+", re-parse the line ($0=$0) and take the first variable
  • set the field separator to "_", re-parse the line ($0=$0) and take the second variable
  • ... continue for all variables
  • append the line to the output file.
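
The steps above can be tried on a single line before pointing the script at a real log. What follows is only a minimal sketch under stated assumptions: the sample line is made up (the full log layout is not shown in the question), parse_lines is a hypothetical helper name, and the third field (splitting on the double quote to reach the source-address value) is an illustrative extension of the same re-split idea, not part of the original script.

```shell
#!/bin/sh
# parse_lines: hypothetical helper that reads log lines on stdin and
# prints one comma-separated record per line.
parse_lines() {
    awk 'BEGIN { OFS = "," }
    {
        FS = "+";  $0 = $0; var  = $1   # text before the first "+"
        FS = "_";  $0 = $0; var2 = $3   # third "_"-separated piece
        FS = "\""; $0 = $0; src  = $4   # text inside the second pair of quotes
        print var, var2, src
    }'
}

# Made-up sample line, loosely modelled on the log excerpt in the question.
printf '%s\n' '2020-01-01T10:00:00+01:00 fw_host_RT_FLOW reason="idle Timeout" source-address="10.0.0.1" source-port="19219"' |
    parse_lines
# prints: 2020-01-01T10:00:00,RT,10.0.0.1
```

Because a single awk process handles every line, this avoids starting several subshells and external commands per line, which is most likely where the hours go in the shell loop. Field positions such as $4 depend on the order of the key="value" pairs, so they would need checking against the real logs.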
