如何使用AWK regExp在不同列中以excel格式打印多个子字符串模式 [英] How to use AWK regExp to print multiple substring pattern in a excel format in different column

查看:233
本文介绍了如何使用AWK regExp在不同列中以excel格式打印多个子字符串模式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含数百万行这样的日志文件:

I have a log file which contains millions line like this:

$ cat file.log
10.0.7.92 - - [05/Jun/2017:03:50:06 +0000] "GET /adserver/html5/inwapads/?category=[IAB]&size=320x280&ak=AY1234&output=vast&version=1.1&sleepAfter=&requester=John&adFormat=preappvideo HTTP/1.1" 200 131 "-" "Mozilla/5.0 (Linux; Android 6.0.1; SM-S120VL Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/58.0.3029.83 Mobile Safari/537.36" 0.000 1029 520 127.0.0.1
10.0.6.91 - - [05/Jun/2017:03:50:06 +0000] "GET /adserver/html5/inwapads/?category=[IAB]&output=vast&version=1.1&sleepAfter=&requester=John&size=320x280&ak=AY1234&adFormat=preappvideo HTTP/1.1" 200 131 "-" "Mozilla/5.0 (Linux; Android 6.0.1; SM-S120VL Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/58.0.3029.83 Mobile Safari/537.36" 0.000 1029 520 127.0.0.1

我想要打印输出o f每行都像这样在Excel中有不同的列:

I want print output of every line like this in excel with different columns:

inwapads    AY1234  john    320x280

如何使用 awk 或是否需要使用其他方法。

How to do that use awk or do I need to use another method.

推荐答案

如果您想要的输入看起来像文件数据: $ b

If your desired Input looks like the file data:

$ cat file.log
10.0.7.92 - - [05/Jun/2017:03:50:06 +0000] "GET /adserver/html5/inwapads/?category=[IAB]&size=320x280&ak=AY1234&output=vast&version=1.1&sleepAfter=&requester=John&adFormat=preappvideo HTTP/1.1" 200 131 "-" "Mozilla/5.0 (Linux; Android 6.0.1; SM-S120VL Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/58.0.3029.83 Mobile Safari/537.36" 0.000 1029 520 127.0.0.1
10.0.6.91 - - [05/Jun/2017:03:50:06 +0000] "GET /adserver/html5/inwapads/?category=[IAB]&output=vast&version=1.1&sleepAfter=&requester=John&size=320x280&ak=AY1234&adFormat=preappvideo HTTP/1.1" 200 131 "-" "Mozilla/5.0 (Linux; Android 6.0.1; SM-S120VL Build/MMB29M; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/58.0.3029.83 Mobile Safari/537.36" 0.000 1029 520 127.0.0.1

然后您可以简单地使用 awk 使用一些 gensub(/ regex /,substitution,n,column)来处理专栏 $ 7 awk 的一般替代工具

Then you can simply use awk working on column $7 with some gensub( /regex/, substitution, n, column), awk's general substitution tool

$ awk '{
    item=gensub( /(^.*\/)(.*\/)(.*)(\/)(\?.*$)/ , "\\3" , 1, $7 )
    ak=gensub( /(^.*ak\=)([A-Z]*[0-9]*)(\&)(.*$)/ , "\\2" , 1, $7)
    req=gensub( /(^.*requester\=)([A-Za-z]*)(\&)(.*$)/ , "\\2", 1, $7)
    s=gensub( /(^.*size\=)([0-9]*x[0-9]*)(\&.*$)/, "\\2", 1,  $7)
    print item, ak, req, s
}' file.log

输出:

inwapads AY1234 John 320x280
inwapads AY1234 John 320x280

这篇关于如何使用AWK regExp在不同列中以excel格式打印多个子字符串模式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆