Extracting fields from a file in Shell

Problem description

I have a lot of data in a file, like this:

 alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)
 alert tcp any any -> any any (msg: "test2"; nocase; sid :23476;distance:0; rev:1;created_at 2010_10_30, updated_at 2013_07_11;)
 alert tcp any any -> any any (msg: "test3"; sid:236487; file_data; content:"clsid"; nocase; distance:0; created_at 2008_08_03, updated_at 2016_05_01;

I want to extract sid, msg, created_at and updated_at from the file. The output should look like this:

test1 | 16521 | 2010_07_30 | 2016_07_01
test2 | 23476 | 2010_10_30 | 2013_07_11
test3 | 236487| 2008_08_03 | 2016_05_01

The script I am using is:

cat $file  | grep -v "^#" | grep "^alert" | sed 's/\"//g' | awk -F ';' '
{
    for(i=1;i<=NF;i++)
    {
        if (match($i,"sid:")>0)
        {
            split($i, array1, ":")
            Rule_sid=array1[2]
        }
        if(match($i,"msg:")>0)
        {
            split($i, array, "(")
            split(array[2], array2, ":")
            message=array2[2]
        }
        if(match($i,/metadata:/)>0 )
        {
            split($i, array3,/created_at/)
            create_date=array3[2]
        }
        if(match($i,/metadata:/)>0 )
        {
            split($i, array4, ", updated_at ")
            update_date=array4[2]
        }
    }
    print Rule_sid "|" message "|" create_date "|" update_date
}' >> Rule_Files/$file

Recommended answer

Using awk

Adjust -v OFS=" | " and -v extract="msg,sid,created_at,updated_at" to suit your needs. OFS is the output field separator, and the variable extract holds the comma-separated list of fields to parse. If a field is not found on a line, Null is printed in its place.

The program assumes that a field's value immediately follows the field name. For example, if the field sid is found at j=4, its value is taken from j+1, that is, j=5.
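To make the "value follows the name" assumption concrete, here is a minimal sketch that prints each awk field with its index once the punctuation has been replaced by spaces. The function name tokenize is hypothetical and used only for illustration; the character class is the one the answer's gsub call uses.

```shell
# Hypothetical helper (not part of the original answer): show how one rule
# line is split into fields once the punctuation has been turned into spaces.
tokenize() {
    awk '{
        gsub(/[:";,()]/, " ")          # same character class as in the answer
        for (j = 1; j <= NF; j++)      # print every field with its index
            printf "%d=%s\n", j, $j
    }'
}

printf '%s\n' ' alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)' | tokenize
```

For this sample line, msg ends up in field 8 with its value test1 in field 9, and sid in field 10 with 16521 in field 11. The j=4 and j=5 in the explanation above are only illustrative indices.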

Input

$ cat file
 alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)
 alert tcp any any -> any any (msg: "test2"; nocase; sid :23476;distance:0; rev:1;created_at 2010_10_30, updated_at 2013_07_11;)
 alert tcp any any -> any any (msg: "test3"; sid:236487; file_data; content:"clsid"; nocase; distance:0; created_at 2008_08_03, updated_at 2016_05_01;)

Output

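$ awk -v OFS=" | " -v extract="msg,sid,created_at,updated_at" '
BEGIN {
    # Split the comma-separated list of wanted field names into Fields[1..n]
    split(extract, Fields, /,/)
}
{
    # Turn punctuation into spaces so names and values become separate fields
    gsub(/[:";,()]/, " ")
    s = ""
    for (i = 1; i in Fields; i++)
    {
        f = 1                                 # stays 1 until Fields[i] is found
        for (j = 1; j <= NF; j++)
        {
            if ($j == Fields[i])
            {
                f = 0
                s = (s ? s OFS : "") $(j+1)   # the value is the next field
                break
            }
        }
        if (f) {
            s = (s ? s OFS : "") "Null"       # name not present on this line
        }
    }
    print s
}' file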

test1 | 16521 | 2010_07_30 | 2016_07_01
test2 | 23476 | 2010_10_30 | 2013_07_11
test3 | 236487 | 2008_08_03 | 2016_05_01
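As a rough sketch of how the field list can be changed and how a missing field shows up as Null, the same approach can be wrapped in a shell function. The name extract_fields and its argument order are assumptions made for this illustration, not part of the original answer. The inner loop here stops at NF-1 (also a deviation from the answer), so a field name that happens to be the last token yields Null instead of an empty value.

```shell
# Hypothetical wrapper around the approach used in the answer.
# $1 = comma-separated field names; remaining arguments = input files
# (reads standard input when no file is given).
extract_fields() {
    fields=$1
    shift
    awk -v OFS=" | " -v extract="$fields" '
    BEGIN { n = split(extract, Fields, /,/) }   # n = number of wanted fields
    {
        gsub(/[:";,()]/, " ")                   # punctuation becomes spaces
        s = ""
        for (i = 1; i <= n; i++) {
            value = "Null"                      # default when name is absent
            for (j = 1; j < NF; j++)
                if ($j == Fields[i]) { value = $(j + 1); break }
            s = (i > 1 ? s OFS : "") value
        }
        print s
    }' "$@"
}

# The sample line has no rev option, so the second column is Null.
printf '%s\n' ' alert tcp any any -> any any (msg: "test3"; sid:236487; nocase; created_at 2008_08_03, updated_at 2016_05_01;)' |
    extract_fields "sid,rev,msg"
# prints: 236487 | Null | test3
```

Because fields are split on whitespace, a msg value containing spaces would only yield its first word with this approach; rules with multi-word messages would need a different parsing strategy.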
