Extracting fields from a file in Shell
Question
I have a lot of data in a file, like the following:
alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)
alert tcp any any -> any any (msg: "test2"; nocase; sid :23476;distance:0; rev:1;created_at 2010_10_30, updated_at 2013_07_11;)
alert tcp any any -> any any (msg: "test3"; sid:236487; file_data; content:"clsid"; nocase; distance:0; created_at 2008_08_03, updated_at 2016_05_01;
I want to extract sid, msg, created_at and updated_at from the file. The output should look like this:
test1 | 16521 | 2010_07_30 | 2016_07_01
test2 | 23476 | 2010_10_30 | 2013_07_11
test3 | 236487| 2008_08_03 | 2016_05_01
The script I am using is:
cat $file | grep -v "^#" | grep "^alert" | sed 's/\"//g' | awk -F ';' '
{
for(i=1;i<=NF;i++)
{
if (match($i,"sid:")>0)
{
split($i, array1, ":")
Rule_sid=array1[2]
}
if(match($i,"msg:")>0)
{
split($i, array, "(")
split(array[2], array2, ":")
message=array2[2]
}
if(match($i,/metadata:/)>0 )
{
split($i, array3,/created_at/)
create_date=array3[2]
}
if(match($i,/metadata:/)>0 )
{
split($i, array4, ", updated_at ")
update_date=array4[2]
}
}
print Rule_sid "|" message "|" create_date "|" update_date
}' >> Rule_Files/$file
Recommended answer
Using awk
Modify -v OFS=" | " and -v extract="msg,sid,created_at,updated_at" to suit your needs. OFS is the output field separator, and the variable extract holds the comma-separated list of fields to parse. If a field is not found on a line, Null is printed in its place.
The program assumes that a field's value is the token immediately after the field name. For example, if the field sid is found at j=4, its value is taken from j+1, that is, j=5.
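To make the "value follows the name" assumption concrete, here is a minimal sketch that prints each token with its index after the same punctuation-to-space substitution the answer uses. The function name show_tokens is hypothetical and only serves this illustration.

```shell
# First sample line from the question.
line='alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)'

# show_tokens is a hypothetical helper: it reads lines on stdin and prints
# "index=token" for every whitespace-separated token left after the
# punctuation characters : " ; , ( ) have been replaced with spaces.
show_tokens() {
    awk '{
        gsub(/[:";,()]/, " ")
        for (j = 1; j <= NF; j++)
            printf "%d=%s\n", j, $j
    }'
}

printf '%s\n' "$line" | show_tokens
```

For this line, msg ends up at index 8 with test1 at index 9, and sid at index 10 with 16521 at index 11. Because tokens are split on whitespace, a msg value that contains spaces would yield only its first word with this approach.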
Input
$ cat file
alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)
alert tcp any any -> any any (msg: "test2"; nocase; sid :23476;distance:0; rev:1;created_at 2010_10_30, updated_at 2013_07_11;)
alert tcp any any -> any any (msg: "test3"; sid:236487; file_data; content:"clsid"; nocase; distance:0; created_at 2008_08_03, updated_at 2016_05_01;)
Output
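$ awk -v OFS=" | " -v extract="msg,sid,created_at,updated_at" '
BEGIN{
    # Split the comma-separated list of wanted field names into Fields[1..n]
    split(extract,Fields,/,/)
}
{
    # Replace punctuation with spaces so names and values become separate tokens
    gsub(/[:";,()]/," ");
    s="";
    for(i=1; i in Fields; i++)
    {
        f = 1    # 1 means Fields[i] has not been found on this line
        for(j=1; j<=NF; j++)
        {
            if($j==Fields[i])
            {
                f = 0
                # The value is the token right after the field name
                s = ( s ? s OFS :"") $(j+1)
                break
            }
        }
        if(f){
            s = (s ? s OFS:"") "Null"
        }
    }
    print s
}' file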
test1 | 16521 | 2010_07_30 | 2016_07_01
test2 | 23476 | 2010_10_30 | 2013_07_11
test3 | 236487 | 2008_08_03 | 2016_05_01
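As a rough sketch of how OFS and extract change the result, the same logic can be wrapped in a shell function and called with a different separator and field list. The name extract_fields and its argument order are assumptions made for this example and are not part of the original answer.

```shell
# Sample rules from the question, one per line.
rules='alert tcp any any -> any any (msg: "test1"; sid:16521; rev:1;created_at 2010_07_30, updated_at 2016_07_01;)
alert tcp any any -> any any (msg: "test2"; nocase; sid :23476;distance:0; rev:1;created_at 2010_10_30, updated_at 2013_07_11;)
alert tcp any any -> any any (msg: "test3"; sid:236487; file_data; content:"clsid"; nocase; distance:0; created_at 2008_08_03, updated_at 2016_05_01;)'

# extract_fields is a hypothetical wrapper around the awk logic above.
# Usage: extract_fields SEPARATOR FIELD_LIST < rules_file
extract_fields() {
    awk -v OFS="$1" -v extract="$2" '
    BEGIN { n = split(extract, Fields, /,/) }
    {
        # Turn punctuation into spaces so names and values are separate tokens
        gsub(/[:";,()]/, " ")
        s = ""
        for (i = 1; i <= n; i++) {
            val = "Null"                      # default when the name is absent
            for (j = 1; j <= NF; j++)
                if ($j == Fields[i]) { val = $(j + 1); break }
            s = (i == 1 ? "" : s OFS) val     # join values with the separator
        }
        print s
    }'
}

printf '%s\n' "$rules" | extract_fields "," "sid,rev"
```

With the three sample rules this prints 16521,1 then 23476,1 then 236487,Null, since the third rule has no rev field.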