Writing a custom template/parser/filter for use in syslog-ng
Question
My application generates logs and sends them to syslog-ng. I want to write a custom template/parser/filter for use in syslog-ng to correctly store the fields in tables of an SQLite database (MyDatabase).
This is the legend of my log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
All 12 fields are tab-separated, and the parser must store them into the 12 columns of table MyTable1 in MyDatabase. However, some of the fields (the 6th, 9th, and 10th) also contain "sub-fields" as comma-separated values. The number of values within each of these sub-fields is variable and can change in each line of the log.
I need these fields to be stored in separate tables: MyItem_type, MyVendor_groups, and MyReqs.
These "secondary" tables have 3 columns, recording the Unique-Record-ID and Quantity against each occurrence in the log. So the schema of the MyItem_type table looks like:
Unique-Record-ID | item_profile | Quantity
Similarly the schema of MyVendor_groups looks like:
Unique-Record-ID | vendor_category | Quantity
and the schema of MyReqs looks like:
Unique-Record-ID | req_type | Quantity
Consider these sample lines from the log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
234.44.tfhj Sam 22-03-2016 22 prod1 cat1,cat22,cat36,cat44 66 ven1 t1,t33,t43,t49 req1,req2,req3,req4 blue 64.22
234.45.tfhj Alex 23-03-2016 100 prod2 cat10,cat36,cat42 104 ven1 t22,t45 req1,req2,req33,req5 red 66
234.44.tfhj Vikas 24-03-2016 88 prod1 cat101,cat316,cat43 22 ven2 t22,t43 req1,req23,req3,req6 red 77.12
234.47.tfhj Jane 25-03-2016 22 prod7 cat10,cat36,cat44 43 ven3 t77 req1,req24,req3,req7 green 45.89
234.48.tfhj John 26-03-2016 97 serv3 cat101,cat36,cat45 69 ven5 t1 req11,req2,req3,req8 orange 33.04
234.49.tfhj Ruby 27-03-2016 85 prod58 cat10,cat38,cat46 88 ven9 t33,t55,t99 req1,req24,req3,req9 white 46.04
234.50.tfhj Ahmed 28-03-2016 44 serv7 cat110,cat36,cat47 34 ven11 t22,t43,t77 req1,req20,req3,req10 red 43
My parser should store the above log into MyDatabase.MyTable1 as:
unique-record-id | usename | date | Quantity | BOQ | item_profile | Count | Vendor | vendor_category | req_type | vendor_code | credit
234.44.tfhj | Sam | 22-03-2016 | 22 | prod1 | cat1,cat22,cat36,cat44 | 66 | ven1 | t1,t33,t43,t49 | req1,req2,req3,req4 | blue | 64.22
234.45.tfhj | Alex | 23-03-2016 | 100 | prod2 | cat10,cat36,cat42 | 104 | ven1 | t22,t45 | req1,req2,req33,req5 | red | 66
234.44.tfhj | Vikas | 24-03-2016 | 88 | prod1 | cat101,cat316,cat43 | 22 | ven2 | t22,t43 | req1,req23,req3,req6 | red | 77.12
234.47.tfhj | Jane | 25-03-2016 | 22 | prod7 | cat10,cat36,cat44 | 43 | ven3 | t77 | req1,req24,req3,req7 | green | 45.89
234.48.tfhj | John | 26-03-2016 | 97 | serv3 | cat101,cat36,cat45 | 69 | ven5 | t1 | req11,req2,req3,req8 | orange | 33.04
234.49.tfhj | Ruby | 27-03-2016 | 85 | prod58 | cat10,cat38,cat46 | 88 | ven9 | t33,t55,t99 | req1,req24,req3,req9 | white | 46.04
234.50.tfhj | Ahmed | 28-03-2016 | 44 | serv7 | cat110,cat36,cat47 | 34 | ven11 | t22,t43,t77 | req1,req20,req3,req10 | red | 43
It should also parse the "possible,item,profiles" field to record into MyDatabase.MyItem_type as:
Unique-Record-ID | item_profile | Quantity
234.44.tfhj | cat1 | 22
234.44.tfhj | cat22 | 22
234.44.tfhj | cat36 | 22
234.44.tfhj | cat44 | 22
234.45.tfhj | cat10 | 100
234.45.tfhj | cat36 | 100
234.45.tfhj | cat42 | 100
234.44.tfhj | cat101 | 88
234.44.tfhj | cat316 | 88
234.44.tfhj | cat43 | 88
234.47.tfhj | cat10 | 22
234.47.tfhj | cat36 | 22
234.47.tfhj | cat44 | 22
234.48.tfhj | cat101 | 97
234.48.tfhj | cat36 | 97
234.48.tfhj | cat45 | 97
234.49.tfhj | cat10 | 85
234.49.tfhj | cat38 | 85
234.49.tfhj | cat46 | 85
234.50.tfhj | cat110 | 44
234.50.tfhj | cat36 | 44
234.50.tfhj | cat47 | 44
We also need to similarly parse "applicable,vendor,categories" and store the values into MyDatabase.MyVendor_groups, and to parse "known,request,types" for storage into MyDatabase.MyReqs. The first column of MyDatabase.MyItem_type, MyDatabase.MyVendor_groups, and MyDatabase.MyReqs will always be the Unique-Record-ID that was witnessed in the log.
So yes, unlike the other columns, this column does not contain unique data in these three tables. The third column will always be the Quantity that was witnessed in the log.
I know a bit of PCRE, but it is the use of nested parsers in syslog-ng that's completely confusing me.
The syslog-ng documentation suggests this is possible, but I simply failed to find a good example. If any kind hacker around here has some reference or sample to share, it would be very useful.
Thanks.
Answer
I think all of this can be done by using the csv-parser a few times. First, use a csv-parser with the tab delimiter ("\t") to split the initial fields into named columns; use this parser on the entire message. Then you'll have to parse the fields that have subfields using other instances of the csv-parser on the columns that need further parsing. You can find some examples at https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/csv-parser.html and https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-parsers-csv.html
(It is possible that you could get it done with a single parser if you specify both the tab and the comma as delimiters, but that might not work for the fields with a variable number of values.)
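A minimal configuration sketch of that two-pass approach (the parser, source, and destination names, all column names, and the fixed sub-field column count are my assumptions, not part of the original answer):

```
# Sketch only: names below are assumed, adapt them to your setup.
# First pass: split the whole tab-separated message into named columns.
parser p_fields {
    csv-parser(
        columns("RECORD_ID", "USENAME", "DATE", "QUANTITY", "BOQ",
                "ITEM_PROFILES", "COUNT", "VENDOR", "VENDOR_CATEGORIES",
                "REQ_TYPES", "VENDOR_CODE", "CREDIT")
        delimiters("\t")
    );
};

# Second pass: re-parse one of the comma-separated sub-fields.
# csv-parser needs a fixed column list, so with a variable number of
# sub-values you would name the maximum expected count; unused trailing
# columns simply stay empty.
parser p_item_profiles {
    csv-parser(
        columns("ITEM1", "ITEM2", "ITEM3", "ITEM4")
        delimiters(",")
        template("${ITEM_PROFILES}")
    );
};

log {
    source(s_app);
    parser(p_fields);
    parser(p_item_profiles);
    # a destination (e.g. sql()) can then reference ${RECORD_ID},
    # ${ITEM1}, ${QUANTITY}, and so on as column values
    destination(d_sqlite);
};
```

Note that the csv-parser only splits a message into name-value pairs; fanning one log line out into multiple rows of a secondary table (one row per sub-value) would still have to happen on the destination side, for example with several sql() destinations or an external script.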