Changing the input data in logstash using a filter


Problem description

I have my input data from a table. The table data looks like <Customer_id> <Item_id> <Item name>. For every item bought by a customer, there is a separate row in the table. For example, if c1 buys i1, i2, i3, i4, i5, there will be 5 rows in the table.

Now the data that I want to insert into elasticsearch should look something like this:

{
  "c1": [
    {
      "item_id": "i1",
      "item_name": "ABC"
    },
    {
      "item_id": "i2",
      "item_name": "XYZ"
    },
    .....
  ],
  "c2": [
    {
      "item_id": 4,
      "item_name": "PQR"
    }
  ]
}

How can I modify the input as above in logstash?

Also, my schema looks like this:

Item: item_id, item_name

Buy: cust_id, item_id

Can you also suggest the SQL query needed to get the above output?

Solution

The way I would approach this is by creating an SQL query that groups those rows on Customer_ID together and uses GROUP_CONCAT to gather all items of the group.

Then, you can use the logstash jdbc input with the SQL query you came up with above and you should be good.

UPDATE

I've reworked your SQL query a little bit like this:

SELECT CONCAT('{"',cust_id,'": [',GROUP_CONCAT(CONCAT('{"item_id":',buy.item_id,','),CONCAT('"item_name": "',item.item_name,'"}')), ']}') 
FROM item, buy
WHERE buy.item_id = item.item_id 
GROUP BY cust_id

This produces rows like the following, which are pretty close to what you need:

{"1": [{"item_id":1,"item_name": "abc"},{"item_id":2,"item_name": "xyz"}]}
{"2": [{"item_id":4,"item_name": "pqr"}]}
