logstash jdbc connector time based data

Problem description

With the new logstash jdbc connector here:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html

How do subsequent Logstash runs affect what has already been indexed into Elasticsearch? Does it create new documents in the ES index, or does it update the documents that match rows which have already been indexed? The use case I'm trying to tackle is indexing rows with timestamps into Elasticsearch, but the table is continually updated. I would like to index only the new data, or, if I have to read the table again, only add new documents for new rows.

Any suggestions? Or is there more documentation around the logstash jdbc plugin?

Recommended answer

What I would do is include the timestamp of the last time the plugin ran (i.e. sql_last_start) in the query statement, so that it only loads the newly updated records.

For instance, your jdbc input plugin would look like this:

input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # run the query every minute (cron syntax)
    schedule => "* * * * *"
    # :sql_last_start is substituted with the time the previous query run started
    statement => "SELECT * FROM mytable WHERE timestamp > :sql_last_start"
  }
}

Make sure to change timestamp to the name of your field that contains the last-updated date, and mytable to the real name of your table.
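As for the first part of the question (new documents vs. updates on subsequent runs): by default the elasticsearch output assigns a fresh document id to every event, so re-reading the same row would create a duplicate. A minimal sketch of a full pipeline, assuming a hypothetical last-updated column named updated_at and an integer primary key column named id (neither appears in the original answer; exact option names can vary between Logstash versions):

input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    schedule => "* * * * *"
    # pull only rows touched since the last scheduled run
    statement => "SELECT * FROM mytable WHERE updated_at > :sql_last_start"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mytable"
    # reuse the row's primary key (hypothetical column `id`) as the document id,
    # so re-reading an updated row overwrites the existing document instead of
    # creating a duplicate
    document_id => "%{id}"
  }
}

With document_id derived from the primary key, subsequent runs update the existing documents for changed rows; without it, each run appends new documents.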
