logstash jdbc connector time based data


Question

With the new logstash jdbc connector here:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html How do subsequent logstash runs affect what's already indexed into Elasticsearch? Does it create new documents in the ES index, or does it update the documents that match rows that have already been indexed? The use case I'm trying to tackle is indexing rows with timestamps into Elasticsearch, but the table continually gets updated. I would like to only index new data, or, if I have to read the table again, only add new documents for new rows.

Any suggestions? Or more documentation around the logstash jdbc plugin?

Answer

What I would do is include the timestamp of the last time the plugin ran (i.e. sql_last_start) in the query statement, so that it only loads the newly updated records.

For instance, your jdbc input plugin would look like this:

input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    # cron-style schedule: run the query once a minute
    schedule => "* * * * *"
    # :sql_last_start is substituted with the time the plugin last ran,
    # so only rows changed since the previous run are selected
    statement => "SELECT * FROM mytable WHERE timestamp > :sql_last_start"
  }
}

Make sure to replace timestamp with the name of the field containing the last-updated date, and mytable with the real name of your table.
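
For example, here is a minimal sketch assuming a hypothetical orders table with an id primary key and an updated_at column (both names are illustrative, not from the answer above). Setting document_id in the elasticsearch output reuses the primary key as the Elasticsearch document id, so rows that get read again update their existing documents instead of creating duplicates:

input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    schedule => "* * * * *"
    # only pick up rows changed since the last run
    statement => "SELECT * FROM orders WHERE updated_at > :sql_last_start"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "orders"
    # use the primary key as the document id so a re-read row
    # updates its existing document rather than creating a new one
    document_id => "%{id}"
  }
}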
