How to sync a MySQL database to an external data source


Question

I have a MySQL database table called search that I need to keep up to date with an Elasticsearch index. I have already exported the table to the ES index, but now I need to keep the data in sync, or else the search results will become stale quite quickly.

The only way I can think of is to export the table every x minutes and compare it with what was last imported. This isn't feasible, since the table has about 10M rows and I don't want to be doing table exports every five minutes all day long. What would be a good solution for this? Note that I only have read access to the database.

Recommended answer

I would use Logstash with the jdbc input plugin and the elasticsearch output plugin. There's a blog article showing a full example of this solution.

After installing Logstash, you can create a configuration file that uses the two plugins mentioned above, like this:

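input {
    jdbc {
        # Connection settings: adjust host, port, database, and credentials
        jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
        jdbc_user => "user"
        jdbc_password => "1234"
        jdbc_validate_connection => true
        # Path to the MySQL JDBC driver JAR
        jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        # The schedule uses cron-like syntax: run every 5 minutes
        schedule => "*/5 * * * *"
        # By default, :sql_last_value holds the time of the previous run
        statement => "SELECT * FROM search WHERE timestamp > :sql_last_value"
    }
}
output {
    elasticsearch {
        # Recent plugin versions take a hosts array instead of host/protocol
        hosts => ["ES_NODE_HOST"]
        index => "searches"
        document_type => "search"
        # Use the primary key as the document ID so changed rows overwrite
        document_id => "%{uid}"
    }
}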

You will need to change a few values to match your environment, but this should cover what you need to do.

Every 5 minutes the query will run and fetch all search records whose timestamp (change that name to match your data) is more recent than :sql_last_value, which by default is the last time the query ran. The selected records will be indexed into the searches index on your Elasticsearch server at ES_NODE_HOST. Because the document ID is taken from the primary key, a row that has changed overwrites its existing document instead of creating a duplicate. Make sure to change the index and type names, as well as the name of the primary key field (i.e. uid), to match your data.
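The incremental fetch described above can be sketched outside Logstash to make the mechanism concrete. This is only an illustration under stated assumptions, not how Logstash is implemented: an in-memory SQLite table stands in for the MySQL search table, a plain dict stands in for the Elasticsearch index, the term column and the function names are hypothetical, and the watermark here is the highest timestamp seen so far rather than the wall-clock time of the previous run.

```python
import sqlite3


def fetch_changed_rows(conn, last_value):
    """Return rows whose timestamp is newer than the stored watermark."""
    cur = conn.execute(
        "SELECT uid, term, timestamp FROM search "
        "WHERE timestamp > ? ORDER BY timestamp",
        (last_value,),
    )
    return cur.fetchall()


def sync_once(conn, index, last_value):
    """Copy changed rows into the index and return the new watermark."""
    for uid, term, ts in fetch_changed_rows(conn, last_value):
        # Keying by uid means an updated row overwrites its old document.
        index[uid] = {"term": term, "timestamp": ts}
        last_value = max(last_value, ts)
    return last_value


# Stand-in for the MySQL table (hypothetical schema, integer timestamps).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search (uid INTEGER PRIMARY KEY, term TEXT, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO search VALUES (?, ?, ?)",
    [(1, "foo", 100), (2, "bar", 110)],
)

index = {}      # stand-in for the Elasticsearch index
watermark = 0   # nothing has been synced yet

# First run: both existing rows are picked up.
watermark = sync_once(conn, index, watermark)

# One row is updated and one is added between runs.
conn.execute("UPDATE search SET term = 'baz', timestamp = 120 WHERE uid = 2")
conn.execute("INSERT INTO search VALUES (3, 'qux', 125)")

# Second run: only the two rows newer than the watermark are fetched.
watermark = sync_once(conn, index, watermark)
```

Note that a query of this shape only sees rows that still exist, so rows deleted from the source table are not removed from the index by this approach.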
