Ensuring ElasticSearch is in Sync with Database

Question

I'm considering a daily script that does the following, to recover from any case where an update to the ES server failed. I don't yet have a high-availability setup, and even with one this is probably still good practice whenever data is duplicated between the DB and ES. Before putting the script together, I thought I'd check whether I'm going about this the right way, and whether there are any libraries or techniques I should use.

The script will simply retrieve all IDs from the database and all IDs from ElasticSearch, where created_at < current_time (a snapshot of the current time, since it's a moving target as the script runs). It will then add documents to and remove documents from ElasticSearch based on the differences between these two ID sets.
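The comparison step described above can be sketched in Python as follows. This is only an illustration: the function name `plan_sync` and the sample ID sets are hypothetical, and the code that would actually fetch IDs from the database and from ElasticSearch (filtered by the same created_at cutoff) is left out.

```python
from datetime import datetime, timezone


def plan_sync(db_ids, es_ids):
    """Return (to_index, to_delete) from two collections of IDs.

    to_index:  IDs present in the database but missing from ElasticSearch.
    to_delete: IDs present in ElasticSearch but no longer in the database.
    """
    db_ids, es_ids = set(db_ids), set(es_ids)
    return db_ids - es_ids, es_ids - db_ids


# Take the cutoff once, so both ID queries use the same created_at bound.
cutoff = datetime.now(timezone.utc)

# Hypothetical sample data standing in for the two ID queries.
db_ids = {1, 2, 3, 4}
es_ids = {2, 3, 4, 5}

to_index, to_delete = plan_sync(db_ids, es_ids)
print(sorted(to_index), sorted(to_delete))  # [1] [5]
```

Note that a diff on IDs alone only catches missing and orphaned documents; it would not notice a document whose contents are stale.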

Does this sound like a reasonable approach?

Recommended answer

To answer my own question: this is not the best approach.

A simpler, if more resource-intensive, approach is to rebuild the entire index periodically. Rebuilding the live index in place is impractical in production, since it would cause minutes or hours of downtime, so the trick is to build a new index alongside the old one and then switch over to it. In ElasticSearch you can't rename an index, but you can use aliases: the application always queries the alias, and the alias is repointed to the new index once the rebuild finishes.
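As a rough sketch of the alias switch, the snippet below builds the request body that ElasticSearch's `_aliases` endpoint accepts for removing an alias from one index and adding it to another in a single request. The helper name `alias_swap_actions` and the alias and index names (`items`, `items_v1`, `items_v2`) are hypothetical, and sending the request is left out.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build an _aliases request body that repoints `alias` to `new_index`.

    Both actions travel in one request, so searches against the alias
    never see a moment where it points at no index.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the application queries "items"; "items_v2" was just rebuilt.
body = alias_swap_actions("items", "items_v1", "items_v2")
print(json.dumps(body, indent=2))
```

The body would be sent as a POST to `/_aliases` on the cluster. Once the alias points at the new index, the old one can be deleted.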

There's a discussion of the approach here and a rake task for Tire users here.
