Liquibase or Flyway database migration alternative for Elasticsearch


Question

I am pretty new to ES. I have been searching for a database migration tool for a long time and could not find one. I am wondering if anyone could point me in the right direction.

I will be using Elasticsearch as the primary datastore in my project. I would like to version all the mapping and configuration changes, data imports, and data upgrade scripts that I run as I develop new modules in my project.

In the past I have used database versioning tools such as Flyway or Liquibase.

Are there any frameworks, scripts, or methods I could use with ES to achieve something similar?

Does anyone have experience doing this by hand, using scripts to run migrations, or at least upgrade scripts?

Thanks in advance!

Recommended answer

From this point of view, ES has some huge limitations:


  • despite having dynamic mapping, ES is not schemaless but schema-intensive. A mapping can't be changed when the change conflicts with existing documents (in practice, if any document has a non-null field that the new mapping affects, the change results in an exception)
  • documents in ES are immutable: once you have indexed one, you can only retrieve or delete it. The syntactic sugar around this is the partial update, which performs a thread-safe delete + index (with the same id) on the ES side
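The first limitation can be made concrete with a small check. This is a minimal sketch in Python, assuming mappings are held as plain dictionaries of field definitions. The function name and the sample fields are hypothetical, and whether ES actually rejects a given change depends on the ES version and on the data already indexed.

```python
def conflicting_fields(old_props, new_props):
    """Return names of fields present in both mappings with different definitions."""
    return sorted(
        name
        for name, definition in new_props.items()
        if name in old_props and old_props[name] != definition
    )


old = {"title": {"type": "string"}, "views": {"type": "integer"}}
# Pure superset: only adds a field, so nothing is flagged.
superset = dict(old, author={"type": "string"})
# Redefines an existing field, which is the risky case described above.
changed = dict(old, views={"type": "long"})

print(conflicting_fields(old, superset))  # []
print(conflicting_fields(old, changed))   # ['views']
```

An empty result only says the new mapping does not redefine any existing field. It does not guarantee that ES will accept it.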

What does that mean in the context of your question? Basically, you can't have classic migration tools for ES. Here is what can make your work with ES easier:


  • use strict mapping ("dynamic": "strict" and/or index.mapper.dynamic: false; take a look at the mapping docs). This protects your indexes/types from being accidentally mapped dynamically with the wrong type, and gives you an explicit error when your data does not match the mapping
  • you can fetch the actual ES mapping and compare it with your data models. If your programming language has a high-level enough library for ES, this should be pretty easy
  • you can leverage index aliases
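The first two suggestions might look roughly like this in Python. It is a sketch under assumptions: the helper names are hypothetical, and the "actual" mapping is passed in as a dictionary that you would obtain from the ES get-mapping API through whatever client library you use.

```python
def strict_mapping(properties):
    """Wrap field definitions in a mapping body that rejects unknown fields."""
    return {"dynamic": "strict", "properties": properties}


def mapping_drift(model_props, actual_props):
    """Compare the fields a model declares with the fields ES reports."""
    model_fields, actual_fields = set(model_props), set(actual_props)
    return {
        "missing_in_es": sorted(model_fields - actual_fields),
        "unknown_to_model": sorted(actual_fields - model_fields),
        "mismatched": sorted(
            field
            for field in model_fields & actual_fields
            if model_props[field] != actual_props[field]
        ),
    }


# Hypothetical model definition and a mapping as ES might report it.
model = {"title": {"type": "string"}, "views": {"type": "integer"}}
actual = {"title": {"type": "string"}, "views": {"type": "long"}, "tmp": {"type": "string"}}
print(mapping_drift(model, actual))
```

Running a comparison like this at deploy time or in a test points at the drifting field before a write fails in production.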

So, a little bit of experience. For me, the currently reasonable flow is this:


  • All data structures are described as models in code. These models also provide an ORM abstraction.
  • The index/mapping creation call is a simple method of the model.
  • Every index has an alias (e.g. news) which points to the actual index (e.g. news_index_{revision}_{date_created}).
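A model in this style could be sketched as follows. This is an illustration only: the class, its fields, the revision number, and the date format used in the index name are all assumptions, not something the flow above prescribes.

```python
from datetime import date


class NewsModel:
    """Hypothetical model that knows its alias, its mapping, and its index name."""

    alias = "news"
    revision = 3  # bumped by hand whenever the mapping changes (assumption)
    properties = {
        "title": {"type": "string"},
        "published": {"type": "date"},
    }

    @classmethod
    def mapping(cls):
        """Mapping body sent to ES when the index or mapping is created."""
        return {"dynamic": "strict", "properties": dict(cls.properties)}

    @classmethod
    def index_name(cls, created):
        """Concrete index name following news_index_{revision}_{date_created}."""
        return "{}_index_{}_{}".format(
            cls.alias, cls.revision, created.strftime("%Y%m%d")
        )


print(NewsModel.index_name(date(2015, 7, 1)))  # news_index_3_20150701
```

Application code reads and writes through `NewsModel.alias` only, so the concrete index behind it can be replaced without touching callers.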

Every time the code is deployed, you:

  1. Try to put the model (type) mapping. If this completes without error, it means that one of the following holds:


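  • you have put the same mapping
  • you have put a mapping that is a pure superset of the old one (only new fields were added, old ones stay untouched)
  • no documents have values in the fields affected by the new mapping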

All of this means that you are good to go with the mapping/data you have; just work with the data as always.


  2. If ES raises an exception for the new mapping, you:

  • create a new index/type with the new mapping (named like name_{revision}_{date})
  • redirect your alias to the new index
  • fire up migration code that makes bulk requests (https://www.elastic.co/guide/en/elasticsearch/reference/1.6/docs-bulk.html) for fast reindexing. During this reindexing you can safely index new documents through the alias as usual. The drawback is that historical data is only partially available during reindexing.
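The deploy-time flow above can be sketched as follows. This is not a real client API: `client` is a hypothetical object with `put_mapping`, `create_index`, `update_aliases`, `scan`, and `bulk` methods, and `MappingConflict` is a hypothetical exception standing in for whatever your ES library raises. Only the two request bodies follow the documented shape of the `_aliases` and `_bulk` endpoints; older ES versions may also expect a `_type` in each bulk action line.

```python
import json


class MappingConflict(Exception):
    """Hypothetical error raised by the client when ES rejects a mapping."""


def alias_swap_actions(alias, old_index, new_index):
    """Body for the _aliases endpoint: move the alias in a single request."""
    return {"actions": [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]}


def bulk_body(index, docs):
    """Newline-delimited _bulk body: one action line plus one source line per doc."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def deploy(client, alias, old_index, new_index, mapping):
    """Put the mapping; on conflict create a new index, swap the alias, reindex."""
    try:
        client.put_mapping(old_index, mapping)
        return old_index  # mapping accepted, keep working with the current index
    except MappingConflict:
        client.create_index(new_index, mapping)
        client.update_aliases(alias_swap_actions(alias, old_index, new_index))
        # Copy historical documents; new writes already go to new_index via the alias.
        client.bulk(bulk_body(new_index, client.scan(old_index)))
        return new_index
```

For a large index the reindexing step would be split into batches rather than sent as a single bulk request.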

This is a production-tested solution. Caveats around this approach:


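  • you cannot do this if your read requests require consistent historical data
  • you have to reindex the whole index. If you have one type per index (a viable solution), that is fine. But sometimes you need multi-type indexes
  • data makes a network round trip. This can be painful at times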

To sum up:


  • try to have a good abstraction in your models; this always helps
  • try keeping historical data/fields stale. Just build your code with this idea in mind; that is easier than it sounds at first
  • I strongly recommend avoiding migration tools that rely on experimental ES features. Those can change at any time, as the river-* tools did.
