Is there a smarter way to reindex elasticsearch?
Problem description
I ask because our search is in a state of flux as we work things out, but each time we change the index (changing the tokenizer or a filter, or the number of shards/replicas), we have to blow away the entire index and re-index all our Rails models back into Elasticsearch ... which means we have to factor in downtime to re-index all our records.
Is there a smarter way to do this that I'm not aware of?
I think @karmi got it right. However, let me explain it a bit more simply. I needed to occasionally upgrade a production schema with new properties or analysis settings. I recently started using the scenario described below to do live, constant-load, zero-downtime index migrations. You can do it remotely.
Here are the steps:
Assumptions:
- you have an index real1 and aliases real_write, real_read pointing to it,
- the client writes only to real_write and reads only from real_read,
- the _source property of documents is available.
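If the aliases don't exist yet, they can be set up with the same _aliases API used in the steps below. This is a sketch; esserver:9200 is the placeholder host from the original post:

```shell
# One-time setup sketch: point both aliases at the current index, real1.
curl -XPOST 'http://esserver:9200/_aliases' -d '
{
  "actions" : [
    { "add" : { "index" : "real1", "alias" : "real_write" } },
    { "add" : { "index" : "real1", "alias" : "real_read" } }
  ]
}'
```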
1. New index
Create the real2 index with the new mapping and settings of your choice.
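For example, the new index can be created with its settings and mapping in a single request. The analyzer, type, and field names here are placeholders, not from the original post:

```shell
# Hypothetical sketch: create real2 with updated analysis settings and mapping.
curl -XPUT 'http://esserver:9200/real2' -d '
{
  "settings" : {
    "number_of_shards" : 5,
    "analysis" : {
      "analyzer" : {
        "my_analyzer" : { "type" : "snowball", "language" : "English" }
      }
    }
  },
  "mappings" : {
    "doc" : {
      "properties" : {
        "title" : { "type" : "string", "analyzer" : "my_analyzer" }
      }
    }
  }
}'
```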
2. Writer alias switch
Switch the write alias using the following query.
curl -XPOST 'http://esserver:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "real1", "alias" : "real_write" } },
{ "add" : { "index" : "real2", "alias" : "real_write" } }
]
}'
This is an atomic operation. From this point on, real2 is populated with new client data on all nodes. Readers still use the old real1 via real_read. This is eventual consistency.
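To confirm the switch took effect, you can inspect the alias mappings and watch new writes land in real2 (same placeholder host as above):

```shell
# Check which indices the aliases point to ...
curl 'http://esserver:9200/_aliases?pretty'
# ... and watch real2's document count grow as clients write to real_write.
curl 'http://esserver:9200/real2/_count?pretty'
```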
3. Old data migration
Data must be migrated from real1 to real2; however, the new documents in real2 must not be overwritten by old entries. The migration script should therefore use the bulk API with the create operation (not index or update), because create fails for IDs that already exist in the target index. I use the simple Ruby script es-reindex, which has a nice E.T.A. status:
$ ruby es-reindex.rb http://esserver:9200/real1 http://esserver:9200/real2
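If you roll your own script instead, the bulk body below sketches what a create-based copy looks like. The _type, _id, and document fields are placeholders; the point is that create is rejected for any ID already present in real2, which is exactly what protects the fresh client writes:

```shell
# Hypothetical bulk request: "create" skips IDs that already exist in real2,
# so documents written by clients since the alias switch are never overwritten.
curl -XPOST 'http://esserver:9200/_bulk' -d '
{ "create" : { "_index" : "real2", "_type" : "doc", "_id" : "1" } }
{ "title" : "old document copied from real1" }
{ "create" : { "_index" : "real2", "_type" : "doc", "_id" : "2" } }
{ "title" : "another old document" }
'
```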
4. Reader alias switch
Now real2 is up to date and clients are writing to it; however, they are still reading from real1. Let's update the reader alias:
curl -XPOST 'http://esserver:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "real1", "alias" : "real_read" } },
{ "add" : { "index" : "real2", "alias" : "real_read" } }
]
}'
5. Backup and delete old index
Writes and reads now go to real2. You can back up and then delete the real1 index from the ES cluster.
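Once you have a backup you trust, dropping the old index is a single request:

```shell
# Delete real1 to reclaim disk space. Irreversible -- back it up first.
curl -XDELETE 'http://esserver:9200/real1'
```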
Done!