Reindexing Elasticsearch via Bulk API, scan and scroll


Question

I am trying to re-index my Elasticsearch setup, and I am currently looking at the Elasticsearch documentation and an example that uses the Python API.

I'm a little confused about how this all works, though. I was able to obtain the scroll ID from the Python API:

from elasticsearch import Elasticsearch

es = Elasticsearch("myhost")

index = "myindex"
query = {"query": {"match_all": {}}}
# Open a scan-type scroll and keep its context alive for 10 minutes
response = es.search(index=index, doc_type="my-doc-type", body=query, search_type="scan", scroll="10m")

scroll_id = response["_scroll_id"]

Now my question is: what use is this to me? What does knowing the scroll ID even give me? The documentation says to use the "Bulk API", but I have no idea how the scroll_id factors into this.

Could anyone give a brief example showing me how to re-index from this point, assuming I've obtained the scroll_id correctly?

Recommended answer

Here is an example of reindexing to another Elasticsearch node using elasticsearch-py:

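from elasticsearch import Elasticsearch, helpers

es_src = Elasticsearch(["host"])  # source cluster
es_des = Elasticsearch(["host"])  # destination cluster

# Scrolls through src_index_name and bulk-indexes every document into des_index_name
helpers.reindex(es_src, 'src_index_name', 'des_index_name', target_client=es_des)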

You can also reindex only the results of a query into a different index. Here is how to do it:

from elasticsearch import Elasticsearch, helpers

es_src = Elasticsearch(["host"])  # source cluster
es_des = Elasticsearch(["host"])  # destination cluster

# Only documents matching this query are copied
body = {"query": {"term": {"year": "2004"}}}
helpers.reindex(es_src, 'src_index_name', 'des_index_name', target_client=es_des, query=body)
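
As for where the scroll ID fits in: every scroll response carries a `_scroll_id`, which you pass back to fetch the next page of hits, and each page of hits is what you hand to the Bulk API. The sketch below spells out that loop by hand, which is roughly what `helpers.reindex` automates. It rests on several assumptions that are not in the question: the function names `hits_to_bulk_body` and `reindex_with_scroll` are made up for illustration, the client is a pre-8.x elasticsearch-py where `search` and `bulk` accept `body=`, and the search is a plain scroll whose first response already contains hits (with `search_type="scan"` the first response has no hits, so you would call `scroll` once before entering the loop).

```python
def hits_to_bulk_body(hits, target_index):
    """Build the alternating action/source list that the Bulk API expects."""
    body = []
    for hit in hits:
        action = {"_index": target_index, "_id": hit["_id"]}
        # Older clusters require a type on each action; copy it only if the hit has one.
        if "_type" in hit:
            action["_type"] = hit["_type"]
        body.append({"index": action})
        body.append(hit["_source"])
    return body


def reindex_with_scroll(src, dst, source_index, target_index, query,
                        scroll="10m", size=500):
    """Copy every document matching `query`; return how many were sent."""
    # The first request opens the scroll context and returns the first page.
    response = src.search(index=source_index, body=query, scroll=scroll, size=size)
    copied = 0
    while response["hits"]["hits"]:
        hits = response["hits"]["hits"]
        # One bulk request per page of scroll results.
        dst.bulk(body=hits_to_bulk_body(hits, target_index))
        copied += len(hits)
        # The scroll id from the latest response is the cursor for the next page.
        response = src.scroll(scroll_id=response["_scroll_id"], scroll=scroll)
    return copied
```

With the clients from the examples above, this would be called as `reindex_with_scroll(es_src, es_des, 'src_index_name', 'des_index_name', {"query": {"match_all": {}}})`. In practice `helpers.reindex` is the better choice; the sketch is only meant to show how the scroll ID and the Bulk API connect.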
