How to Get All Results from Elasticsearch in Python
Question
I am brand new to Elasticsearch, and I'm having trouble getting all results back when I run an Elasticsearch query through my Python script. My goal is to query an index ("my_index" below), take those results, and put them into a pandas DataFrame, which goes through a Django app and eventually ends up in a Word document.
My code is:
es = Elasticsearch()
logs_index = "my_index"
logs = es.search(index=logs_index,body=my_query)
It tells me I have 72 hits, but then when I do:
df = logs['hits']['hits']
len(df)
it says the length is only 10. I saw that someone had a similar issue in another question, but their solution did not work for me:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
es = Elasticsearch()
logs_index = "my_index"
search = Search(using=es)
total = search.count()
search = search[0:total]
logs = es.search(index=logs_index,body=my_query)
len(logs['hits']['hits'])
The len function still says I only have 10 results. What am I doing wrong, or what else can I do to get all 72 results back?
ETA: I am aware that I can just add "size": 10000 to my query to stop it from truncating to 10, but since the user will be entering their own search query, I need a way that doesn't rely on the query body itself.
Recommended answer
You need to pass a size parameter to your es.search() call.
Please read the API docs:
size – Number of hits to return (default: 10)
An example:
es.search(index=logs_index, body=my_query, size=1000)
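Since the query body comes from the user, one way to keep it untouched is to wrap the call above in a small helper. The following is only a sketch: the name `search_rows`, the `max_hits` default, and the choice to return the `_source` dicts are assumptions made for illustration, and `es` is expected to be an `Elasticsearch` client like the one in the question.

```python
def search_rows(es, index, query, max_hits=1000):
    """Run `query` against `index` and return up to `max_hits` documents.

    `es` is expected to be an elasticsearch.Elasticsearch client (or any
    object with a compatible `search` method). `size` is passed as a keyword
    argument, so the user-supplied query body is never modified.
    """
    response = es.search(index=index, body=query, size=max_hits)
    hits = response["hits"]["hits"]
    # Each hit wraps the stored document under the "_source" key.
    return [hit["_source"] for hit in hits]
```

With the names from the question, `rows = search_rows(es, logs_index, my_query)` would give a list of dicts that can be handed to `pandas.DataFrame(rows)`. Keep in mind that, by default, Elasticsearch rejects requests where `from + size` exceeds 10,000 (the `index.max_result_window` setting), so this only suits result sets of modest size.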
Please note that this is not an optimal way to get all documents in an index, or to run a query that returns a lot of documents. For that you should use a scroll operation, which is also documented in the API docs under the scan() abstraction for the Elasticsearch scroll operation.
You can also read about it in the Elasticsearch documentation.
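To make the scroll approach concrete, here is a minimal sketch of a manual scroll loop. The function name `scroll_all_hits` and the `page_size` and `keep_alive` defaults are assumptions for illustration; it relies only on the client's `search`, `scroll`, and `clear_scroll` methods and takes the client as an argument.

```python
def scroll_all_hits(es, index, query, page_size=1000, keep_alive="2m"):
    """Collect every hit for `query` by paging through the scroll API.

    `es` is expected to be an elasticsearch.Elasticsearch client.
    `keep_alive` is how long the server keeps the scroll context open
    between requests.
    """
    # The first request opens the scroll context and returns the first page.
    response = es.search(index=index, body=query, scroll=keep_alive, size=page_size)
    scroll_id = response["_scroll_id"]
    hits = []
    try:
        # An empty page means the result set is exhausted.
        while response["hits"]["hits"]:
            hits.extend(response["hits"]["hits"])
            response = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response["_scroll_id"]
    finally:
        # Release the server-side scroll context even if a request failed.
        es.clear_scroll(scroll_id=scroll_id)
    return hits
```

With the names from the question, `len(scroll_all_hits(es, logs_index, my_query))` should then match the reported hit count. In practice the `scan()` helper mentioned above wraps the same loop: after `from elasticsearch.helpers import scan`, iterating over `scan(es, query=my_query, index=logs_index)` yields the hits one at a time.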