How to Get All Results from Elasticsearch in Python


Question

I am brand new to Elasticsearch, and I'm having trouble getting all results back when I run a query through my Python script. My goal is to query an index ("my_index" below), take those results, and put them into a pandas DataFrame, which goes through a Django app and eventually ends up in a Word document.

My code is:

from elasticsearch import Elasticsearch

es = Elasticsearch()
logs_index = "my_index"
logs = es.search(index=logs_index, body=my_query)

It tells me I have 72 hits, but then when I do:

df = logs['hits']['hits']
len(df)

it says the length is only 10. I saw that someone had a similar issue in another question, but their solution did not work for me:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
es = Elasticsearch()
logs_index = "my_index"
search = Search(using=es)
total = search.count()
search = search[0:total]
logs = es.search(index=logs_index,body=my_query)
len(logs['hits']['hits'])

The len function still says I only have 10 results. What am I doing wrong, or what else can I do to get all 72 results back?

ETA: I am aware that I can add "size": 10000 to my query to stop it from truncating to 10 results, but since the user will be entering their own search query, I need a way to do this that does not live in the query itself.

Recommended Answer

You need to pass a size parameter to your es.search() call.

From the API docs:

size – Number of hits to return (default: 10)

An example:

es.search(index=logs_index, body=my_query, size=1000)
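Since the question asks for a way to set the size outside the user-supplied query, one possible sketch is to count the matches first and then pass that count as the size keyword. The helper name search_all_matches and the max_window parameter are hypothetical, not part of the library; the 10,000 cap mirrors the default index.max_result_window setting, and the code assumes a client that accepts body=, as in the question.

```python
def search_all_matches(es, index, query_body, max_window=10_000):
    """Return every hit for query_body in one search request.

    `es` is an elasticsearch.Elasticsearch client (or any object exposing
    the same count/search methods). Only suitable for small result sets.
    """
    # The count API accepts only the "query" part of a search body.
    count_body = {"query": query_body.get("query", {"match_all": {}})}
    total = es.count(index=index, body=count_body)["count"]
    if total == 0:
        return []
    if total > max_window:
        # Hypothetical guard: larger result sets should be paged instead.
        raise ValueError(f"{total} hits exceed {max_window}; use scroll instead")
    # size is passed as a keyword argument, so the user's query is untouched.
    response = es.search(index=index, body=query_body, size=total)
    return response["hits"]["hits"]
```

With the question's variables, this would be called as search_all_matches(es, logs_index, my_query). It costs one extra round trip for the count, which is usually acceptable for a few dozen or a few hundred hits.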

Note that this is not an optimal way to retrieve every document in an index, or to run a query that returns a large number of documents. For that, use a scroll operation, which is also covered in the API docs under the scan() helper, an abstraction over the Elasticsearch scroll operation.
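As a rough illustration of the scroll operation mentioned above, the sketch below pages through results with the client's search, scroll, and clear_scroll methods. The function name fetch_all_hits and its page_size and keep_alive parameters are assumptions made for this example; it also assumes a client version that accepts body=, as in the question.

```python
def fetch_all_hits(es, index, query_body, page_size=1000, keep_alive="2m"):
    """Collect every hit for query_body by paging with the scroll API.

    `es` is an elasticsearch.Elasticsearch client (or any object exposing
    the same search/scroll/clear_scroll methods).
    """
    # The first request opens the scroll context and returns the first page.
    response = es.search(
        index=index, body=query_body, scroll=keep_alive, size=page_size
    )
    scroll_id = response.get("_scroll_id")
    hits = []
    try:
        # An empty page means the scroll is exhausted.
        while response["hits"]["hits"]:
            hits.extend(response["hits"]["hits"])
            response = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context even if a request failed.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
    return hits
```

The returned list could then feed the DataFrame from the question, for example pandas.DataFrame([hit["_source"] for hit in hits]). The scan() helper in elasticsearch.helpers wraps a similar loop and yields hits one at a time, so it is likely the more convenient choice in practice.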

You can also read about this in the Elasticsearch documentation.
