Elasticsearch: scroll within a specified time frame


Problem description

I have some data in Elasticsearch, as shown in the image.

I used the example at the link below to do the scrolling:

https://gist.github.com/drorata/146ce50807d16fd4a6aa

page = es.search(
    index=INDEX_NAME,
    scroll='1m',
    size=1000,
    body={"query": {"match_all": {}}})
sid = page['_scroll_id']
# An integer in Elasticsearch 6.x and earlier (a dict in 7.x and later)
scroll_size = page['hits']['total']

# Start scrolling
print("Scrolling...")
count = 0
while scroll_size > 0:
    count += 1
    print("Page: ", count)
    page = es.scroll(scroll_id=sid, scroll='10m')
    # Update the scroll ID
    sid = page['_scroll_id']
    # Number of hits on this page; the loop ends when a page comes back empty
    scroll_size = len(page['hits']['hits'])

    for hit in page['hits']['hits']:
        pass  # some code processing here

My requirement now is to scroll only over documents that fall between a specified start timestamp and end timestamp. How can I do this with scroll?

Recommended answer

Example code follows. The time range belongs in the Elasticsearch query itself. You should also process the result of the first query, because the initial search call already returns the first page of hits.

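import json

from elasticsearch import Elasticsearch

# Assumed placeholder values; the original answer does not define these names
source_es_ip = "localhost"
scroll_time = "2m"
scroll_size = 1000

# Restrict the scroll to a time window with a range query on the timestamp field
es_query_dict = {"query": {"range": {"timestamp": {
    "gte": "2018-08-01T00:00:00Z", "lte": "2018-08-17T00:00:00Z"}}}}


def get_es_logs():
    es_client = Elasticsearch([source_es_ip], port=9200, timeout=300)

    total_docs = 0
    # The first search() call already returns the first page of hits
    page = es_client.search(scroll=scroll_time,
                            size=scroll_size,
                            body=json.dumps(es_query_dict))
    while True:
        sid = page['_scroll_id']
        details = page["hits"]["hits"]
        doc_count = len(details)
        if doc_count > 0:
            total_docs += doc_count
            print("scroll size: " + str(doc_count))
            print("start bulk index docs")
            # index_bulk(details)
            print("end success")
        else:
            # An empty page means the scroll is exhausted
            break
        page = es_client.scroll(scroll_id=sid, scroll=scroll_time)

    print("total docs: " + str(total_docs))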
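
The two points in the answer (put the time range in the query, and process the first page) can also be packaged as a small reusable helper. The following is only a sketch under stated assumptions, not part of the original answer: the names build_range_query and scroll_hits are hypothetical, the field name "timestamp" is taken from the answer's query, naive datetime values are assumed to already be in UTC, and the client is assumed to expose search, scroll and clear_scroll with the older-style keyword arguments used above.

```python
from datetime import datetime


def build_range_query(start, end, field="timestamp"):
    """Build a query body matching documents with start <= field <= end.

    Hypothetical helper: naive datetimes are assumed to be in UTC.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "query": {
            "range": {
                field: {"gte": start.strftime(fmt), "lte": end.strftime(fmt)}
            }
        }
    }


def scroll_hits(client, index, body, scroll="2m", size=1000):
    """Yield every hit, including those on the first page from search().

    Hypothetical helper: `client` is any object with search(), scroll()
    and clear_scroll() methods, such as an Elasticsearch client instance.
    """
    page = client.search(index=index, scroll=scroll, size=size, body=body)
    sid = page["_scroll_id"]
    try:
        # Stop as soon as a page comes back with no hits
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            page = client.scroll(scroll_id=sid, scroll=scroll)
            sid = page["_scroll_id"]
    finally:
        # Release the server-side scroll context once iteration ends
        client.clear_scroll(scroll_id=sid)
```

With a real client, this would be used by building the body with build_range_query(datetime(2018, 8, 1), datetime(2018, 8, 17)) and then iterating over scroll_hits(es_client, index_name, body), where es_client and index_name are whatever client and index you already have. Writing the loop as a generator keeps the per-document processing out of the paging logic.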
