Elasticsearch的最大滚动时间 [英] Max scrollable time for elasticsearch

查看:195
本文介绍了Elasticsearch的最大滚动时间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

可以为滚动搜索设置的最大可滚动时间是什么?



文档:
https://www.elastic.co/guide/en/elasticsearch/ client / javascript-api / current / api-reference.html#api-scroll

解决方案

没有人-value-适合所有最大滚动时间的值。



扫描&滚动旨在以块的形式扫描大量记录。每个块的最大值必须通过增量增加来获得,直到达到突破为止,因为它取决于集群资源,网络延迟和集群负载。



具有约10亿条记录和1 TB数据的3节点测试设置。我能够滚动整个索引,滚动大小为5000,超时为5m。但是,使用这些值会有很多超时。根据我们的分析,我们观察到滚动超时在很大程度上取决于集群负载网络延迟。因此,我们最终确定了3500个大小和4m的超时时间。



所以我建议以下内容- $ b

  • 逐渐增加大小和超时值,以获取网络的最大值。

  • 一旦有了最大值,请将其减小一个档位,以适应群集导致的故障负载和延迟时间


  • What is the max scrollable time that can be set for scrolling search ?

    Documentation: https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/api-reference.html#api-scroll

    解决方案

    There is no one-value-fits-all value of max scroll time.

    Scan & Scroll is meant to scan through a large number of records in chunks. The max value for each chunk has to be obtained by incremental increases till you hit the breaking as it depends on your cluster resources,network latency and cluster load.

    We had a 3 node test setup with about 1 billion records and 1 TB of data. I was able to scroll through the entire index with scroll size 5000 and timeout 5m. However, there were lots of timeouts with those values. From our analysis,we observered that scroll timeouts were heavily dependent on cluster load and network latency. So we finally settled on 3500 size and 4m timeout.

    So i would recomend the following-

    • Incrementally increase the size and timeout values to get the max value for your network.
    • Once you have the max value, reduce it a notch to accommodate for failures due to cluster load & latency

    这篇关于Elasticsearch的最大滚动时间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆